On Git practises - Githubissues

thomaseizinger commented 1 month ago

I wanted to get a discussion started on how Git is used in str0m. Currently, there doesn't seem to be a consistent handling of how main is updated:

Sometimes, commits are pushed directly to main
Sometimes, PRs are merged using merge-commits
Sometimes, PRs are merged with forward-merges

We are maintaining a fork of str0m that diverges in several ways from upstream. For example, it has changes that might have already open PRs, or changes that are still considered controversial. It also has changes that we had to revert (like #508) because they caused problems for us.

Maintaining this fork (especially with reverts) is difficult without a consistent handling of updates to main. I'd like to ask if we can move to:

Always make changes to main via PRs
Always merge those PRs via merge-commits or squash-merging

Personally, I find squash-merging cleaner because it results in a nicer history on main and makes it really easy to revert things. I do understand that not everybody is a fan of squash-merging so I am also fine with merge-commits.

Essentially, what I am asking is to avoid direct pushes to main and forward-merges of PRs.

algesten commented 1 month ago

Ok, I enabled this rule, because I don't like merge commits. Though I don't think that's what you're talking about.

Always make changes to main via PRs

I'm open to a discussion on how I do this.

From my perspective, maintaining an open source project involves a certain amount of janitor tasks, like keeping on top of documentation, maintaining the change log etc. In my experience from elsewhere (ureq), it's difficult to perform this role with a strict "everything must be a PR" rule – if you're unlucky, when about to do a release, no one is around to approve something silly like a version bump in Cargo.toml, and a task that should take minutes, ends up taking days due to focus switch, timezones etc.

This project has more collaborators, so I'm ok with trying. However as a counter question, I don't understand how maintaining a fork becomes easier or harder by me doing these janitor commits, since I'm obviously not rewriting git history/force pushing.

Always merge those PRs via merge-commits or squash-merging

My goal in any project is that the final git log should make sense – the linux git log being the goal. However as a maintainer this is hard to enforce. PR reviews are typically not made commit-by-commit, but rather the entire diff – which means it's an extra step to try to convince people to fix their commit history (I think many don't even know how to do it). I also find it unworkable to ensure every PR is about exactly one thing. It's the ambition of course, but "drive by fixes" to code you're working in, but isn't exactly your task, is an essential part in code hygiene. Though such fixes should of course be discrete commits. That in itself though means squash-commits as a rule is not necessarily right.

It sounds like you're advocating for merge or squash, and against rebase?

On a philosophical level, I believe in git, not github. Which means for me, PRs are a secondary concern, but I'm balancing that against the reality I'm in.

thomaseizinger commented 1 month ago

However as a counter question, I don't understand how maintaining a fork becomes easier or harder by me doing these janitor commits, since I'm obviously not rewriting git history/force pushing.

I discovered that I can do git revert --no-commit with a ref-range which made it easy to create a commit in our fork that essentially reverts an entire PR, despite the lack of a merge commit.

From my perspective, maintaining an open source project involves a certain amount of janitor tasks, like keeping on top of documentation, maintaining the change log etc. In my experience from elsewhere (ureq), it's difficult to perform this role with a strict "everything must be a PR" rule – if you're unlucky, when about to do a release, no one is around to approve something silly like a version bump in Cargo.toml, and a task that should take minutes, ends up taking days due to focus switch, timezones etc.

I can very much relate. I used to be a full-time maintainer of https://github.com/libp2p/rust-libp2p with the other maintainer being in a completely different timezone. We worked around things like these with automation (being able to get an approval via mergify by setting a label). However, these things also accumulate and I understand the desire to keep things as lean as possible.

Always merge those PRs via merge-commits or squash-merging

My goal in any project is that the final git log should make sense – the linux git log being the goal. However as a maintainer this is hard to enforce. PR reviews are typically not made commit-by-commit, but rather the entire diff – which means it's an extra step to try to convince people to fix their commit history (I think many don't even know how to do it). I also find it unworkable to ensure every PR is about exactly one thing. It's the ambition of course, but "drive by fixes" to code you're working in, but isn't exactly your task, is an essential part in code hygiene. Though such fixes should of course be discrete commits. That in itself though means squash-commits as a rule is not necessarily right.

I support this goal and I think we can achieve it fairly easily:

Squash merge PRs by default. IMO, an atomic commit on main is better even if it means that drive-by fixes are squashed in. We can still allow forward-merges for PRs with an (exceptionally) clean history. We can judge that on a case-by-case basis.
You can enable branch protection to always require PRs to make changes and also enable for administrators to override this. This will allow you to still push to main. It will also allow you to force-merge a PR despite failing checks for example.
GitHub actually has a setting to make the PR title + description the final commit message for squash-merged PRs. This makes it quite easy for us (thank you for the collab invite btw!) to edit a PR such that it ends up being a nice commit message on main.
As a (contribution) policy, I think it is good to have main always in a releasable state. Things like changelog updates should IMO happen as part of the PR.

k0nserv commented 1 month ago

I think squash merges are sometimes to heavy-handed. It's common to have several atomic changes in a PR where that atomicity should be retained after merging. For example, if one is implementing something that requires refactoring existing code that should be two commits: one for the refactor and another for the implementation, those shouldn't then be squashed when merging.

thomaseizinger commented 1 month ago

I think squash merges are sometimes to heavy-handed. It's common to have several atomic changes in a PR where that atomicity should be retained after merging. For example, if one is implementing something that requires refactoring existing code that should be two commits: one for the refactor and another for the implementation, those shouldn't then be squashed when merging.

Yeah, if it is clearly separated like that, a forward merge is fine :)

The main thing that I am looking to do is being able to manage our fork easily. For example, I'd like to be able to revert an individual feature / change because it is undesired (or breaks something for us). So, as long as the commits in a PR are truly atomic, a forward merge is fine.

In my experience, checking that with git rebase main -x can actually be more tedious than just opening another PR with the refactoring and relying on CI. It is also easier to check for the reviewers because PRs is how GitHub works.

algesten commented 1 month ago

Squash merge PRs by default. IMO, an atomic commit on main is better even if it means that drive-by fixes are squashed in. We can still allow forward-merges for PRs with an (exceptionally) clean history. We can judge that on a case-by-case basis.

Ok. I will start to look out for this in PRs. Check whether the history is sane or more work-in-progress style. I will squash if I spot WIP.

You can enable branch protection to always require PRs to make changes and also enable for administrators to override this. This will allow you to still push to main. It will also allow you to force-merge a PR despite failing checks for example.

Sure.

GitHub actually has a setting to make the PR title + description the final commit message for squash-merged PRs. This makes it quite easy for us (thank you for the collab invite btw!) to edit a PR such that it ends up being a nice commit message on main.

Alright I'll use that when the description is good and doing a squash.

As a (contribution) policy, I think it is good to have main always in a releasable state. Things like changelog updates should IMO happen as part of the PR.

Yeah. That's tricky. I personally dislike when opensource projects add too many hoops you have to jump through when contributing (sign this NDA, fill out this form, ping this bot, update this file etc). I also prefer to write the changlog myself deciding what to bring up and not.

What if I endeavor to make my janitor updates very soon after merging something?

thomaseizinger commented 1 month ago

As a (contribution) policy, I think it is good to have main always in a releasable state. Things like changelog updates should IMO happen as part of the PR.

Yeah. That's tricky. I personally dislike when opensource projects add too many hoops you have to jump through when contributing (sign this NDA, fill out this form, ping this bot, update this file etc). I also prefer to write the changlog myself deciding what to bring up and not.

I agree! I think stuff like changelog updates isn't much of a problem.

IMO, the main thing we should strive for is that every commit on main at the minimum builds. It would also be nice if all tests would always pass but that is difficult if we want to forward-merge certain PRs. Personally, I often write and commit a failing test first. Would we consider that an okay-atomic commit?

algesten commented 1 month ago

It would also be nice if all tests would always pass but that is difficult if we want to forward-merge certain PRs. Personally, I often write and commit a failing test first. Would we consider that an okay-atomic commit?

That sounds like a nice idea!

thomaseizinger commented 1 month ago

It would also be nice if all tests would always pass but that is difficult if we want to forward-merge certain PRs. Personally, I often write and commit a failing test first. Would we consider that an okay-atomic commit?

That sounds like a nice idea!

Which part? Are we saying each commit must pass tests? :)

algesten commented 1 month ago

Yes. That's what I mean. 🙃

thomaseizinger commented 1 month ago

Great! I'll close this then as resolved :)

algesten / str0m

On Git practises #516