OpenSlides / openslides-meta

MIT License
0 stars 12 forks source link

Discuss merge strategy #24

Closed jsangmeister closed 8 months ago

jsangmeister commented 9 months ago

Some background from what I've gathered so far: When creating a pull request from a fork, GitHub automatically creates a hidden branch in the base repository which mirrors the pull request branch on the fork. This means that if the commit from the fork is checked out in some submodule, this will work out of the box, the the same commit hash also exists in the OpenSlides meta repository.

However, when the pull request is merged via "Squash and merge", the commit in the main branch will get a new commit hash. Therefore, as soon as the pull request is closed (or the branch on the fork is deleted, not sure what exactly the trigger is), the original commit hash does no longer exist in the OpenSlides repository. All pull requests pointing to it would need be updated, which quickly becomes tedious.

We have multiple options how to deal with this:

1) Just live with it.

The problem with this is that after the CI ran successfully once, e.g., in a backend PR with a new meta repo version, we currently have no way of getting notified if the checked-out commit still exists, meaning one can just merge a PR with an invalid meta commit hash. This would probably lead to some errors in the main branches, which is not what we want.

2) Switch to "Rebase and merge".

The clear advantage is that commit hash stay the same when the commit is merged into main, therefore all external PRs, e.g., in the backend stay valid. The disadvantage is that we no longer have 1 commit = 1 PR with a link to the PR. We could think about enforcing the "1 commit" rule via the CI, but this still does not automatically link the PR in the commit (although that is probably not as important) and this still leads to unnecessary work, because one would have to do the squashing manually. The only advantage would be that the developer is (hopefully) automatically reminded to update their external pull requests, but even that is not guaranteed.

Another option would be to EXCLUSIVELY push new commits to PRs via git commit --amend to ALWAYS have one commit on each PR. I assume this is hard to enforce, but has the advantage that when you update the files in the meta repo, you probably automatically also update the external PR, as the changes were necessary there.

To summarize, I would favor the "Rebase and merge" approach while dropping the "1 PR = 1 commit" rule. Simultanoiusly, we could try to follow our guidelines as meticulously as possible for commits in PRs for the meta repo and use git commit --amend when applicable to at least have "1 commit = 1 task".

rrenkert commented 9 months ago

I would also prefer the "Rebase and merge" strategy. Does the scenario only come up if the models.yml changed? If yes, we can use the git commit --amend only for changes in those commits changing the models.yml, but I'm not a fan of this command.

peb-adr commented 9 months ago

AFAIK rebase will also create new commits just that the individual diffs are preserved (as opposed to squashing). Since parent pointers change and the rebased commits are also newly created at a later time the hashes will change.

So even though I don't think I thoroughly understood the problem outlined above, I also don't think rebasing PRs is the solution. To me "live with it" and avoid checking out open PRs sounds like the way to go.

jsangmeister commented 9 months ago

Oof, that's bad... Seems like a big miss of a feature from GitHub's side. Please all upvote https://github.com/orgs/community/discussions/4618 to maybe someday get this feature. How about switching providers? ;D

Another option is to use the simple "Merge" strategy, which at least does what we want and keeps the commit hashes. It also creates unnecessary merge commits, but I'm more inclined to live with those than with errors because of missing PR updates. Maybe we could also find an automatic solution via GitHub actions to update existing PRs or rerun the CI, but I would have to look into that and it would probably be very tricky.

peb-adr commented 9 months ago

I think now I understand a bit better what the premise of this issue is getting at.

Yes I think allowing "true" merge commits for PRs in this repo would be a solution that would work. However I (personally) prefer not having merge commits with one branch not being permanently visible, i.e. the PRs branch which is deleted sooner or later.\ Also it's not in line with our other repos.

So I would propose another way of making changes to this repository.

Since this repository is like the defining structural contract between the services, PRs should be treated like an amendment to this contract. Meaning all parties (i.e. all concerning developers) must "sing off" on it (i.e. approve the PR) and it get's merged before any implementation happens.\ Now all the services are still referencing the old meta version anyways - declaring they are not yet implementing the new contract. As they now implement this they update their reference (i.e. openslides-meta submodule) to the new commit that is the previously agreed upon and merged PR.

This would still work using squash merges and will also

avoid checking out open PRs

Also I think it is very much in line with how this repository was conceptualized.

peb-adr commented 9 months ago

Simultanoiusly, we could try to follow our guidelines as meticulously as possible for commits in PRs for the meta repo and use git commit --amend when applicable to at least have "1 commit = 1 task".

Also this wouldn't work for similar reasons as the rebase. --amending a commit is creating a completely new commit based on the last but with the additional change.\ -> It will also create new hashes

jsangmeister commented 9 months ago

Since this repository is like the defining structural contract between the services, PRs should be treated like an amendment to this contract. Meaning all parties (i.e. all concerning developers) must "sing off" on it (i.e. approve the PR) and it get's merged before any implementation happens. Now all the services are still referencing the old meta version anyways - declaring they are not yet implementing the new contract. As they now implement this they update their reference (i.e. openslides-meta submodule) to the new commit that is the previously agreed upon and merged PR.

The problem with this is that you run into problems when multiple changes should be implemented simultaniously. If you look at the current state of the meta repo PRs, there are about a dozen different features currently being implemented. If we would follow your strategy, these would all be merged by now, in some order. When e.g. the backend then wants to work on a single feature (say, the newest commits), all previous feature commits which are currently being implemented in some other backend PR are blocking this. One would have to go ahead and manually rebase and remove the unwanted commits from the meta repo to able to work on the feature in question without the other changes to the models.yml in the way.

Basically, merging everything first before implementing anything means that the linear commit history reverts all advantages of havin PRs and branches in the first place. Therefore I think the only solution is to merge the meta repo PRs only when everything is already implemented. The only remaining question is how to do this merging in the most convenient way.

jsangmeister commented 9 months ago

Simultanoiusly, we could try to follow our guidelines as meticulously as possible for commits in PRs for the meta repo and use git commit --amend when applicable to at least have "1 commit = 1 task".

Also this wouldn't work for similar reasons as the rebase. --amending a commit is creating a completely new commit based on the last but with the additional change. -> It will also create new hashes

Yes, but actively amending a commit means that something changed anyway, so you will most likely update your PR in e.g. the backend together with this change. This at least mitigates the risk of "orphan" meta repo commits in PRs.

jsangmeister commented 9 months ago

Another option @bastianjoel reminded me of again when merging the los-extension branches: Only Github is missing the "fast-forward" merge feature. Via the console, it is easily possible to do a fast forward merge and push it to main. Github even automatically recognizes the related PR and closes it. This is by far my favorite solution now - just prevent merges via github alltogether and simply merge via command line, as we then get to keep the commit hashes intact (no other service PR has to be updated) as well as prevent merge commits or similar.

jsangmeister commented 8 months ago

As disussed: We will use the normal merge option with merge commits because the commit hashes are kept this way. The merge commits are a necessary evil as merging via command line would add more complexity and anti-usability. I already changed the available merge strategies, so I'll close this.