llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.63k stars 11.36k forks source link

Write a guide for doing "Stacked Reviews" with GitHub Pull Requests #56636

Open tstellar opened 2 years ago

tstellar commented 2 years ago
llvmbot commented 2 years ago

@llvm/issue-subscribers-infrastructure

rengolin commented 2 years ago

I'd like to challenge the idea that we need to "provide similar functionality to Phabricator's "Stacked Reviews"".

We need a good way to do stacked reviews in Github. It already has a way to review multiple commits in a single PR by, well, just reviewing a branch that has more than one commit.

It may not be a good way, it may need some restraint with force-pushes, but that is the natural way PRs do stacked reviews.

Trying to emulate Phab is leading us into the wrong path, trying to force a feature that doesn't exist in GH and will not solve the problem we're trying to solve: make it easier for external people to put their code for review in LLVM.

If we end up with a convoluted set of rules, being on GH will make no difference to how hard it is to do code review in LLVM, but it will definitely make the lives of current reviewers hard, forcing us to learn (and teach) a whole new way of doing things that are very likely slightly worse than previously (because it's more unnatural).

joker-eph commented 2 years ago

It may not be a good way, it may need some restraint with force-pushes, but that is the natural way PRs do stacked reviews.

This does not fit many of the requirements: in particular shard the discussion with multiple reviewers that care only about a subset of the contribution, handle this through the lifecycle of the contribution and its review, the ability to get the subset of the contribution approved and merge it independently of the subsequent patches.

While I can agree that the common trap is to want to have something too close to "the way it is done right now", throwing away all requirements is not appropriate either.

rengolin commented 2 years ago

throwing away all requirements is not appropriate either.

I'm not sure where in my reply you managed to find that meaning.

tschuett commented 2 years ago

Maybe Stacked View is misleading. You want to develop a larger feature that has to be split into smaller pieces. GitHub supports task lists in issues: https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists Rust calls it tracking issue: https://github.com/rust-lang/rust/issues/94829

joker-eph commented 2 years ago

The difference with a "task lists" or "tracking issue" as I understand it is that in this case the PRs can be reviewed and submitted totally disconnected. That is two PRs don't have dependencies and opening one does not rely on the code of the other. Alternatively, you can accomplish a "task list" by waiting for the first PR to be merged before opening the following one that depends on it.

Stacked PRs allows to open and review dependent changes separately as a pipeline of contributions.

tschuett commented 2 years ago

The first SPIRV part was a Stacked Review and it took months until it was committed atomically. Maybe this not the right approach for throughput.

hubert-reinterpretcast commented 2 years ago

This does not fit many of the requirements: in particular shard the discussion with multiple reviewers that care only about a subset of the contribution, handle this through the lifecycle of the contribution and its review, the ability to get the subset of the contribution approved and merge it independently of the subsequent patches.

I am hopeful that using a single PR (with improvements) for stacked reviews would be workable if GitHub first finds a way to handle re-associating in-line comments on rebases/amends. Maybe if they had a force-push UI where the user has to tell it where the commits with attached comments appear in the new branch.

After that, maybe something like using subtopic tags on individual commits would allow for filtering/approval by tag. Once a PR is approved for some tag, the author reorders their commits so the approved part sits on top of the base branch and the UI would be made to provide merge options that land just that part.

asl commented 2 years ago

I am hopeful that using a single PR (with improvements) for stacked reviews would be workable if GitHub first finds a way to handle re-associating in-line comments on rebases/amends. Maybe if they had a force-push UI where the user has to tell it where the commits with attached comments appear in the new branch.

I would not rely on assuming GitHub will do anything :)

asl commented 2 years ago

On Discourse there is an indication that graphite.dev handles stacked reviews better. It would be great to investigate this possibility. https://docs.graphite.dev/getting-started/the-graphite-workflow

rengolin commented 2 years ago

Two questions on that doc:

  1. It's unclear to me by reading it, but is that creating branches on the main repo or the developer's fork?
  2. That allows (encourages?) merging one PR of the stack before others. Is there a way to block that and force merging all or none?
dwblaikie commented 2 years ago

That allows (encourages?) merging one PR of the stack before others. Is there a way to block that and force merging all or none?

That (merging one patch of the stack before others) is the status quo with Phab, right? We don't have a way to block/require an all-or-nothing approval on a patch series that I know of?

rengolin commented 2 years ago

That (merging one patch of the stack before others) is the status quo with Phab, right? We don't have a way to block/require an all-or-nothing approval on a patch series that I know of?

Correct, no way to block, other than asking people nicely.

I'm (just a little) concerned with their documentation. If we point developers to that, they'll assume we're also fine with that kind of operation.

It would be nice to have a way to block, it, but not a deal breaker.

dwblaikie commented 2 years ago

Yeah, given the challenges of this migration already - it's probably important to consider improvements to the workflow as out-of-scope for the transition, since there are enough functionality changes/losses (depending on perspective/particular workflows) that finding workarounds/retraining people for those is enough of a challenge :s - good things to keep in mind for wherever we end up/when we end up there to keep improving things

nhaehnle commented 2 years ago

@llvm-beanz started another related discussion on Discourse: https://discourse.llvm.org/t/rfc-revisiting-linear-history-vs-merge-commits/64873/2

tstellar commented 1 year ago

The most simple implementation of this would be just put links to child pull requests in the initial comment of the parent pull request. My questions are:

joker-eph commented 1 year ago

"links to child" isn't half the problem, here are more interesting questions I think:

tstellar commented 1 year ago
* what is the review lifecycle? That is: if you have to update the 2nd PR in the stack, what do you do for PR # 3 and PR # 4?

How does this work in Phabricator?

jh7370 commented 1 year ago

The most simple implementation of this would be just put links to child pull requests in the initial comment of the parent pull request. My questions are:

* Does this simple implementation provide any value at all over not having stacked pull requests?

It's not clear to me in what form the child PRs will take in this case. For example, if they are just another PR based off on main, then they'd be the first PR plus extra commits. Links from parent PRs to child PRs would be useful though. I believe you could achieve this by stating in the child something akin to "Depends on #XXX" which would generate a mention in the parent PR. This functionality already exists in Phab (literally adding Depends on DNNNNNN creates a parent/child link).

* What features does the stack review feature in Phabricator offer beyond just providing links between the different commits?

Ability to create a patch that physically depends on another in-flight patch, whilst being able to review and approve them all separately. The existing GitHub PR system allows you to review multiple or individual commits within a PR, but all the messages end up getting mixed up in the Overview tab, and you can only approve the PR as a whole (plus there are other issues regarding how to handle fixup commits/force pushes etc). To be more precise, the "stack" system itself isn't required for this, since Phab allows you to upload arbitrary diffs -that isn't true in GitHub, because PRs are based on real commits and branches in a repo somewhere.

Additionally, tools can read the stack to determine how to handle stacked PRs. For example, Phab pre-commit testing does this to ensure it has applied the patch on top of the correct base patch. I guess with sufficient adaptation, the tooling could be adapted to read these links from the PR descriptions, so this one isn't really an issue.

* what is the review lifecycle? That is: if you have to update the 2nd PR in the stack, what do you do for PR # 3 and PR # 4?

How does this work in Phabricator?

I think it's entirely up to the contributors involved. I've had cases where I've bothered to update the downstream PRs and cases where I haven't - to some extent it depends on how much of an impact updating PR 2 has on 3 and 4. If I haven't updated 3 and 4, e.g. because the change in 2 doesn't have a direct impact on them, then I usually don't bother ever doing it in the UI, and only rebase locally prior to pushing the sequence. With GitHub, you can do the same, except that you usually have to rebase the later PRs prior to merge at the GitHub end (because GitHub is the thing doing the merging, and landing the previous commit would usually end up with a disjoint history with the later PRs relying on a commit that didn't actually ever land - GitHub rewrites the commit title by appending the PR link to the title).

joker-eph commented 1 year ago

I think Phabricator would be roughly equivalent of:

So assuming that locally it looks like: commit A -> commit B -> commit C -> commit D (HEAD).

You can have 4 pull-requests:

git push origin HEAD~3:users/mehdi_amini/feature_1 # commit A (the oldest)
git push origin HEAD~2:users/mehdi_amini/feature_2 # commit B
git push origin HEAD~:users/mehdi_amini/feature_3 # commit C
git push origin HEAD:users/mehdi_amini/feature_4 # commit D (the newest)

That creates 4 remote branches, and then you open:

Each PR only shows its own commit, focusing the review on it. You can checkout locally any PR and you get all the ones it is based on as well. When the first PR is merged, the second one is automatically retargeted to main I believe.

Working through the lifecycle of the stack is all scriptable, for example skim through the readme here: https://github.com/ejoffe/spr (I have't tried this particular script).

I would add that this is even somehow superior to Phabricator in how the PR are naturally connected and checking out the code for one ensures self-consistency, the only part where GitHub is lacking compared to Phab (I think) is the UI and the way it tracks comments across updates.

nhaehnle commented 1 year ago

I think Phabricator would be roughly equivalent of:

* keeping a single branch locally

* push individual commits of the current branch to individual remote branch in the LLVM repo (not a fork)

* open cascading pull-requests (this can't work when branches are in a fork I believe)

* when updating a pull-request, it is locally a rebase+amend workflow: the branch locally always looks like the final state.

* force-pushes are used to update the remote branches in the LLVM repo.

I have some limited experience with something similar in a different (downstream from LLVM) project where branches in the main repository are not allowed. The key difference is:

It's annoyingly manual, but it does work when you keep a clean commit history (as we should!) because GitHub's diff view allows you to select an individual commit to look at when reviewing. (It doesn't model the idea of approval of an individual commit, so it's a matter of manually writing something like "Patch N ("headline of patch") LGTM."

By the way, to make this workflow a bit more pleasant on the reviewer's side (which is arguably the most important part in all of this, since review bandwidth is a serious bottleneck), I wrote this tool which does take some getting used to but provides reasonably clean diffs after rebases, and can also act as an IMHO nicer alternative to git range-diff: https://git.sr.ht/~nhaehnle/diff-modulo-base

tru commented 1 year ago

Have anyone seen or evaluated Ghstack ?

that seems like it could fit the bill in theory.

rengolin commented 1 year ago

From their README:

Make sure you have write permission to the repo you're opening PR with.

Seems they only automate the process, don't bring new features.

smeenai commented 12 months ago

I've seen it used for pytorch successfully, but yup, it requires the ability to create new branches in the repo you're sending the PR too. Would we consider relaxing the "no branches in the main repo" restriction to enable stacked PRs?

joker-eph commented 12 months ago

Would we consider relaxing the "no branches in the main repo" restriction to enable stacked PRs?

I believe this is something that was identified as the best tradeoff in one of the issue tracking stacked PRs. Ideally we should have naming convention for the allowed branches, some "garbage collection" mechanisms (ensuring outdated PR gets closed / branches gets deleted). We could also use a refs namespace that isn't cloned/fetched by default by git I think?

smeenai commented 12 months ago

I played around with https://github.com/ejoffe/spr and https://github.com/ezyang/ghstack. (Disclaimer: ghstack was developed for Pytorch by a Meta engineer, and I also work at Meta, but I'm not affiliated with the ghstack author or Pytorch otherwise.)

ghstack is installed via pip. spr can be installed via Homebrew or apt, and also has precompiled binaries available. I'm not sure if one of those is generally preferred for corporate environments. Neither of them require root to install, and Python 3.4 and newer guarantee pip being present.

spr relies on the gh CLI for authentication, so you can get going right away if you have that already set up. ghstack requires an OAuth token to be specified manually (though you could use the same token as for gh).

Both of them require branches to be created in the main repository, which seems to be a fundamental GitHub limitation. ghstack creates its commits under refs/heads/gh and spr under refs/heads/spr. I didn't spot any configuration option to switch those defaults, and I'm not sure if the tooling would be happy with something not under refs/heads.

ghstack has a more complex branching scheme, which is described in https://github.com/ezyang/ghstack#structure-of-submitted-pull-requests. The aim is to not lose review comments or context when rebasing, which is a problem that's been discussed here. spr does something simpler and just creates one branch per PR.

ghstack adds lines like the following to each commit:

ghstack-source-id: 68060e855f045fe40425d494ee09f07f13645051
Pull Request resolved: https://github.com/smeenai/llvm-project/pull/1

spr adds the following instead:

commit-id:e61775a8

Both are a little noisy, though the PR URL is potentially useful. I dunno if we can scrub them out automatically during the land.

The stack that I created with ghstack is: https://github.com/smeenai/llvm-project/pull/1 https://github.com/smeenai/llvm-project/pull/2 https://github.com/smeenai/llvm-project/pull/3

The stack that I created with spr is: https://github.com/smeenai/llvm-project/pull/25 https://github.com/smeenai/llvm-project/pull/26 https://github.com/smeenai/llvm-project/pull/27

Each PR has a link to the full stack. The big difference is that when updating a stack, ghstack adds new commits on top, whereas spr just rebases. You can see the difference in https://github.com/smeenai/llvm-project/pull/3/commits vs. https://github.com/smeenai/llvm-project/pull/27/commits. In both cases, I amended an existing commit and then updated the stack. ghstack automatically creates an update commit, which lets reviewers see changes made in this update (while still having the Files Changed view to review the overall PR). spr just force pushes the updated commit instead.

IMO although ghstack is a little bit harder to setup, its more complex branching scheme which preserves individual updates and avoids force pushing is a pretty killer feature. I can ask about the possibility of creating branches not under refs/heads if we want to consider moving forward with it.

hubert-reinterpretcast commented 12 months ago

Each PR only shows its own commit, focusing the review on it. You can checkout locally any PR and you get all the ones it is based on as well. When the first PR is merged, the second one is automatically retargeted to main I believe.

In my experience, while nothing has landed, this generally works. I have never seen the workflow work properly once the first PR is "merged" using "squash and merge" (which is the easiest way to maintain linear history).

The "downstream" PRs then need to be all rebased (and maybe GitHub has gotten better with inline comments and rebasing, but if it has, I haven't noticed).

smeenai commented 12 months ago

Ah, damn, ghstack requires force pushing, so that wouldn't work for us: https://github.com/ezyang/ghstack/issues/50. I'm not really sure why though; I'll ask on the issue.

smeenai commented 12 months ago

We could also use a refs namespace that isn't cloned/fetched by default by git I think?

GitHub seems to disallow pull requests from refs not under refs/heads/, unfortunately. I hacked up ghstack to try it but the GitHub API rejects the pull request creation, and I also can't figure out any way to do so from the UI (the ref just doesn't show up).

joker-eph commented 12 months ago

Thanks for trying! We should adopt a convention for these branches, and update our docs with a ref spec that would exclude them for normal fetches for day-to-day development (git clone -c remote.origin.fetch=...).

smeenai commented 12 months ago

Ah, damn, ghstack requires force pushing, so that wouldn't work for us: ezyang/ghstack#50. I'm not really sure why though; I'll ask on the issue.

I misread the code here; I think it'll end up working out for LLVM by accident (IIUC we do have branch protection for main set up, but non-admins can't see the branch protection settings, so ghstack will be fine ... I don't understand this super well though).

Thanks for trying! We should adopt a convention for these branches, and update our docs with a ref spec that would exclude them for normal fetches for day-to-day development (git clone -c remote.origin.fetch=...).

Both ghstack and spr create their branches under their own refs/heads namespace, so it's straightforward to have a refspec that excludes them (at least as of Git 2.29 and negative refspecs). It seems like every user would need to configure that refspec themselves though, and hopefully we don't have a significant population of users with a git version older than 2.29?

aemerson commented 11 months ago

There's another tool also called spr which seems to be distinct from the one mentioned earlier: https://github.com/getcord/spr

I have no idea which is the best, haven't tried any of them.

joker-eph commented 10 months ago

https://github.com/getcord/spr

This looks pretty nice! They even claim trying to provide a similar experience to arc. All-in-all these tools seem pretty similar, the only thing we need to experiment is to open a namespace for user branches in the repo. To avoid any misuse of this namespace, we can have an action that collect the open branches that don't have a matching pull-request and propose to delete them (that would cover closed PR where the user does not delete the branch afterwards).

@tstellar can we open a namespace like refs/heads/users/.* for example?

tstellar commented 10 months ago

@joker-eph Do we know how this will affect the repo size? I thought there was an issue where refs that are present when a repo is forked can't be deleted or garbage collected.

joker-eph commented 10 months ago

I technically can't have a zero-impact: any single line of code has impact, in practice though it should be noticeable: we only keep branches for open pull-requests, so code that is meant to merge anyway. It's also intended to be used only for stacked PR, it shouldn't be crowded. For bots we can exclude this namespace from the refspec if we wanted to ensure it is zero impact.

For the overall repo garbage collection through forks issue, this seems no-impact since indeed it is an issue that already affects forks, where branches are pushed today.

joker-eph commented 10 months ago

@tstellar can we open a namespace like refs/heads/users/.* for example?

Ping @tstellar ? I'll happily work on scripting to auto-delete branches in such namespace after you open it.

smeenai commented 10 months ago

Do the tools support easily customizing their branch namespace, or should we just open up their default branch namespaces (e.g. refs/heads/gh and refs/heads/spr)?

joker-eph commented 10 months ago

Do the tools support easily customizing their branch namespace

I tried spr and this is the first thing the configuration asked me for.

tstellar commented 10 months ago

@joker-eph I've removed the branch creation restriction for users/*/

bb010g commented 9 months ago

Stacked Git (StGit) may also be worth consideration, though it currently doesn't support nested stacks (for bulk reordering of patches) or GitHub's API for pull requests. I haven't tested StGit with spr yet.

jroelofs commented 8 months ago

https://github.com/getcord/spr

I have been trying this one. Things have been going mostly smoothly up until the point where I start to land a stack of commits on main, then it gets a little awkward. AFAIU, you need to git checkout each commit starting with the earliest, spr land it, and then rebase the rest of the stack on the new main, manually edit the second PR's base branch to main, and repeat. Either I'm holding it wrong, or spr needs a spr land --all that does all that under the hood for you.

jh7370 commented 8 months ago

I had real trouble reviewing a PR created via SPR. Initially it was fine, but at some point they rebased the PR and this made it impossible to sensibly view what had actually changed, because commits in the "Files changed" had become linked in such a way as to make it impossible to view the important one only, and the full diff contained all sorts of completely unrelated changes pulled in from the rebase. Unfortunately, I didn't make a note of the PR number for others to look at.

nhaehnle commented 8 months ago

This kind of problem is why I wrote diff-modulo-base. The tool is admittedly a bit rough around the edges and it doesn't understand SPR -- I primarily use it with PRs that just natively have a sequence of commits and are rebased. But it could probably be extended.