channable / hoff

A gatekeeper for your commits
Apache License 2.0
41 stars 3 forks source link

Repeated merge commands trigger confusing message and corrupt queue state #233

Closed fatho closed 1 year ago

fatho commented 1 year ago

When hoff appears stuck (e.g. due to a missed webhook), a common first reflex might be to trigger hoff again with another merge command. That is not how hoff processes PRs, however, and such repeated merge commands should actually have no effect if the build is still in progress.

The initial merge would trigger a message like

Rebased as acdc1234, waiting for CI …

Then, after more PRs have queued up, but hoff appearing stuck because CI never picked up the build, someone repeats the merge command and hoff prints:

Speculatively rebased as acdc1234 behind 5 other PRs, waiting for CI …

Evidently, it didn't actually rebase the branch again, because the commit hash is still the same, despite (according to the message) there now being more PRs before it.

Interestingly, this immediately triggered the PRs queued after (which were already done on CI) to get merged, including the commit from the original stuck PR.

But since hoff never got a CI update, it was still stuck and blocking the queue. Closing and reopening took it from the queue, but the commit has already been merged to master.

So it appears that issuing a repeated merge command while hoff is already processing a PR corrupts the internal state, moving the PR to the end of the internal queue, but keeping the commit at its original place in the merge train.

Maybe the fix is as easy as rejecting any commands received while a PR is being processed.

robbert-vdh commented 1 year ago

It's been a while since I looked at Hoff but IIRC accepting merge approval commands when a merge was already approved was intended behavior because it would let you change the approval type (merge, merge and tag, merge and deploy to X).

If you take the CI webhooks out of the picture, repeating the merge command should be safe. I've done this before locally without issues. There is only one test case (partially) covers this though:

https://github.com/channable/hoff/blob/957019c795ca8a78640498cf64f5c9b60aaf3f4f/tests/Spec.hs#L3008-L3022

crtschin commented 1 year ago

So in essence, I believe re-issuing the merge command caused the approval order to desync from the commit integration history. We can either

  1. Not set the approval order variable for a PR if it is already being integrated, but do update the merge command used itself, so we still support upgrading merges.
  2. Ignore any merge commands received while a PR is already in the integration queue.
  3. We can do something like retry and reissue the merge command, but remove the PR from its position in the queue and re-add it at the end. So essentially the same as close PR -> re-open PR -> issue merge command

1 and 2 don't quite solve the CI webhoook not being received problem, but they do avoid the multiple merge command issue. 3 causes new webhooks to be received so fixes it both problems, so I'll inclined to go for that. WDYT?

fatho commented 1 year ago

I think (3) has the most straightforward semantics indeed, and does what people would expect it to do.