Source for frequent "unstable statuses" errors: PR is ready for merging

rdohms commented 1 month ago

Prerequisites

[X] I have written a descriptive issue title
[X] I have searched existing issues to ensure the issue has not already been raised

Issue

I would not call this a bug report or a feature request; maybe it's just an interesting finding that we could learn from and implement.

I set up a simple repository to try out this action. It basically has lots of dependencies and a true == true test suite, so it frequently has dependency updates, and the tests always pass to see the overall behavior.

I often see the Pull request is in unstable status error. And I was trying to figure it out. I had required checks, so that was not the issue.

This issue on another repo mentions that the API will also return this error as "all required checks are already green." So if this action takes longer (I'm running it in an isolated workflow and using auto-merge) than all other checks, it will try to auto-merge and fail because merging is already possible. You can also see this in the UI, as once the PR is ready for merging, the button becomes "merge," not "auto-merge."

One way to reduce this would be to flip approval and auto-merging. If we only approve the PR after we have set auto-merge, it will happen less often as requiring approval is frequently adopted. It won't always fix the issue, but it reduces the chances of it happening. Some food for thought: it's also not a straightforward decision.

simoneb commented 1 month ago

Can you please share the repro-repo?

rdohms commented 1 month ago

Alternatively, we would build in a fallback "if merging is already possible, skip auto-merge, just merge".

Let me transfer the repo to outside work and see if I can get it to be consistent, this whole thing is aggravated by our runner configuration.

simoneb commented 1 month ago

Let us know when you have a repro and we can take a look.

rdohms commented 1 month ago

Ok here is a reproducible PR/Repo: https://github.com/rdohms/dpdbot-test/pull/4

I have made the workflow that runs auto-merge run for 180s so it will always finish after everything else is finished. It would be similar to running it in the same workflow but after all steps. The key here is to use auto-merge as the defining factor.

Possible solutions:

From the calling workflow: We could do a gh pr status and check the status, thus deciding to flip the config of auto-merge on or off. However, this would still be prone to race conditions.
Inside the action code, when we call setAutoMerge, we could handle the error above and try a merge, as it would either pass (everything is okay) or fail the same way if something really wrong is happening.

simoneb commented 1 month ago

The behavior you're seeing is expected, and it's described in the readme. On the other hand I can understand that the scenario you're trying to achieve is a reasonable one, so I'm happy to consider your suggestions. If, once you're about to enable auto merge, the PR is already in a state that can be merged, just merge it.

I believe the first solution that you suggest, assuming that gh pr status gives a response that we can univocally interpret as "ready to merge", is the most obvious. If the PR is ready to merge, merge it already instead of trying to enable automerge, which will fail anyway.

rdohms commented 1 month ago

Yeah, I wish Github would return a more predictable error from the Graph API, or even better, offer this up in a REST API with some better error exceptions.

gh pr status will tell you if a PR is approved and all required checks are green. However, that is not 100% of the data needed; it would reduce chances but not solve them.

PRs from API do have a mergeable property. I remember finding an edge case for this, but can't recall what it was. I'll play around with it a bit.

rdohms commented 1 month ago

I have been able to write a workflow that does the check as we talked about and adjusts the configuration, but I hit some snags. I'm dropping all the info here just for everyone following along so we can keep context.

The check I wrote:

      - name: Merge or Auto-Merge
        id: use-auto-merge
        uses: actions/github-script@v7
        with:
          result-encoding: string
          script: |
            const pr = await github.rest.pulls.get({
              pull_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
            });
            return pr.data.mergeable === 'MERGEABLE' && pr.data.merge_state_status === 'UNSTABLE';

It works in terms of getting the data. However, even though, according to the previous errors, this should be the combination, the "mergeable" state is actually not accurate. For example, a PR with checks still running will show as "mergeable" as it only talks about the state of git (i.e., no conflicts).

So this leads to a proper flip to merge instead of auto-merge but then returns an error, as GH reports running/failed checks.

What I had done in some previous code is to get a list of running checks and maybe a list of required checks and leverage that information. I'm gonna explore these options now to see how complex this gets.

It all keeps pointing back to GH; there is just no way of saying "PR is ready" via API data. Maybe the answer to this whole endeavor is "don't run auto-merge as a separate flow," or maybe "auto-merge" should be triggered on an event instead, not on PR events.

simoneb commented 1 month ago

Thanks for the follow up. How about the alternative solution you had suggested? Try merging and if it fails, enable automerge. Feels a little forced as a solution, but do you see any downsides in doing that?

rdohms commented 3 weeks ago

So, the final solution that worked for us, without touching the actual code of the action, was to check for running checks.

- name: Merge or Auto-Merge
        id: use-auto-merge
        uses: actions/github-script@v7
        with:
          result-encoding: string
          script: |
            const checks = await github.rest.checks.listForRef({
              ref: context.payload.pull_request.head.ref,
              owner: context.repo.owner,
              repo: context.repo.repo,
            });
            const runningChecks = checks.data.check_runs
              .map(check => check.status)
              .filter(status => status === 'in_progress' || status === 'queued')
              .length;
            return runningChecks > 1;

The logic here is:

if # of checks queued or running is 1: the only thing left to run is this check, so auto-merging will not be available
if # is higher, there may still be a required step running, so try auto-merging.

It still leaves a margin of error if the running check is not required, but it covers 90% of cases.

I still think the ideal scenario would be for the action code to try auto-merging and upon an "unstable" attempt to merge directly. You could go a bit more complex and get all the data that can validate the status, but there is no easy endpoint for that.

simoneb commented 2 weeks ago

This may work, but it has an inherent race condition, which could lead to either false positive or false negatives, neither of which is desirable.

You may try to enable automerge on a PR whose checks have already finished (because they may be running when you check, but finished when you try to enable automerge), or you may try to merge a PR whose checks haven't for whatever reason started when you check them, but are running when you try to merge.

I appreciate that neither of this is extremely likely to happen, but I wouldn't find it very compelling to implement a solution that we already know has an inherent logical problem in it.

See what I mean?

fastify / github-action-merge-dependabot

Source for frequent "unstable statuses" errors: PR is ready for merging #653

Prerequisites

Issue