bazelbuild / continuous-integration

Bazel's Continuous Integration Setup
https://buildkite.com
Apache License 2.0
259 stars 135 forks source link

bcr-pr-reviewer: Avoid spamming module maintainers #2000

Closed meteorcloudy closed 2 months ago

jsharpe commented 2 months ago

This doesn't seem to be the correct fix because the root of cause of this happening is when existing PRs get rebased AFAIK?

meteorcloudy commented 2 months ago

The bot cannot prevent PRs authors from accidentally messing up with the rebase? What it can do is to detect that too many people will be notified and it's likely an accident.

jsharpe commented 2 months ago

But surely the issue is that its comparing with the wrong base commit? a rebased PR should still only have a diff that contains the changes that the original author created and so should still only have a delta of the changes that are for the current PR; I believe this is a bug in the diff logic?

meteorcloudy commented 2 months ago

Hmm, that's a good question. But that's what GitHub shows at https://github.com/bazelbuild/bazel-central-registry/pull/2344/files and the script is just using GitHub API to list the changed files: https://github.com/bazelbuild/continuous-integration/blob/cb00eeace934a460794170a68221b16d63474bbe/actions/bcr-pr-reviewer/index.js#L11

meteorcloudy commented 2 months ago

Maybe it's possible the PR is based on an old commit on the main branch, and there will be merge conflicts.

jsharpe commented 2 months ago

Maybe it's possible the PR is based on an old commit on the main branch, and there will be merge conflicts.

but for the purpose of establishing maintainership I'm not sure this matters - you just need the list of files that have been modified to establish ownership since ownership should be based from what is in main, the fact that there is a conflict is irrelevant? I think its sufficient to note that there are changes in a given file.

I think the following psuedo operations should work to establish the changed file list:

git checkout PR_HEAD
git reset --soft MERGE_BASE
git stash
git reset --hard origin/main
git stash apply

Once you've done this then the output of git status should contain the list of files that have been changed and then maintainership for this list should be used to construct the list of users to notify?

Ultimately though this is a github API bug - I've seen this in other repos that the diff doesn't correctly recalculate in all cases when already merged commits are included in a PR that was opened before the merge occurred; seems they have some caching bugs when recalculating the diff. Maybe the above is too much effort and overhead and we should just report this upstream to github and wait on them to fix it with this PR's workaround as a stop gap until github is fixed.