rust-lang / triagebot

Automation/tooling for Rust spaces
https://triage.rust-lang.org
Apache License 2.0

Add new PR review assignment #1745

Closed apiraino closed 6 months ago

apiraino commented 8 months ago

UPDATE 2023-12-12: this second proposal was discussed with @jdno, and we agreed on a plan detailing how to split it further. I'm going to close this PR and prepare another patch according to this plan.


This is the implementation of a new workflow for assigning pull requests to the Rust project contributors.

This new workflow will assign a pull request for review based on the number of pull requests already assigned to each team member.

Every time a pull request assignment is invoked with r? <team> or r? <team_member>, the Triagebot will check the current workload of the candidates and assign the pull request to the least busy team member.
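As a rough illustration of the selection step described above, here is a minimal sketch that picks the candidate with the fewest assigned PRs. The function name `least_busy`, the map of counts, treating an unknown candidate as having zero assigned PRs, and the alphabetical tie-break are all assumptions made for the example and are not taken from this PR.

```rust
use std::collections::HashMap;

/// Hypothetical helper: return the candidate with the fewest assigned PRs.
/// Assumptions: a candidate missing from `assigned` counts as 0, and ties
/// are broken alphabetically so the result is deterministic.
fn least_busy<'a>(
    candidates: &[&'a str],
    assigned: &HashMap<String, u32>,
) -> Option<&'a str> {
    candidates
        .iter()
        .copied()
        // Sort key: (number of assigned PRs, name as a tie-break).
        .min_by_key(|name| (assigned.get(*name).copied().unwrap_or(0), *name))
}
```

An empty candidate list yields `None`, which a caller would have to handle separately, for example by falling back to the existing assignment logic.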

The new workflow is DISABLED by default. It can be enabled by setting the env variable USE_NEW_PR_ASSIGNMENT (any value) and restarting the Triagebot.

Teams that are subject to the new PR review assignment are listed as a comma-separated list in the env variable NEW_PR_ASSIGNMENT_TEAMS. See .env.sample for a usage example.

Team members' workload is tracked in a new DB table, review_capacity.
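To make the two settings concrete, here is a minimal sketch of how they could be read. Only the variable names USE_NEW_PR_ASSIGNMENT and NEW_PR_ASSIGNMENT_TEAMS come from the description above; the `AssignmentConfig` struct, the function names, and the trimming of whitespace and empty entries are assumptions for illustration.

```rust
use std::env;

/// Hypothetical container for the two settings.
#[derive(Debug, PartialEq)]
struct AssignmentConfig {
    enabled: bool,
    teams: Vec<String>,
}

/// Pure parsing step, kept separate from the environment for testability.
/// The flag counts as enabled when it is set to any value; the team list is
/// split on commas, with whitespace and empty entries dropped (assumption).
fn parse_config(flag: Option<&str>, teams: Option<&str>) -> AssignmentConfig {
    let teams = teams
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|team| !team.is_empty())
        .map(String::from)
        .collect();
    AssignmentConfig { enabled: flag.is_some(), teams }
}

/// Read both variables from the process environment.
/// Note: a value that is not valid Unicode is treated here as unset.
fn config_from_env() -> AssignmentConfig {
    let flag = env::var("USE_NEW_PR_ASSIGNMENT").ok();
    let teams = env::var("NEW_PR_ASSIGNMENT_TEAMS").ok();
    parse_config(flag.as_deref(), teams.as_deref())
}
```

With this sketch, leaving both variables unset gives a disabled configuration with an empty team list, which matches the "DISABLED by default" behaviour described above.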

Both the initial population and the synchronization of this table are handled by a command line tool that will speak to the Triagebot through a number of HTTP endpoints. These HTTP endpoints must be PRIVATE and accessible ONLY to the sync tool. This should be handled at the infrastructure level, because the endpoints are not authenticated.

apiraino commented 7 months ago

cc: @Kobzol @jdno since we discussed the refactor of #1719

This PR is in a draft state and waiting for feedback about the client that will feed and sync the database of team members' workload.

To recap: I need a little nudge from T-infra on whether the approach of using HTTP API requests to populate the DB is correct and compatible with the current infra.

thanks 🙂

jdno commented 7 months ago

Would it be possible to break this apart even further? I'm wondering if we can first change the assignment logic to look purely at the number of currently assigned PRs, then consider the preferences (if they exist) in the second step, and finally figure out how to set them in a third PR.

The motivation for this is to keep each individual pull request small. And as I understand it, we need to consider the case when a user hasn't provided preferences yet anyways.

apiraino commented 7 months ago

hey @jdno yeah I get your point about splitting and in principle I totally agree (thanks for checking my PR btw 🙂).

first change the assignment logic to look purely at the number of currently assigned PRs

Upon reviewing my code again, I am not sure it would work great. My understanding is that there is no place in the entire Rust infra where we store the information about how many PRs a contributor has assigned at any given time. The DB table I am introducing here would be the first place where we persist this piece of info. So far, is everything I say correct? I hope I am not missing anything.

Now, without that DB table I would be forced to retrieve the data dynamically. That means that every time a PR is assigned (or reassigned) the triagebot would have to make a number of expensive HTTP calls to GitHub to calculate the current number of PRs assigned to all members of a team.

It would then find the least busy team member, assign them the PR, throw away the result and start all over again at the next PR assignment.

Am I understanding your point correctly?
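For contrast with recomputing everything on each assignment, here is a minimal sketch of keeping a running tally that is updated whenever a PR is assigned, unassigned or reassigned. It uses an in-memory map as a stand-in for a persisted table; `WorkloadTally` and its methods are hypothetical names and do not reflect the actual schema or code in this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a persisted per-reviewer counter.
#[derive(Default)]
struct WorkloadTally {
    assigned: BTreeMap<String, u32>,
}

impl WorkloadTally {
    /// Record that a PR was assigned to `reviewer`.
    fn assign(&mut self, reviewer: &str) {
        *self.assigned.entry(reviewer.to_string()).or_insert(0) += 1;
    }

    /// Record that a PR was taken away from `reviewer`; never goes below 0.
    fn unassign(&mut self, reviewer: &str) {
        if let Some(count) = self.assigned.get_mut(reviewer) {
            *count = count.saturating_sub(1);
        }
    }

    /// Move one PR from `from` to `to`.
    fn reassign(&mut self, from: &str, to: &str) {
        self.unassign(from);
        self.assign(to);
    }

    /// Current number of PRs assigned to `reviewer` (0 if unknown).
    fn count(&self, reviewer: &str) -> u32 {
        self.assigned.get(reviewer).copied().unwrap_or(0)
    }
}
```

With a tally like this, each assignment costs one lookup per candidate instead of a fresh round of HTTP calls, at the price of having to keep the stored counts in sync with GitHub.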

And as I understand it, we need to consider the case when a user hasn't provided preferences yet anyways.

This would be taken care of by the DB table defaults until we have the backoffice where people can set their own prefs.
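One way to picture "defaults until people set their own prefs" is a preferences type whose default value applies when nothing is stored for a user. The sketch below is purely illustrative: the `ReviewPrefs` struct, its two fields and the chosen default values are hypothetical and are not the columns of the table in this PR.

```rust
/// Hypothetical per-reviewer preferences; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct ReviewPrefs {
    /// Upper bound on concurrently assigned PRs; `None` means no limit.
    max_assigned_prs: Option<u32>,
    /// Whether the reviewer currently accepts new assignments.
    available: bool,
}

impl Default for ReviewPrefs {
    /// Assumed defaults for a user who has not set anything yet.
    fn default() -> Self {
        ReviewPrefs { max_assigned_prs: None, available: true }
    }
}

/// Use the stored preferences if present, otherwise fall back to defaults.
fn effective_prefs(stored: Option<ReviewPrefs>) -> ReviewPrefs {
    stored.unwrap_or_default()
}

/// Check whether a reviewer with these preferences can take one more PR.
fn can_take_review(prefs: &ReviewPrefs, currently_assigned: u32) -> bool {
    prefs.available
        && prefs
            .max_assigned_prs
            .map_or(true, |max| currently_assigned < max)
}
```

Under these assumed defaults, a user without stored preferences is always eligible, so the assignment falls back to comparing the number of assigned PRs only.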