rust-lang / triagebot

Automation/tooling for Rust spaces
https://triage.rust-lang.org
Apache License 2.0
172 stars 74 forks source link

Workflow for tracking PRs assignment #1754

Closed apiraino closed 7 months ago

apiraino commented 9 months ago

General overview at: https://github.com/rust-lang/triagebot/issues/1753

This patch implements the first part:

No initial sync is planned at this time. Populating the table will happen over time.

This feature is enabled by adding the following to the triagebot.toml in the Rust git repository:

# Enable this feature to keep track of the PR review assignment
[review-work-queue]
apiraino commented 9 months ago

@jdno I think we can have a look at thist first step of implementation, as per the tracking issue #1753

cc: @rust-lang/infra

thanks!

apiraino commented 8 months ago

gentle ping for review @rust-lang/infra

(maybe @ehuss?)

thanks

Kobzol commented 8 months ago

CC @jackh726, if you could please take a look.

jackh726 commented 8 months ago

I'm thinking:

It's pretty trivial to get a list of all open PRs and their assignees:

Here's the graphql query (you do still have to paginate):

query {
  repository(owner: "rust-lang", name: "rust") {
    pullRequests(first: 100, states:OPEN) {
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        number
        assignees(first: 100) {
          nodes {
            login
            id
          }
        }
      }
    }
  }
}

It's probably worth just setting up a sync once on start, and again every 30 minutes, along with listening for events. I don't think only just waiting for the table to populate over time will really work.

I'd be curious to know how long it would take to run a whole "query the current state and update the DB". Probably not that long - even if we way over estimate, we should only have on the order of a 1000 PRs to deal with.

rustbot commented 8 months ago

Error: Invalid triagebot.toml at position 7:2:

TOML parse error at line 7, column 2
  |
7 | [review-work-queue]
  |  ^^^^^^^^^^^^^^^^^
unknown field `review-work-queue`, expected one of `relabel`, `assign`, `ping`, `nominate`, `prioritize`, `major-change`, `glacier`, `close`, `autolabel`, `notify-zulip`, `github-releases`, `review-submitted`, `review-requested`, `shortcut`, `note`, `mentions`, `no-merges`, `validate-config`

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

rustbot commented 8 months ago

Error: Invalid triagebot.toml at position 7:2:

TOML parse error at line 7, column 2
  |
7 | [review-work-queue]
  |  ^^^^^^^^^^^^^^^^^
unknown field `review-work-queue`, expected one of `relabel`, `assign`, `ping`, `nominate`, `prioritize`, `major-change`, `glacier`, `close`, `autolabel`, `notify-zulip`, `github-releases`, `review-submitted`, `review-requested`, `shortcut`, `note`, `mentions`, `no-merges`, `validate-config`

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

apiraino commented 8 months ago

It's probably worth just setting up a sync once on start, and again every 30 minutes, along with listening for events. I don't think only just waiting for the table to populate over time will really work.

Why wouldn't it? Can you elaborate a bit on that?

I'll think about your proposal but my immediate thought I think is that I don't agree with this. We would be losing the concept of a stable "source of truth". I am afraid that querying GH every n-minutes would introduce a point of failure in the consistency, the DB would just become a 'cache' reflecting the GH situation. Then why not just make it a full real cache serializing a json and avoid the DB table altogether?

My reasoning about having the DB table is to have a storage persistence that can be used by the PR review assignment and not only by that. I imagine a future where it could be expanded and used also for statistics, for everything we want to know about PRs and reviews.

rustbot commented 8 months ago

Error: Invalid triagebot.toml at position 7:2:

TOML parse error at line 7, column 2
  |
7 | [review-work-queue]
  |  ^^^^^^^^^^^^^^^^^
unknown field `review-work-queue`, expected one of `relabel`, `assign`, `ping`, `nominate`, `prioritize`, `major-change`, `glacier`, `close`, `autolabel`, `notify-zulip`, `github-releases`, `review-submitted`, `review-requested`, `shortcut`, `note`, `mentions`, `no-merges`, `validate-config`

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

jackh726 commented 8 months ago

We would be losing the concept of a stable "source of truth". I am afraid that querying GH every n-minutes would introduce a point of failure in the consistency, the DB would just become a 'cache' reflecting the GH situation.

The concern I have is that Github is the source of truth right now. Do you expect otherwise?

Then why not just make it a full real cache serializing a json and avoid the DB table altogether?

I mean, this isn't a bad option, to keep just everything in memory. But seems unnecessary.

I imagine a future where it could be expanded and used also for statistics, for everything we want to know about PRs and reviews.

I think this is useful, but not with the current table schema.

apiraino commented 7 months ago

Closing and restarting from scratch in #1773 ...

rustbot commented 7 months ago

Error: Invalid triagebot.toml at position 7:2:

TOML parse error at line 7, column 2
  |
7 | [review-work-queue]
  |  ^^^^^^^^^^^^^^^^^
unknown field `review-work-queue`, expected one of `relabel`, `assign`, `ping`, `nominate`, `prioritize`, `major-change`, `glacier`, `close`, `autolabel`, `notify-zulip`, `github-releases`, `review-submitted`, `review-requested`, `shortcut`, `note`, `mentions`, `no-merges`, `validate-config`

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.