palantir / policy-bot

A GitHub App that enforces approval policies on pull requests
Apache License 2.0
780 stars 108 forks source link

Don't look up approval candidates when no approval is required #808

Closed iainlane closed 3 months ago

iainlane commented 4 months ago

We always check the approval candidates when evaluating an approval policy. This process ends up making an API call to list the comments on the PR, so that the approvers given in the policy can be checked against the commenters or reviewers on the PR and only the relevant people considered.

This search currently happens regardless of whether any approvals are required. If there aren't any then we can save the API calls and a bit of time by skipping the search. PRs can be evaluated fairly frequently depending on the policy, so this can add up to a lot of API calls.

I noticed this because we had a problem yesterday where we somehow managed to run out of GitHub API quota which I didn't think would be reasonably possible 😱 . This query stood out when I checked the logs, but I'm not sure it's the end of the story.

Hopefully I'm not missing a reason why this was the way it was. :-)

bluekeyes commented 4 months ago

I noticed this because we had a problem yesterday where we somehow managed to run out of GitHub API quota which I didn't think would be reasonably possible 😱 . This query stood out when I checked the logs, but I'm not sure it's the end of the story.

Policy Bot can be pretty request-heavy. I think we've done most of the big optimizations, but I'm sure there's more stuff like this that we've missed.

Another note is that GitHub App rate limits scale based on users and repositories in the organization (up to 12,500 requests/hour.) If you work primarily in a monorepo, you may have a lower rate limit than you would if an equivalent level of development was spread across multiple repositories. GitHub Support may be able to make a custom adjustment to your rate limit in that case.

iainlane commented 4 months ago

I noticed this because we had a problem yesterday where we somehow managed to run out of GitHub API quota which I didn't think would be reasonably possible 😱 . This query stood out when I checked the logs, but I'm not sure it's the end of the story.

Policy Bot can be pretty request-heavy. I think we've done most of the big optimizations, but I'm sure there's more stuff like this that we've missed.

One other one that I've noticed we could do is to allow the default repository to be disabled. It seems like 404s aren't cached and so we do a lot of requests to check for grafana/.github having a policy and find it doesn't every time. I'll send over a PR for that shortly. (edit: this was merged in https://github.com/palantir/policy-bot/commit/550e0326d417032a6a7ca64010e592913e0f3cda - thanks!)

Another note is that GitHub App rate limits scale based on users and repositories in the organization (up to 12,500 requests/hour.) If you work primarily in a monorepo, you may have a lower rate limit than you would if an equivalent level of development was spread across multiple repositories. GitHub Support may be able to make a custom adjustment to your rate limit in that case.

Cheers for the tip. For us, it's not only one monorepo that drives activity. It is one of those but also a few of the larger OSS repos we have.

Looking into what happened a bit more, I noticed a few things:

image

I'm sending in fixes for what I can see and then continuing to observe the effects they have!

Just a brain dump 🙂

bluekeyes commented 3 months ago

The requests with no installation ID are interesting. It could be application requests using the JWT, but I don't think we're making many of those. If you have debug logging enabled, it might be helpful to look at the logs with the github_request message during that period to see if they also show the requests with no installation ID.

It's also possible we're not correctly attaching the installation ID to the metrics in some situations, but the fact that the maximum limit is different from your actual installation makes that seem less likely.