hub4j / github-api

Java API for GitHub
https://github-api.kohsuke.org/
MIT License

Handle rate limit errors appropriately :: GitHub Doesn't always send `Retry-After` header for secondary rate limits #1805

Closed. ranma2913 closed this issue 1 month ago

ranma2913 commented 7 months ago

Describe the bug

The GitHub Docs now say:

Otherwise, wait for at least one minute before retrying. If your request continues to fail due to a secondary rate limit, wait for an exponentially increasing amount of time between retries, and throw an error after a specific number of retries.

There's also a whole complex list About secondary rate limits.

To Reproduce

Reach secondary quota limits:

"You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later"

Expected behavior

GitHub SHOULD return a Retry-After header, but it does not. So I've thought of a couple of ideas:

  1. Make the GitHubConnectorResponseErrorHandler.isError() method public down the stack so users (me) can customize the behavior.
  2. Update the private GitHubAbuseLimitHandler.isError() logic to handle a missing Retry-After header as described in the GitHub Docs. This may not be ideal because not everyone is running a long-running project like I am, so they may not be OK with a default 1-minute sleep before throwing a RetryRequestException.


Additional context

BTW I'm using a GitHub Enterprise user account accessing GH Enterprise resources, so maybe that's why there's no Retry-After header?

Another idea relates to rateLimitChecker.checkRateLimit(this, request.rateLimitTarget()); inside the GitHubClient.sendRequest method. If it could also pass the GitHub request instead of just the rateLimitTarget, then I could configure a smarter rate limit checker that tracks API calls of differing "points" and reduces the throughput of 'create content' requests, which GitHub limits much more heavily than read requests.
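For illustration, here is roughly what a custom checker can do against the current extension point: it only sees the GHRateLimit.Record, never the request that is about to be sent, which is exactly the gap described above. The class name and threshold are made up for the sketch, and the token in the usage comment is a placeholder.

```java
import org.kohsuke.github.GHRateLimit;
import org.kohsuke.github.RateLimitChecker;

/**
 * Illustrative checker: waits whenever the remaining core quota drops below a
 * threshold. It never sees the individual request, only the rate limit record.
 */
public class ThresholdRateLimitChecker extends RateLimitChecker {

    private final int minRemaining;

    public ThresholdRateLimitChecker(int minRemaining) {
        this.minRemaining = minRemaining;
    }

    @Override
    protected boolean checkRateLimit(GHRateLimit.Record record, long count) throws InterruptedException {
        if (record.getRemaining() > minRemaining) {
            return false; // enough quota left, let the request go out
        }
        Thread.sleep(10_000); // back off briefly ...
        return true;          // ... then ask the client to refresh the record and check again
    }
}

// Wired in at construction time, e.g. (placeholder token):
// GitHub gh = new GitHubBuilder()
//         .withOAuthToken("<token>")
//         .withRateLimitChecker(new ThresholdRateLimitChecker(100))
//         .build();
```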

bitwiseman commented 6 months ago

The guidance in "about secondary rate limits" is not great. It boils down to "if you make heavy use of our API, you'll probably hit the secondary rate limit sooner or later, but we're not going to give you any way to know if you're getting close. That's a you problem." 😡

bitwiseman commented 6 months ago

@ranma2913

Thanks for filing this.

GitHub SHOULD return a Retry-After header, but it does not. So I've thought of a couple of ideas:

  1. Make the GitHubConnectorResponseErrorHandler.isError() method public down the stack so users (me) can customize the behavior.

Not sure why we would want this. The determination of whether there is an error is universal and should never need to be customized by users. Whatever changes you want to make, make them in this library for the benefit of all.

  2. Update the private GitHubAbuseLimitHandler.isError() logic to handle a missing Retry-After header as described in the GitHub Docs. This may not be ideal because not everyone is running a long-running project like I am, so they may not be OK with a default 1-minute sleep before throwing a RetryRequestException.

Yes, do this please. The secondary rate limit is essentially a renamed abuse limit, and the consequences are the same. If a client is not okay with waiting, they can use AbuseLimitHandler.FAIL.
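For reference, wiring that in looks roughly like this (the token is a placeholder):

```java
import java.io.IOException;

import org.kohsuke.github.AbuseLimitHandler;
import org.kohsuke.github.GitHub;
import org.kohsuke.github.GitHubBuilder;

public class FailFastClient {
    public static GitHub build(String token) throws IOException {
        // Fail immediately instead of sleeping when an abuse/secondary rate limit is hit.
        return new GitHubBuilder()
                .withOAuthToken(token)
                .withAbuseLimitHandler(AbuseLimitHandler.FAIL)
                .build();
    }
}
```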

BTW I'm using a GitHub Enterprise user account accessing GH Enterprise resources, so maybe that's why there's no Retry-After header?

Maybe? But we still need to handle the case. The docs actually mention it as a possible scenario.

Another idea relates to rateLimitChecker.checkRateLimit(this, request.rateLimitTarget()); inside the GitHubClient.sendRequest method. If it could also pass the GitHub request instead of just the rateLimitTarget, then I could configure a smarter rate limit checker that tracks API calls of differing "points" and reduces the throughput of 'create content' requests, which GitHub limits much more heavily than read requests.

Whatever smarts you're talking about implementing, it is very likely something that all users would benefit from.

Similar to primary rate limit behavior, basically everyone needs this functionality. Unlike primary rate limits, there is no concrete information that we can provide to the clients that they could use to avoid exceeding the limit. You either stay under the limit or you don't. The guidance provided about how to estimate usage and limits (time from request to response and then totaling that per minute) is best implemented once for the benefit of all.
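As one rough illustration of that kind of client-side accounting, a sliding-window cap on content-creating requests could look like the sketch below. The per-minute threshold is a placeholder rather than an official number, and the class is not part of this library.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Minimal sketch of a client-side budget for content-creating requests:
 * block the caller once more than maxPerMinute writes have been issued in
 * the last 60 seconds. Purely illustrative; not part of github-api.
 */
public class WriteRequestThrottle {

    private final int maxPerMinute;
    private final Deque<Long> recentWrites = new ArrayDeque<>();

    public WriteRequestThrottle(int maxPerMinute) {
        this.maxPerMinute = maxPerMinute;
    }

    /** Call before every POST/PATCH/PUT/DELETE; blocks until a slot is free. */
    public synchronized void acquire() throws InterruptedException {
        while (true) {
            long cutoff = System.currentTimeMillis() - 60_000;
            while (!recentWrites.isEmpty() && recentWrites.peekFirst() < cutoff) {
                recentWrites.pollFirst(); // drop entries older than one minute
            }
            if (recentWrites.size() < maxPerMinute) {
                recentWrites.addLast(System.currentTimeMillis());
                return;
            }
            // The oldest entry expires first; wait roughly until then and re-check.
            wait(Math.max(1L, recentWrites.peekFirst() + 60_000 - System.currentTimeMillis()));
        }
    }
}
```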

IgnatBeresnev commented 5 months ago

I've looked at the code, and it seems like there's an inconsistency between GitHubAbuseLimitHandler#isError() and AbuseLimitHandler.WAIT.

AbuseLimitHandler.WAIT already checks for the missing Retry-After header, and sleeps for 1 minute if it's absent:

https://github.com/hub4j/github-api/blob/6b2a487b2800435d614023418197a6300774255d/src/main/java/org/kohsuke/github/AbuseLimitHandler.java#L89-L92

However, I think it will never get here, because GitHubAbuseLimitHandler#isError() will say it's NOT an error if the Retry-After header is missing:

https://github.com/hub4j/github-api/blob/6b2a487b2800435d614023418197a6300774255d/src/main/java/org/kohsuke/github/GitHubAbuseLimitHandler.java#L31-L33


It looks to me like connectorResponse.header("Retry-After") != null; should be dropped, as the docs indicate that the header is optional:

If the retry-after response header is present, you...

It looks like just this change alone should already make it safer when it comes to hitting secondary rate limits.
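Roughly, the adjusted check could look like the sketch below. This is a paraphrase rather than the library's actual code, and the helper method is hypothetical.

```java
import java.net.HttpURLConnection;

import org.kohsuke.github.connector.GitHubConnectorResponse;

// Sketch only: a 403 that reports a secondary/abuse rate limit should count as
// an error even when no Retry-After header is present.
class SecondaryLimitCheckSketch {

    boolean isError(GitHubConnectorResponse connectorResponse) {
        return connectorResponse.statusCode() == HttpURLConnection.HTTP_FORBIDDEN
                && mentionsSecondaryRateLimit(connectorResponse);
    }

    private boolean mentionsSecondaryRateLimit(GitHubConnectorResponse connectorResponse) {
        // Hypothetical helper: in practice this would inspect the error body
        // ("You have exceeded a secondary rate limit ...") or a header.
        return true;
    }
}
```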


However, to make it really safe, it looks like we need to do what GitHub suggests:

If your request continues to fail due to a secondary rate limit, wait for an exponentially increasing amount of time between retries…

That is, increasing this one-minute sleep exponentially. For this, either changes in the API will be needed (to pass a counter to onError()), or I as a user should be able to add an AtomicLong counter in my own implementation and increase the sleep duration based on it (I'd rather sleep indefinitely and trigger healthchecks than get banned).
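A rough sketch of that user-side workaround, assuming AbuseLimitHandler's onError(IOException, HttpURLConnection) callback and assuming isError() has been fixed so the handler is actually invoked; the base delay and cap are arbitrary choices:

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.HttpURLConnection;
import java.util.concurrent.atomic.AtomicLong;

import org.kohsuke.github.AbuseLimitHandler;

/**
 * Sketch of a handler that keeps its own counter and backs off exponentially
 * each time the abuse/secondary limit is reported.
 */
public class ExponentialBackoffAbuseLimitHandler extends AbuseLimitHandler {

    private final AtomicLong consecutiveHits = new AtomicLong();

    @Override
    public void onError(IOException e, HttpURLConnection uc) throws IOException {
        long attempt = consecutiveHits.incrementAndGet();
        // 2, 4, 8, 16, 32 minutes ... capped at 30 minutes; resetting the counter
        // once requests succeed again is omitted here.
        long waitMillis = Math.min(60_000L << Math.min(attempt, 5), 30 * 60_000L);
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            throw (InterruptedIOException) new InterruptedIOException().initCause(interrupted);
        }
    }
}
```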

But anyhow, it all seems to start with fixing the check in GitHubAbuseLimitHandler#isError().

If this sounds reasonable, would you be able to fix it, or is it better if I send a PR?

bitwiseman commented 5 months ago

Please submit a PR.

jeetchoudhary commented 3 months ago

See if this makes sense: https://github.com/hub4j/github-api/pull/1853

bitwiseman commented 3 months ago

@jeetchoudhary Your PR is interesting. What are the contents of the gh-limited-by header that you're seeing? It doesn't appear to be a documented feature.

jeetchoudhary commented 3 months ago

@bitwiseman I have an open ticket with GitHub Enterprise support to confirm the usage of this header. Since I couldn't find it documented anywhere, I figured getting confirmation was better. I found this while debugging the application, and it seemed like a logical thing to do (I don't think removing the Retry-After header check completely is a good idea). So far this works and has gotten us unblocked, but I'll update once I hear more from the GitHub Enterprise support engineers.

jeetchoudhary commented 3 months ago

[screenshot attached]

jeetchoudhary commented 3 months ago

@bitwiseman ^

bitwiseman commented 3 months ago

If you want to copy the existing test and data files for abuse limit and then manually change the header name and value in the test data, that would be fine for your current PR.

Or you can wait for the docs, your call.

jeetchoudhary commented 3 months ago

I'll update the tests today. Thanks

bitwiseman commented 3 months ago

@jeetchoudhary
Thanks so much for your work on this. If you can, please update this ticket when you hear back from support with details and/or when the docs have been updated.

jeetchoudhary commented 3 months ago

Will do