hub4j / github-api

Java API for GitHub
https://github-api.kohsuke.org/
MIT License

Handle rate limit errors appropriately :: GitHub Doesn't always send `Retry-After` header for secondary rate limits #1805

Closed. ranma2913 closed this issue 1 month ago

ranma2913 commented 7 months ago

Describe the bug

The GitHub Docs now say:

Otherwise, wait for at least one minute before retrying. If your request continues to fail due to a secondary rate limit, wait for an exponentially increasing amount of time between retries, and throw an error after a specific number of retries.

There's also a whole complex list About secondary rate limits.

To Reproduce

Reach secondary quota limits:

"You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later"

Expected behavior

GitHub SHOULD return a Retry-After header, but it does not. So I've thought of a couple of ideas:

  1. Make the GitHubConnectorResponseErrorHandler.isError() method public down the stack so users (me) can customize the behavior.
  2. Update the private GitHubAbuseLimitHandler.isError() logic to handle a missing Retry-After header as described in the GitHub Docs. This may not be ideal because not everyone is running a long-running project like I am, so they may not be OK with a default 1-minute sleep before throwing a RetryRequestException.


Additional context

BTW I'm using a GitHub Enterprise user account accessing GH Enterprise resources, so maybe that's why there's no Retry-After header?

Another idea relates to rateLimitChecker.checkRateLimit(this, request.rateLimitTarget()); inside the GitHubClient.sendRequest method. If it could also pass the GitHub request instead of just the rateLimitTarget, then I could configure a smarter rate limit checker that tracks API calls of differing "points" and reduces the throughput of 'create content' requests, which GitHub limits much more heavily than read requests.
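For illustration, here is roughly what a custom checker can do against the current extension point: it only sees the GHRateLimit.Record, never the request that is about to be sent, which is exactly the gap described above. The class name and threshold are made up for the sketch, and the token in the usage comment is a placeholder.

```java
import org.kohsuke.github.GHRateLimit;
import org.kohsuke.github.RateLimitChecker;

/**
 * Illustrative checker: waits whenever the remaining core quota drops below a
 * threshold. It never sees the individual request, only the rate limit record.
 */
public class ThresholdRateLimitChecker extends RateLimitChecker {

    private final int minRemaining;

    public ThresholdRateLimitChecker(int minRemaining) {
        this.minRemaining = minRemaining;
    }

    @Override
    protected boolean checkRateLimit(GHRateLimit.Record record, long count) throws InterruptedException {
        if (record.getRemaining() > minRemaining) {
            return false; // enough quota left, let the request go out
        }
        Thread.sleep(10_000); // back off briefly ...
        return true;          // ... then ask the client to refresh the record and check again
    }
}

// Wired in at construction time, e.g. (placeholder token):
// GitHub gh = new GitHubBuilder()
//         .withOAuthToken("<token>")
//         .withRateLimitChecker(new ThresholdRateLimitChecker(100))
//         .build();
```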

bitwiseman commented 6 months ago

The guidance in "about secondary rate limits" is not great. It boils down to "if you make heavy use of our API, you'll probably hit the secondary rate limit sooner or later, but we're not going to give you any way to know if you're getting close. That's a you problem." 😡

bitwiseman commented 6 months ago

@ranma2913

Thanks for filing this.

GitHub SHOULD return a Retry-After header, but it does not. So I've thought of a couple of ideas:

  1. Make the GitHubConnectorResponseErrorHandler.isError() method public down the stack so users (me) can customize the behavior.

Not sure why we would want this. The determination of whether there is an error is universal and should never need to be customized by users. Whatever changes you want to make, make them in this library for the benefit of all.

  2. Update the private GitHubAbuseLimitHandler.isError() logic to handle a missing Retry-After header as described in the GitHub Docs. This may not be ideal because not everyone is running a long-running project like I am, so they may not be OK with a default 1-minute sleep before throwing a RetryRequestException.

Yes, do this please. The secondary rate limit is essentially a renamed abuse limit, and the consequences are the same. If a client is not okay with waiting, they can use AbuseLimitHandler.FAIL.
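For reference, wiring that in looks roughly like this (the token is a placeholder):

```java
import java.io.IOException;

import org.kohsuke.github.AbuseLimitHandler;
import org.kohsuke.github.GitHub;
import org.kohsuke.github.GitHubBuilder;

public class FailFastClient {
    public static GitHub build(String token) throws IOException {
        // Fail immediately instead of sleeping when an abuse/secondary rate limit is hit.
        return new GitHubBuilder()
                .withOAuthToken(token)
                .withAbuseLimitHandler(AbuseLimitHandler.FAIL)
                .build();
    }
}
```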

BTW I'm using a GitHub Enterprise user account accessing GH Enterprise resources, so maybe that's why there's no Retry-After header?

Maybe? But we still need to handle the case. The docs actually mention it as a possible scenario.

Another idea relates to rateLimitChecker.checkRateLimit(this, request.rateLimitTarget()); inside the GitHubClient.sendRequest method. If it could also pass the GitHub request instead of just the rateLimitTarget, then I could configure a smarter rate limit checker that tracks API calls of differing "points" and reduces the throughput of 'create content' requests, which GitHub limits much more heavily than read requests.

Whatever smarts you're talking about implementing, it is very likely something that all users would benefit from.

Similar to primary rate limit behavior, basically everyone needs this functionality. Unlike primary rate limits, there is no concrete information that we can provide to the clients that they could use to avoid exceeding the limit. You either stay under the limit or you don't. The guidance provided about how to estimate usage and limits (time from request to response and then totaling that per minute) is best implemented once for the benefit of all.
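As one rough illustration of that kind of client-side accounting, a sliding-window cap on content-creating requests could look like the sketch below. The per-minute threshold is a placeholder rather than an official number, and the class is not part of this library.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Minimal sketch of a client-side budget for content-creating requests:
 * block the caller once more than maxPerMinute writes have been issued in
 * the last 60 seconds. Purely illustrative; not part of github-api.
 */
public class WriteRequestThrottle {

    private final int maxPerMinute;
    private final Deque<Long> recentWrites = new ArrayDeque<>();

    public WriteRequestThrottle(int maxPerMinute) {
        this.maxPerMinute = maxPerMinute;
    }

    /** Call before every POST/PATCH/PUT/DELETE; blocks until a slot is free. */
    public synchronized void acquire() throws InterruptedException {
        while (true) {
            long cutoff = System.currentTimeMillis() - 60_000;
            while (!recentWrites.isEmpty() && recentWrites.peekFirst() < cutoff) {
                recentWrites.pollFirst(); // drop entries older than one minute
            }
            if (recentWrites.size() < maxPerMinute) {
                recentWrites.addLast(System.currentTimeMillis());
                return;
            }
            // The oldest entry expires first; wait roughly until then and re-check.
            wait(Math.max(1L, recentWrites.peekFirst() + 60_000 - System.currentTimeMillis()));
        }
    }
}
```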

IgnatBeresnev commented 5 months ago

I've looked at the code, and it seems like there's an inconsistency between GitHubAbuseLimitHandler#isError() and AbuseLimitHandler.WAIT.

AbuseLimitHandler.WAIT already checks for the missing Retry-After header, and sleeps for 1 minute if it's absent:

https://github.com/hub4j/github-api/blob/6b2a487b2800435d614023418197a6300774255d/src/main/java/org/kohsuke/github/AbuseLimitHandler.java#L89-L92

However, I think it will never get here, because GitHubAbuseLimitHandler#isError() will say it's NOT an error if the Retry-After header is missing:

https://github.com/hub4j/github-api/blob/6b2a487b2800435d614023418197a6300774255d/src/main/java/org/kohsuke/github/GitHubAbuseLimitHandler.java#L31-L33


It looks to me like connectorResponse.header("Retry-After") != null; should be dropped, as the docs indicate that the header is optional:

If the retry-after response header is present, you...

It looks like just this change alone should already make it safer when it comes to hitting secondary rate limits.
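Roughly, the adjusted check could look like the sketch below. This is a paraphrase rather than the library's actual code, and the helper method is hypothetical.

```java
import java.net.HttpURLConnection;

import org.kohsuke.github.connector.GitHubConnectorResponse;

// Sketch only: a 403 that reports a secondary/abuse rate limit should count as
// an error even when no Retry-After header is present.
class SecondaryLimitCheckSketch {

    boolean isError(GitHubConnectorResponse connectorResponse) {
        return connectorResponse.statusCode() == HttpURLConnection.HTTP_FORBIDDEN
                && mentionsSecondaryRateLimit(connectorResponse);
    }

    private boolean mentionsSecondaryRateLimit(GitHubConnectorResponse connectorResponse) {
        // Hypothetical helper: in practice this would inspect the error body
        // ("You have exceeded a secondary rate limit ...") or a header.
        return true;
    }
}
```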


However, to make it really safe, it looks like we need to do what GitHub suggests:

If your request continues to fail due to a secondary rate limit, wait for an exponentially increasing amount of time between retries…

That is, increasing this one-minute sleep exponentially. For this, either changes in the API will be needed (to pass a counter to onError()), or I as a user should be able to add an AtomicLong counter in my own implementation and increase the sleep duration based on it (I'd rather sleep indefinitely and trigger healthchecks than get banned).
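A rough sketch of that user-side workaround, assuming AbuseLimitHandler's onError(IOException, HttpURLConnection) callback and assuming isError() has been fixed so the handler is actually invoked; the base delay and cap are arbitrary choices:

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.HttpURLConnection;
import java.util.concurrent.atomic.AtomicLong;

import org.kohsuke.github.AbuseLimitHandler;

/**
 * Sketch of a handler that keeps its own counter and backs off exponentially
 * each time the abuse/secondary limit is reported.
 */
public class ExponentialBackoffAbuseLimitHandler extends AbuseLimitHandler {

    private final AtomicLong consecutiveHits = new AtomicLong();

    @Override
    public void onError(IOException e, HttpURLConnection uc) throws IOException {
        long attempt = consecutiveHits.incrementAndGet();
        // 2, 4, 8, 16, 32 minutes ... capped at 30 minutes; resetting the counter
        // once requests succeed again is omitted here.
        long waitMillis = Math.min(60_000L << Math.min(attempt, 5), 30 * 60_000L);
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            throw (InterruptedIOException) new InterruptedIOException().initCause(interrupted);
        }
    }
}
```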

But anyhow, it all seems to start with fixing the check in GitHubAbuseLimitHandler#isError().

If this sounds reasonable, would you be able to fix it, or is it better if I send a PR?

bitwiseman commented 5 months ago

Please submit a PR.

jeetchoudhary commented 3 months ago

See if this makes sense: https://github.com/hub4j/github-api/pull/1853

bitwiseman commented 3 months ago

@jeetchoudhary Your PR is interesting. What are the contents of the gh-limited-by header that you're seeing? It doesn't appear to be a documented feature.

jeetchoudhary commented 3 months ago

@bitwiseman I have an open ticket with GitHub Enterprise support to confirm the usage of this header. Since I couldn't find it documented anywhere, I figured getting confirmation was better. I found this while debugging the application, and it seemed like a logical thing to do (I don't think removing the Retry-After header check completely is a good idea). So far this works and has gotten us unblocked, but I'll update once I hear more from the GitHub Enterprise support engineers.

jeetchoudhary commented 3 months ago

[screenshot attached]

jeetchoudhary commented 3 months ago

@bitwiseman ^

bitwiseman commented 3 months ago

If you want to copy the existing test and data files for abuse limit and then manually change the header name and value in the test data, that would be fine for your current PR.

Or you can wait for the docs, your call.

jeetchoudhary commented 3 months ago

I'll update the tests today. Thanks

bitwiseman commented 3 months ago

@jeetchoudhary
Thanks so much for your work on this. If you can, please update this ticket when you hear back from support with details and/or when the docs have been updated.

jeetchoudhary commented 3 months ago

Will do