yanyongyu / githubkit

The modern, all-batteries-included GitHub SDK for Python, including rest api, graphql, webhooks, like octokit!
MIT License · 158 stars · 21 forks

Feature: auto retry when rate limit exceeded or server error encountered #86

Closed schelv closed 5 months ago

schelv commented 5 months ago

This PR introduces a mechanism that limits the number of concurrent requests. Additionally, if a RateLimitExceeded response is encountered, new requests are not started for a while. I'm not sure if halting the new requests is really needed, but it seems like the right thing to do.

The print statements are currently there to show that the mechanism works. Should they be replaced with a logger, or removed entirely?

Related to #66
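The mechanism described in this PR could be sketched roughly as follows. This is a hypothetical illustration of the idea (semaphore-bounded concurrency plus a pause window after a rate-limit response), not the actual code from the PR; all names are invented:

```python
import threading
import time


class ConcurrencyLimiter:
    """Limit concurrent requests and pause new ones after a rate-limit hit.

    Hypothetical sketch of the PR's idea, not githubkit's real implementation.
    """

    def __init__(self, max_concurrent: int = 10):
        self._semaphore = threading.Semaphore(max_concurrent)
        self._resume_at = 0.0  # monotonic time before which new requests must wait
        self._lock = threading.Lock()

    def rate_limit_hit(self, retry_after: float) -> None:
        """Call this when a RateLimitExceeded response is seen."""
        with self._lock:
            self._resume_at = max(self._resume_at, time.monotonic() + retry_after)

    def __enter__(self):
        # Wait out any active rate-limit pause before starting a new request.
        while True:
            with self._lock:
                delay = self._resume_at - time.monotonic()
            if delay <= 0:
                break
            time.sleep(delay)
        self._semaphore.acquire()
        return self

    def __exit__(self, *exc):
        self._semaphore.release()
        return False


limiter = ConcurrencyLimiter(max_concurrent=2)
with limiter:
    pass  # a request would be issued here
```

Under this sketch, in-flight requests are capped by the semaphore, while `rate_limit_hit` delays only the *start* of new requests, which matches the "new requests are not started for a while" behavior described above.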

codecov[bot] commented 5 months ago

Codecov Report

Attention: 32 lines in your changes are missing coverage. Please review.

Comparison is base (4406cc0) 35.23% compared to head (a76f8ef) 35.25%. Report is 1 commit behind head on master.

Files Patch % Lines
githubkit/core.py 42.85% 16 Missing :warning:
githubkit/retry.py 63.88% 13 Missing :warning:
githubkit/config.py 66.66% 3 Missing :warning:
Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           master      #86      +/-   ##
==========================================
+ Coverage   35.23%   35.25%   +0.01%
==========================================
  Files        2414     2416       +2
  Lines      124893   124971      +78
==========================================
+ Hits        44007    44053      +46
- Misses      80886    80918      +32
```

| [Flag](https://app.codecov.io/gh/yanyongyu/githubkit/pull/86/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Ju4tCode) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/yanyongyu/githubkit/pull/86/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Ju4tCode) | `35.25% <61.44%> (+0.01%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Ju4tCode#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.

yanyongyu commented 5 months ago

It seems the concurrency restriction should consider both sync/async usage and multi-instance usage, and the mechanism may need an external service such as Redis. For now, I would like to add simple retry logic for when the rate limit is exceeded or a server error is encountered.
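The "simple retry logic" proposed here could look something like the sketch below. This is a hypothetical illustration, not githubkit's actual API: `send` is an invented zero-argument callable returning an object with `status_code` and `headers`, and the status set and backoff policy are assumptions:

```python
import random
import time

# Rate-limit responses (GitHub uses 403/429) plus transient server errors.
RETRYABLE_STATUS = {403, 429, 500, 502, 503, 504}


def request_with_retry(send, max_retries: int = 3):
    """Retry a request on rate-limit or server-error responses.

    Hypothetical sketch: `send` issues the request and returns a
    response with `status_code` and a `headers` mapping.
    """
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUS or attempt == max_retries:
            return response
        # Prefer the server-provided hint; fall back to exponential backoff.
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else 2**attempt + random.random()
        time.sleep(delay)
```

Honoring the `Retry-After` header (when present) before falling back to exponential backoff keeps the client aligned with what the server actually asked for.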

schelv commented 5 months ago

> It seems the concurrency restriction should consider both sync/async usage and multi-instance usage, and the mechanism may need an external service such as Redis. For now, I would like to add simple retry logic for when the rate limit is exceeded or a server error is encountered.

Considering both sync and async usage should be possible; I will try to add that. Taking multi-instance usage into account at the mechanism level seems a lot more complex. A simple way to "consider multi-instance usage" without hitting too many rate limits is to cap the number of concurrent connections for each instance. Alternatively, the number of concurrent connections could be lowered when approaching the primary rate limit, which would leave some requests for other instances. But I'm not sure this is something that should be handled in this mechanism.

The benefit also depends a lot on what each instance is doing with the API. Do you have a scenario/use case in mind with multiple instances?
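The "lower concurrency when approaching the primary rate limit" idea could be expressed as a small policy function. This is a hypothetical sketch (the function name, the 20% threshold, and the linear ramp are all assumptions, not anything from the PR):

```python
def allowed_concurrency(remaining: int, limit: int, max_concurrent: int = 10) -> int:
    """Scale the concurrency budget down as the rate limit is approached.

    Hypothetical policy sketch: full concurrency while plenty of quota
    remains (based on the X-RateLimit-Remaining / X-RateLimit-Limit
    headers), shrinking linearly below a 20% threshold so that other
    instances sharing the same token still have requests left.
    """
    threshold = limit * 0.2
    if remaining >= threshold:
        return max_concurrent
    # Linear ramp from max_concurrent down to 1 as remaining approaches 0.
    return max(1, int(max_concurrent * remaining / threshold))
```

For example, with GitHub's default authenticated limit of 5000 requests/hour, an instance would run at full concurrency until 1000 requests remain, then taper off instead of consuming the whole budget.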

yanyongyu commented 5 months ago

Multiple instances are commonly used in multi-process or cluster mode. octokit also uses Redis to schedule requests in cluster mode. Maybe an abstract storage layer should be implemented to cache info and restrict the concurrency. I will try to implement an in-memory storage and a Redis one.
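The abstract storage layer mentioned here could be shaped like the sketch below. This is a hypothetical interface (all names invented, not githubkit's eventual design): an in-memory backend covers the single-process case, and a Redis backend would implement the same methods against shared keys:

```python
import abc


class RateLimitStorage(abc.ABC):
    """Abstract storage for sharing rate-limit state between instances.

    Hypothetical interface sketch; a Redis implementation would expose
    the same methods backed by shared keys so that multiple processes
    see the same reset time.
    """

    @abc.abstractmethod
    def get_reset_time(self):
        """Return the epoch time when the rate limit resets, or None."""

    @abc.abstractmethod
    def set_reset_time(self, reset_at: float) -> None:
        """Record the reset time reported by the API."""


class InMemoryStorage(RateLimitStorage):
    """Single-process backend: state lives in the instance itself."""

    def __init__(self):
        self._reset_at = None

    def get_reset_time(self):
        return self._reset_at

    def set_reset_time(self, reset_at: float) -> None:
        self._reset_at = reset_at
```

Keeping the interface this small means the retry and concurrency logic can stay backend-agnostic, with the deployment mode deciding which storage to plug in.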

schelv commented 5 months ago

> Multiple instances are commonly used in multi-process or cluster mode. octokit also uses Redis to schedule requests in cluster mode. Maybe an abstract storage layer should be implemented to cache info and restrict the concurrency. I will try to implement an in-memory storage and a Redis one.

The advised "best practice" is to not do anything concurrently at all. Is that something you want to have? It sounds very, very slow.

yanyongyu commented 5 months ago

octokit uses the bottleneck library to schedule request jobs; you can see the rate limit logic in that library.

yanyongyu commented 5 months ago

Due to the complexity of this feature, I'm going to split it into two PRs. In this PR, rate limit auto retry will be implemented; the concurrency limit will be implemented in the next version.