sio opened this issue 5 years ago
@sio would it be worth submitting your sliding window work as a PR?
@stuaxo, in my implementation I made some design decisions differently, so adding sliding window logic to this project would require more work than just copy-paste.
If someone is willing to do it, I'm all in favor. My explicit approval isn't necessary, though - I've already published the code under the Apache License, which allows it.
Did this functionality get merged in the release?
The most recent commit to master is older than this thread. That's definitely a no :-)
Another factor that can contribute to violating a rate limit is losing the state of RateLimitDecorator
between application restarts. During development this happens to me often, and there is currently no way to resume safely without waiting out the full time limit after the last request.
python test.py > log1.txt && python test.py > log2.txt
Running parallel processes illustrates this a little more clearly:
python test.py > log1.txt & python test.py > log2.txt &
If the call log were implemented as an SQLite database, the operations would be relatively simple queries, and the state would not only survive restarts but also be safe to share between parallel processes.
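For illustration, a SQLite-backed sliding-window log might look roughly like this (a sketch only; the class and table names are hypothetical and not taken from the linked branch):

```python
import sqlite3
import time

class SqliteCallLog:
    """Sketch: a sliding-window call log persisted in SQLite.

    The table survives restarts, and SQLite serializes writers,
    so parallel processes share one consistent view of recent calls.
    """

    def __init__(self, path, calls, period):
        self.calls = calls    # max calls allowed per window
        self.period = period  # window length in seconds
        # autocommit mode; we manage transactions explicitly below
        self.db = sqlite3.connect(path, isolation_level=None)
        self.db.execute("CREATE TABLE IF NOT EXISTS calls (ts REAL)")

    def try_acquire(self, now=None):
        """Record a call if the sliding window has room; True on success."""
        now = time.time() if now is None else now
        cur = self.db.cursor()
        cur.execute("BEGIN IMMEDIATE")  # lock out concurrent writers
        cur.execute("DELETE FROM calls WHERE ts <= ?", (now - self.period,))
        cur.execute("SELECT COUNT(*) FROM calls")
        count, = cur.fetchone()
        if count < self.calls:
            cur.execute("INSERT INTO calls VALUES (?)", (now,))
            cur.execute("COMMIT")
            return True
        cur.execute("ROLLBACK")
        return False
```

With a file path instead of `":memory:"`, two processes (or two consecutive runs of `test.py`) would count against the same window, which is exactly the failure mode described above.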
https://github.com/tomasbasham/ratelimit/compare/master...deckar01:31-db-log
python test.py > log1.txt & python test.py > log2.txt &
I wish I had time to have a go at fixing this, but I know the tickets at work mean it won't happen.
Having a callback to store state in a database (or ideally in Redis) would sort this.
No worries. I went ahead and published my fork to make my proposal easier to test drive. If there is anything I can do to help resolve this issue, just let me know.
+1, a sliding window is the right solution.
SQLite is not well suited to multi-processing; Redis, using ZADD and ZREMRANGEBYSCORE, would be a better fit.
From what I've gathered, there is a risk of violating the rate limit when using this module.
As of now, RateLimitDecorator does not calculate a sliding time frame to track the number of API calls within any given period of the specified length. Instead it tracks fixed time periods of the correct length, placed back to back (starting from the time of the first API call). Here is how this can bite us.
Let's say we must not make more than 5 calls within a minute. We make one call, wait almost a minute, and then make 9 more calls consecutively. There are only 5 calls within the first minute and 5 within the second, so RateLimitDecorator will allow this behavior, while the API provider will definitely consider it a violation. Here is code that demonstrates the described violation. As in the picture above, we have a limit of 5 calls per time frame (10 seconds, for a faster demo), but we make 9 calls within the last two seconds. The API provider would not be happy with us :)
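The original demo code isn't reproduced here, but the fixed-window behavior can be sketched like this (a simplified model, not the actual RateLimitDecorator internals; all names are made up):

```python
class FixedWindow:
    """Sketch of fixed-window limiting: the counter resets every
    `period` seconds, measured from the first call in the window."""

    def __init__(self, calls, period):
        self.calls, self.period = calls, period
        self.window_start = None
        self.count = 0

    def allow(self, now):
        # Start a new back-to-back window once the old one has elapsed
        if self.window_start is None or now - self.window_start >= self.period:
            self.window_start, self.count = now, 0
        if self.count < self.calls:
            self.count += 1
            return True
        return False

limiter = FixedWindow(calls=5, period=10)
# One call at t=0, then nine calls straddling the t=10 window boundary:
times = [0.0, 9.0, 9.2, 9.4, 9.6, 10.1, 10.2, 10.3, 10.4, 10.5]
allowed = [limiter.allow(t) for t in times]
# All ten calls pass, yet nine of them land inside roughly 1.5 seconds -
# any real 5-calls-per-10-seconds API would count that as a violation.
```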
To avoid this we need to track the time of each API call and adjust the decision-making logic accordingly.
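A sliding window that remembers each call's timestamp might look like this (a sketch under my own naming, not the published alternative class):

```python
import time
from collections import deque

class SlidingWindow:
    """Sketch: keep the timestamp of every recent call so that no
    window of `period` seconds ever contains more than `calls` calls."""

    def __init__(self, calls, period):
        self.calls, self.period = calls, period
        self.log = deque()  # timestamps of recent calls, oldest first

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop calls that have slid out of the window
        while self.log and now - self.log[0] >= self.period:
            self.log.popleft()
        if len(self.log) < self.calls:
            self.log.append(now)
            return True
        return False
```

Replaying the scenario above (one call at t=0, nine more around t=10) against this limiter, the calls that would push any 10-second window past 5 get rejected instead of silently allowed.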
For my personal use I've written an alternative class altogether, but I do not plan on uploading it to PyPI and fragmenting the user base. You have done a great job of maintaining this package for a long time and I think it should remain the go-to place for rate limiter logic. Feel free to reuse my code though.