Yes, this should be fixed by #61.
Version 2.6.3 fixed the total request count across both terminals (when they run simultaneously).
However, the locked-database error still pops up and disrupts (or, more commonly, blocks) running the terminals simultaneously.
The error on a "Debian GNU/Linux 11 (bullseye)" box (the paste expires in May 2022): https://pastebin.com/SHhLnSin
In case it helps, the above code works without problems in the deckar01 fork of ratelimit.
Okay, I see. #61 mainly made the SQLite bucket thread-safe. It's also now compatible with ProcessPoolExecutor, which spawns child processes from one parent process.
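For illustration, that parent/child case looks roughly like this (a minimal sketch, not code from #61; the rate, the "test" bucket name, and the assumption that SQLiteBucket is importable from the top-level package are mine):

from concurrent.futures import ProcessPoolExecutor

from pyrate_limiter import Duration, Limiter, RequestRate, SQLiteBucket

# One parent process creates the limiter; the executor's child processes
# all count against the same SQLite-backed bucket.
limiter = Limiter(RequestRate(100, Duration.SECOND), bucket_class=SQLiteBucket)

@limiter.ratelimit("test", delay=True)
def send_request(i):
    """Rate-limited function representing a single request"""
    return i

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(send_request, range(500)))
        print(f"Completed {len(results)} requests")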
I just realized that dealing with multiple unrelated processes, like the separate scripts in your example, is a separate (and more difficult) problem.
Thank you for the prompt answer.
In my case, I am trying to replicate the situation where a cron job (for example) starts a new process while the previous one has not yet finished.
That makes sense. I think a file-level lock like py-filelock will be the only way to make the SQLite backend work correctly from multiple scripts, while also supporting all the existing features of this library (mainly, tracking several different rate limits at once). I'll look into that.
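The idea would be something like the following sketch (not this library's implementation; the file paths, table name, and function are invented for illustration), where every database operation is wrapped in a cross-process lock:

import sqlite3

from filelock import FileLock

LOCK_PATH = "pyrate_limiter.sqlite.lock"  # hypothetical paths, for illustration only
DB_PATH = "pyrate_limiter.sqlite"

def put_timestamp(timestamp: float) -> None:
    """Record one request, serialized across processes by the file lock."""
    with FileLock(LOCK_PATH):
        with sqlite3.connect(DB_PATH) as connection:
            connection.execute("CREATE TABLE IF NOT EXISTS requests (timestamp REAL)")
            connection.execute("INSERT INTO requests (timestamp) VALUES (?)", (timestamp,))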
It appears that in the issue you linked, that library is working around the problem by catching and ignoring OperationalError: https://github.com/deckar01/ratelimit/blob/master/ratelimit/decorators.py#L120
It's also using a different algorithm (sliding window), so it performs fewer database operations, in a different order, which may make multiprocess errors less likely to come up. I don't believe that eliminates the problem, though; with a faster request rate and/or leaving your scripts running for long enough, they will eventually make more requests than they're supposed to. Either way, the same approach wouldn't work for this library.
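That workaround boils down to something like this (a simplified illustration of the pattern, not the fork's actual code; the table and function names are invented):

import sqlite3

def try_record_request(connection: sqlite3.Connection, timestamp: float) -> bool:
    """Attempt to record a request; silently give up if the database is locked."""
    try:
        with connection:
            connection.execute("INSERT INTO requests (timestamp) VALUES (?)", (timestamp,))
        return True
    except sqlite3.OperationalError:
        # "database is locked": another process is writing, so this update is skipped
        return False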
Oh! The code seems to be a kind of soft hack around the issue.
As a matter of fact, that fork was created to apply the sliding window algorithm and to try to solve these very same issues for the apparently unmaintained parent library.
Indeed, py-filelock could prove to be substantially helpful!
I think I have this working in my branch for #68.
Here's a slightly modified example to test this:
import sys
from time import perf_counter as time

from pyrate_limiter import Limiter, RequestRate, Duration, FileLockSQLiteBucket

limiter = Limiter(
    RequestRate(100, Duration.SECOND),
    RequestRate(1000, Duration.MINUTE),
    bucket_class=FileLockSQLiteBucket,
)


@limiter.ratelimit("test", delay=True)
def send_request():
    """Rate-limited function representing a single request"""


def test_ratelimit(n_process=0):
    """Continuously send rate-limited requests"""
    start_time = time()
    n_requests = 1
    while True:
        send_request()
        print(f"[Process {n_process}] {time() - start_time:.2f}: {n_requests}")
        n_requests += 1


if __name__ == "__main__":
    n_process = sys.argv[1] if len(sys.argv) > 1 else 0
    test_ratelimit(n_process)
Either run from separate terminals, or as background processes:
python filelock_test.py 1 &
python filelock_test.py 2 &
python filelock_test.py 3 &
...
# Kill background processes when done:
pkill -f filelock_test.py
I ran this for several minutes with 0 extra requests. I also added an automated test that basically does the same thing, except with multiprocess.Process (but no shared Limiter object).
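That test is roughly along these lines (an approximate sketch using the standard-library multiprocessing module; the rates, request counts, and structure are assumptions, not the actual test code):

from multiprocessing import Process

from pyrate_limiter import Duration, FileLockSQLiteBucket, Limiter, RequestRate

def run_requests(n_requests: int) -> None:
    """Send rate-limited requests from a process with its own (unshared) Limiter."""
    limiter = Limiter(
        RequestRate(100, Duration.SECOND),
        bucket_class=FileLockSQLiteBucket,
    )

    @limiter.ratelimit("test", delay=True)
    def send_request():
        """Rate-limited function representing a single request"""

    for _ in range(n_requests):
        send_request()

if __name__ == "__main__":
    # Each process builds its own Limiter over the same SQLite database file
    processes = [Process(target=run_requests, args=(200,)) for _ in range(3)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()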
@raratiru @spiffspaceman FileLockSQLiteBucket is now available in v2.7. Give that a try and let me know if that works for you.
@JWCook Thank you, it works great!
Yes, I ran it a few times, and it seems to be fine. Thanks!
When calling each one of the following functions from different REPLs, an error occurs: OperationalError: database is locked. Once, it worked (but did not last after the delay), and the total requests before the first delay for both REPLs were 4075 (+75). Is it relevant to #61?