vutran1710 / PyrateLimiter

⚔️Python Rate-Limiter using Leaky-Bucket Algorithm Family
https://pyratelimiter.readthedocs.io
MIT License

Global or local instance of the Limiter? #172

Open phillipuniverse opened 4 months ago

phillipuniverse commented 4 months ago

I'm trying to understand how I should use and re-use Limiter instances, and the behavior I'm seeing isn't what I expected. I wrote a small multithreaded sample script to evaluate it.

I'm using version 3.6.1.

I'm using Django, but that shouldn't be material to what's going on.

Here's my code:

import threading
import unittest

from pyrate_limiter import Duration, Limiter, Rate, RedisBucket
from redis import Redis

def api_call(attempt):
    redis_connection = Redis(host='localhost', port=6379, db=0)
    p44_imaging_limiter = Limiter(
        RedisBucket.init([Rate(3, Duration.SECOND)], redis_connection, "project44_api.imaging"),
        max_delay=60 * Duration.SECOND,
    )
    print(f"Attempting API call {attempt}")
    p44_imaging_limiter.try_acquire("call-api")
    print(f"Attempted API call {attempt}")

class TestRateLimiting(unittest.TestCase):
    def test_limiter(self):
        # Build ten threads, one per simulated API call
        threads = [
            threading.Thread(target=api_call, kwargs={"attempt": i})
            for i in range(1, 11)
        ]

        # Start them all at once, then wait for every thread to finish
        for t in threads:
            t.start()
        for t in threads:
            t.join()

It does not appear that this is rate limited correctly. Here is some of the logging I'm getting:

[2024-05-07T20:48:34.909645+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 1
[2024-05-07T20:48:34.910342+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 3
[2024-05-07T20:48:34.910546+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 10
[2024-05-07T20:48:34.910933+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:30] - Attempted API call 3
[2024-05-07T20:48:34.911431+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 7
[2024-05-07T20:48:34.911742+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:30] - Attempted API call 1
[2024-05-07T20:48:34.912018+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:30] - Attempted API call 10
[2024-05-07T20:48:34.912515+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 5
[2024-05-07T20:48:34.912716+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 2
[2024-05-07T20:48:34.912781+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 9
[2024-05-07T20:48:34.912841+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 4
[2024-05-07T20:48:34.913220+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 6
[2024-05-07T20:48:34.913328+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:28] - Attempting API call 8
[2024-05-07T20:48:35.966483+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:30] - Attempted API call 2
[2024-05-07T20:48:35.967252+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:30] - Attempted API call 8
[2024-05-07T20:48:35.967461+00:00] | ERROR    | [pyrate_limiter] [limiter.py:_handle_reacquire:155] - 
                Re-acquiring with delay expected to be successful,
                if it failed then either clock or bucket is probably unstable

[2024-05-07T20:48:35.967789+00:00] | ERROR    | [pyrate_limiter] [limiter.py:_handle_reacquire:155] - 
                Re-acquiring with delay expected to be successful,
                if it failed then either clock or bucket is probably unstable

[2024-05-07T20:48:35.967994+00:00] | ERROR    | [pyrate_limiter] [limiter.py:_handle_reacquire:155] - 
                Re-acquiring with delay expected to be successful,
                if it failed then either clock or bucket is probably unstable

[2024-05-07T20:48:35.968204+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:30] - Attempted API call 6
[2024-05-07T20:48:35.970266+00:00] | ERROR    | [pyrate_limiter] [limiter.py:_handle_reacquire:155] - 
                Re-acquiring with delay expected to be successful,
                if it failed then either clock or bucket is probably unstable

After some light debugging, it appears that the threads go to sleep when they get rate limited, and when they wake back up they expect to grab an item from the bucket but can't.
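To make that suspicion concrete, here is a minimal single-threaded simulation. It is not PyrateLimiter's implementation: `SlidingWindow` and `simulate` are hypothetical names, and the sketch assumes that every waiter computes its wake-up time independently, retries exactly once, and that nothing serializes the waiters.

```python
from collections import deque


class SlidingWindow:
    """Toy stand-in for a shared bucket: at most `limit` items per `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.stamps = deque()

    def _evict(self, now):
        # Drop timestamps that have aged out of the window
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()

    def try_put(self, now):
        self._evict(now)
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False

    def wait_time(self, now):
        # Seconds until the oldest item leaves the window (0 if there is room)
        self._evict(now)
        if len(self.stamps) < self.limit:
            return 0.0
        return self.stamps[0] + self.window - now


def simulate(callers=10, limit=3, window=1.0):
    bucket = SlidingWindow(limit, window)
    now = 0.0
    passed, sleepers = [], []
    for caller in range(1, callers + 1):
        if bucket.try_put(now):
            passed.append(caller)
        else:
            # Assumption: each waiter independently decides when to wake up
            sleepers.append((caller, now + bucket.wait_time(now)))

    failed = []
    # All sleepers computed the same wake-up time, so they retry together
    for caller, wake_at in sleepers:
        if bucket.try_put(wake_at):
            passed.append(caller)
        else:
            failed.append(caller)  # the single retry found the bucket full again
    return passed, failed
```

With ten callers and a limit of 3 per second, the first three pass immediately, the other seven all wake at t = 1.0, three of those succeed, and the remaining four fail their single retry.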

If I make the Limiter a module-level variable, then things do seem to get rate limited, in chunky bursts of 3 requests per second:

import threading
import unittest

from pyrate_limiter import Duration, Limiter, Rate, RedisBucket
from redis import Redis

redis_connection = Redis(host='localhost', port=6379, db=0)
p44_imaging_limiter = Limiter(
    RedisBucket.init([Rate(3, Duration.SECOND)], redis_connection, "project44_api.imaging"),
    max_delay=60 * Duration.SECOND,
)

def api_call(attempt):
    print(f"Attempting API call {attempt}")
    p44_imaging_limiter.try_acquire("call-api")
    print(f"Attempted API call {attempt}")

class TestRateLimiting(unittest.TestCase):
    def test_limiter(self):
        # Build ten threads, one per simulated API call
        threads = [
            threading.Thread(target=api_call, kwargs={"attempt": i})
            for i in range(1, 11)
        ]

        # Start them all at once, then wait for every thread to finish
        for t in threads:
            t.start()
        for t in threads:
            t.join()

Output:

[2024-05-07T20:54:58.158751+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 1
[2024-05-07T20:54:58.159001+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 2
[2024-05-07T20:54:58.159094+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 3
[2024-05-07T20:54:58.159174+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 4
[2024-05-07T20:54:58.159271+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 5
[2024-05-07T20:54:58.159360+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 6
[2024-05-07T20:54:58.159424+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 7
[2024-05-07T20:54:58.159494+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 1
[2024-05-07T20:54:58.159608+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 8
[2024-05-07T20:54:58.159687+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 9
[2024-05-07T20:54:58.159759+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:18] - Attempting API call 10
[2024-05-07T20:54:58.159954+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 2
[2024-05-07T20:54:58.160357+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 3
[2024-05-07T20:54:59.217019+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 4
[2024-05-07T20:54:59.219066+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 5
[2024-05-07T20:54:59.220607+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 6
[2024-05-07T20:55:00.267008+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 7
[2024-05-07T20:55:00.268272+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 8
[2024-05-07T20:55:00.269378+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 9
[2024-05-07T20:55:01.316729+00:00] | INFO     | [pyratelimiter_test] [test_limiter.py:api_call:20] - Attempted API call 10

My goal is to understand how this code will behave across a cluster. My design has many containers all contending for the same external API, and that external API has a rate limit. I would like workers to sleep when they are rate limited until they can make the API call.
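The behavior I'm after can be sketched as a retry loop. This is only an illustration, not something the library provides: `acquire_blocking` is a hypothetical helper, and `try_once` is a hypothetical callable that makes one non-blocking attempt against the shared bucket and returns True on success.

```python
import time


def acquire_blocking(try_once, max_wait=60.0, pause=0.05,
                     clock=time.monotonic, sleep=time.sleep):
    """Call try_once() until it returns True or max_wait seconds have elapsed."""
    deadline = clock() + max_wait
    while True:
        if try_once():
            return True   # got a slot, the caller may hit the external API
        if clock() >= deadline:
            return False  # gave up, the caller decides what to do next
        sleep(pause)      # back off briefly before trying again
```

Unlike a single sleep followed by one retry, a loop like this tolerates losing the race to another worker: a waiter that wakes up to a full bucket simply goes back to sleep instead of failing.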

phillipuniverse commented 4 months ago

My problem also feels very similar to https://github.com/vutran1710/PyrateLimiter/issues/128

vutran1710 commented 4 months ago

I see your point, but right now I'm quite busy with company work, so this will have to wait.

phillipuniverse commented 4 months ago

Thanks @vutran1710, appreciate the work you put into this library!

My guess is that resolving https://github.com/vutran1710/PyrateLimiter/issues/160 would also fix my issue, since it seems pretty similar to what I'm seeing.

Also, I updated the test case to make it easier to reproduce by removing all the Django pieces.