mjpieters / aiolimiter

An efficient implementation of a rate limiter for asyncio.
https://aiolimiter.readthedocs.io/en/latest/
MIT License

Some of the documentation is misleading and contradictory #240

Closed gregormaclaine closed 2 weeks ago

gregormaclaine commented 4 months ago

I have been looking through this as I need something to control the rate of requests I make to a rate-limited API. Initially this seemed like a good fit for my problem, but after looking deeper into the documentation and after some testing, I've concluded otherwise.

The Issue

Here is the top example from the README:

from aiolimiter import AsyncLimiter

# allow for 100 concurrent entries within a 30 second window
rate_limit = AsyncLimiter(100, 30)

async def some_coroutine():
    async with rate_limit:
        # this section is *at most* going to be entered 100 times
        # in a 30 second period.
        await do_something()

The statement that "this section is at most going to be entered 100 times in a 30 second period" is false due to the nature of the algorithm.

Example

This code is adapted from the bursting section of the documentation (lightly updated so it runs on current Python):

import asyncio, time
from aiolimiter import AsyncLimiter

limiter = AsyncLimiter(4, 8)

async def task(id):
    await asyncio.sleep(id * 0.01)
    async with limiter:
        print(f'{id:>2d}: Drip! {time.time() - ref:>5.2f}')

async def main():
    # asyncio.wait() no longer accepts bare coroutines, so gather them instead
    await asyncio.gather(*(task(i) for i in range(10)))

ref = time.time()
asyncio.run(main())

This code produces the following output:

 0: Drip!  0.00      #  /\
 1: Drip!  0.01      #  |
 2: Drip!  0.02      #  |
 3: Drip!  0.03      #  |- All happen within an 8 second window
 4: Drip!  2.05      #  |
 5: Drip!  4.05      #  |
 6: Drip!  6.05      #  \/
 7: Drip!  8.06
 8: Drip! 10.06
 9: Drip! 12.07
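The drip times above follow directly from the leaky-bucket arithmetic: with `AsyncLimiter(4, 8)` the bucket holds 4 units and leaks one every 8 / 4 == 2 seconds. A quick sketch (plain Python, no aiolimiter needed, and ignoring the staggered 0.01 s startup sleeps; the names `capacity` and `leak_interval` are invented for the illustration) reproduces the schedule:

```python
# Model the expected drip times for AsyncLimiter(4, 8): the first
# `capacity` tasks pass immediately (the burst), then one more task
# passes each time a slot leaks free.
capacity = 4
time_period = 8.0
leak_interval = time_period / capacity  # one slot frees every 2.0 s

expected = [
    0.0 if k < capacity else (k - capacity + 1) * leak_interval
    for k in range(10)
]
print(expected)  # [0.0, 0.0, 0.0, 0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
```

These match the observed output to within scheduling jitter.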

This is taken directly out of the documentation, yet it explicitly contradicts the statement in the README: here seven entries happen within the first 8-second window, not four. Since the documentation describes the intended behaviour, the README should state explicitly that this code does not guarantee a hard cap on the number of requests in a given time window.

Use for Rate Limiting

This is why I found it wouldn't work as a solution for my problem: I need something that strictly prevents more than a certain number of requests being sent out within a one-minute window. I mention this alongside the issue itself because it seems other people have run into the same misunderstanding as me.

Issue #90 runs into the same problem, trying to use this for pretty much the same use case as mine, and issue #200 was also due to this misunderstanding. I would expect a couple of the other issues to stem from it too.

With this in mind, I also suggest that the README should contain a note about rate limiting, mentioning this as a possible dealbreaker for certain problems.

Am I missing anything?

I understand that, according to the documentation, this bursting property of the implementation is expected and correct. However, I would like to ask if there is a way to use this library to accomplish my goal. Specifically, is there a way to use it that does guarantee a maximum number of tasks occur in a given time window?

mjpieters commented 3 months ago

The statement that "this section is at most going to be entered 100 times in a 30 second period" is a false statement due to the nature of the algorithm.

Right, so the wording can be updated, perhaps. The nature of the leaky-bucket algorithm is that it frees capacity at a given rate, but filling the bucket can happen at any rate until the bucket is full. I'll see if I can make that clearer.
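To make the filling/leaking distinction concrete, here is a simplified, synchronous sketch of the bookkeeping. This is an illustration only, not aiolimiter's actual implementation: the `LeakyBucket` class, its method names, and the explicit `now` parameter are all invented for the example.

```python
# Simplified leaky-bucket bookkeeping: the bucket fills by `amount`
# on each successful acquire and leaks at max_rate / time_period
# units per second. Acquires succeed at any rate while the bucket
# has room; they only start failing once it is full.
class LeakyBucket:
    def __init__(self, max_rate: float, time_period: float = 60.0) -> None:
        self.max_rate = max_rate
        self.rate_per_sec = max_rate / time_period
        self.level = 0.0
        self.last_check = 0.0

    def has_capacity(self, now: float, amount: float = 1.0) -> bool:
        # Leak first, based on elapsed time, then test for room.
        elapsed = now - self.last_check
        self.level = max(self.level - elapsed * self.rate_per_sec, 0.0)
        self.last_check = now
        return self.level + amount <= self.max_rate

    def acquire(self, now: float, amount: float = 1.0) -> bool:
        if not self.has_capacity(now, amount):
            return False
        self.level += amount
        return True
```

With `LeakyBucket(4, 8)`, all four acquires at t=0 succeed, a fifth fails, and one more slot frees every 2 seconds, matching the "Drip!" timings quoted in the issue above.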

The documentation is already telling you how to do what you want: set the capacity to 1 and control the rate with the time_period argument. If you are working from an external rate specification like '30 requests in 1 minute', just convert that value to a time-period per request, here that's 60 seconds / 30 requests == 2 seconds per request.
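Spelled out for a "30 requests in 1 minute" spec, the conversion is just (the `AsyncLimiter` call itself is left as a comment so the sketch stays dependency-free):

```python
# Turning an external rate spec into the capacity-1 configuration
# described above: with AsyncLimiter(1, time_period), at most one
# task can hold the limiter per time_period seconds, so requests
# are strictly spaced instead of bursting.
max_requests = 30   # "30 requests ..."
window = 60.0       # "... in 1 minute"
time_period = window / max_requests
print(time_period)  # 2.0 seconds per request

# usage (requires aiolimiter):
#     limiter = AsyncLimiter(1, time_period)
```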

jsjohnst commented 2 weeks ago

but filling the bucket can happen at any rate until the bucket is full

I beg to differ: the code still blows past the full state consistently in the beginning. Let's say I want to rate limit to a max_rate (your terminology) of 20 in a time_period of 60 seconds. Using your code, it will always burst past 20 in the first window, peaking around 30-35, then stabilize at 20. If it's supposed to be a maximum then it needs to be a hard maximum; otherwise call it a "suggested high point" or use some other soft language that excuses the fact your code doesn't do what it says on the tin, i.e. rate limit to an actual maximum.

mjpieters commented 2 weeks ago

I beg to differ: the code still blows past the full state consistently in the beginning. Let's say I want to rate limit to a max_rate (your terminology) of 20 in a time_period of 60 seconds. Using your code, it will always burst past 20 in the first window, peaking around 30-35, then stabilize at 20.

No, it doesn't. The bucket can't go past 20, but there is no limit on how quickly an empty bucket can fill. After 20 tasks have passed the gate the bucket is full, and it then 'leaks' one unit of capacity every 60 / 20 == 3 seconds. So 3 seconds later one task can pass the limiter, then another can pass 3 seconds after that. You are counting those initial 20 in your estimate of the overall rate here, but that's not how the leaky-bucket algorithm works.
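A back-of-the-envelope model shows both effects at once (assuming the bucket starts empty with tasks always waiting, and ignoring scheduling jitter; the real-world first-minute count will be at or below this model's figure, consistent with the 30-35 observed above):

```python
# Model pass times for max_rate=20, time_period=60: a burst of 20
# tasks at t=0, then one task every 60 / 20 == 3 seconds as
# capacity leaks free.
capacity = 20
window = 60.0
leak_interval = window / capacity  # 3.0 s per freed slot

pass_times = [0.0] * capacity + [(i + 1) * leak_interval for i in range(60)]

first_minute = sum(1 for t in pass_times if 0 <= t < window)
second_minute = sum(1 for t in pass_times if window <= t < 2 * window)
print(first_minute, second_minute)  # 39 20
```

The first minute combines the burst with the leaked slots; every subsequent minute settles at exactly 20.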

If that doesn't meet your needs then you perhaps need a different kind of rate limiter.