I have introduced this limiter in my app, but as a consequence requests simply respond slowly once the rate limit is hit, waiting for their turn. That is fine, and the rate limit is properly applied, but what if I would like to time out a request that has been waiting in the queue for 2s?
In an ideal world I would like to return 429 'Too Many Requests' when a request hits some timeout while waiting in the queue.
This would preserve resources on the server (HTTP connections), and the client could implement retry logic after backing off a bit.
Any ideas how to do this with this lib?
In the example below I have set the rate limit to 20 req/s. The load test runs with 20 users at a rate of 1 req/s per user. The moment I cross 20 rps, response time blows up, but there are no errors even though users wait 6s or longer for a response.
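For reference, here is a minimal sketch of the setup I'm describing, assuming the limiter in question is aiolimiter's `AsyncLimiter` (the handler name and app wiring are just illustrative):

```python
from http import HTTPStatus

from aiohttp import web
from aiolimiter import AsyncLimiter

# 20 requests per 1-second window, matching the load test above
rate_limiter = AsyncLimiter(20, 1)

async def handle(request: web.Request) -> web.Response:
    # Every request waits here for its turn; nothing ever fails,
    # latency just grows once the limit is exceeded.
    await rate_limiter.acquire()
    return web.json_response(data={'status': 'ok'})

app = web.Application()
app.add_routes([web.get('/', handle)])
```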
I can do:
```python
if rate_limiter.has_capacity():
    await rate_limiter.acquire()
    # do stuff here
else:
    return web.json_response(
        data={'Rate limit failure': 'Too many requests are sent to service.'},
        status=HTTPStatus.TOO_MANY_REQUESTS,
    )
```
But then this is a really hard limit, and I wondered if I could avoid it by applying some soft queue, so that in this example a request would still be served in the next second, just with increased latency.
Update:
I guess the easiest way to achieve what I want is:
```python
try:
    # Wait for capacity with a timeout (2 seconds here)
    await asyncio.wait_for(rate_limiter.acquire(), timeout=2)
    # do stuff
except asyncio.TimeoutError:
    return web.json_response(
        data={'Rate limit failure': 'Too many requests are sent to service.'},
        status=HTTPStatus.TOO_MANY_REQUESTS,
    )
```
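On the client side, the retry logic mentioned above could look something like this; a rough sketch using aiohttp's client, where the backoff values are illustrative:

```python
import asyncio

import aiohttp

async def fetch_with_retry(url: str, max_retries: int = 3) -> dict:
    async with aiohttp.ClientSession() as session:
        for attempt in range(max_retries + 1):
            async with session.get(url) as resp:
                if resp.status != 429:
                    resp.raise_for_status()
                    return await resp.json()
                # Honour Retry-After if present, otherwise back off exponentially
                delay = float(resp.headers.get('Retry-After', 2 ** attempt))
            await asyncio.sleep(delay)
        raise RuntimeError(f'Still rate limited after {max_retries} retries')
```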