jmnunezizu opened this issue 2 years ago
Hi Jose -
There's nothing like this for RateLimiter yet, but I'd be happy to add something if it makes sense. One approach would be to let you provide a rate limiting function that computes the rate limit each time an execution is attempted, similar to how dynamic delays work with retries:
RateLimiter<HttpResponse> limiter = RateLimiter.smoothBuilder(ctx -> {
  int rate = Integer.parseInt(ctx.getLastResult().getHeader("rate"));
  return Rate.of(rate, Duration.ofSeconds(1));
}).build();
Another alternative would be to allow you to simply set a new rate limit against the policy:
rateLimiter.setRate(10, Duration.ofSeconds(1));
Do you have a preference either way?
It would be good to know: how often do you see the rate changing? Will it change on each execution? Will the maxExecutions and the period change (from the docs), or just one of them?
One of the complexities with changing a rate limiter that's in-use is some threads may be waiting to acquire a permit (sleeping) based on the current rate limiter configuration, and when the configuration changes they may need to wait even longer, or they may actually wait longer than needed.
Hi Jonathan,
Thanks for the quick reply.
I quite like the idea of updating the policy and setting a new rate limiter. This could work pretty well with the flow I have in mind (an OkHttp network interceptor that checks the value of the headers and updates the policy accordingly).
To answer your question, the rate will remain consistent. It will always allow 60 requests per second. However, as mentioned before, the problem is that you never know how many requests you have used in that 60-second window before the first request (application start).
Of course, this is all for a single-instance application. If you have more than one instance running, you'd definitely need to check the response headers to determine how many you've got left. In this situation, being able to update the rate limiter dynamically becomes extremely important.
For existing threads waiting to acquire a permit to execute, I think that could be avoided by ensuring tryAcquirePermit is called with a maxWaitTime, correct?
Thanks, Jose.-
To answer your question, the rate will remain consistent. It will always allow 60 requests per second. However, as mentioned before, the problem is that you never know how many requests you have used in that 60-second window before the first request (application start).
So it sounds like on the server side the accepted rate will be constant, but on the client side the response header will be constantly changing? Presumably this means we may need to constantly update the RateLimiter to react to what the server is reporting, or were you thinking the RateLimiter would only occasionally be updated?
Maybe you could describe how you see it being used. For example, create the RateLimiter with an initial rate of 60 requests per second. Next, you might receive a response that says only 40 requests are remaining. We could update the rate limiter to 40 requests per second at that point, but that's not exactly accurate, since it's really 40 requests left in whatever time period the server is tracking until it resets. In this case does the server provide something like X-Rate-Limit-Reset to indicate how long until the limit resets back to 60? We could use that information to try and get the client-side RateLimiter in sync with the server, except as you mention below, having multiple clients means we may never fully be in sync.
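For illustration, deriving a client-side rate from those two headers could be a simple division. Here's a minimal sketch in plain Java (no Failsafe types; the method name and the even-spreading assumption are mine, not the library's):

```java
public class RateSync {
    /**
     * Derive a client-side rate from what the server reports:
     * 'remaining' requests are allowed until the window resets
     * in 'resetSeconds'. Spreading them evenly gives a smooth rate.
     */
    static double permitsPerSecond(int remaining, long resetSeconds) {
        if (resetSeconds <= 0) {
            return remaining; // window has already reset; full quota is available
        }
        return (double) remaining / resetSeconds;
    }

    public static void main(String[] args) {
        // 40 requests remaining, window resets in 20 seconds -> 2.0 req/sec
        System.out.println(permitsPerSecond(40, 20));
    }
}
```

A rate derived this way always lags the server by at least one response, so with multiple clients it stays eventually consistent at best.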
Of course, this is all for a single-instance application. If you have more than one instance running, you'd definitely need to check the response headers to determine how many you've got left. In this situation, being able to update the rate limiter dynamically becomes extremely important.
Definitely. With multiple clients our client side rate limiter is eventually consistent with the server at best. So there's always a chance we'll attempt a request that the server rejects due to rate limiting. In this case, if a server is already going to reject client side requests, then I wonder what the goal of a client side rate limiter is: to evenly spread out requests (smooth)? I'm not sure a bursty rate limiter is useful since the server already performs that job for us (though I haven't thought about it much yet).
For existing threads waiting to acquire a permit to execute, I think that could be avoided by ensuring the tryAquirePermit is triggered with a maxWaitTime, correct?
Yep, but they could still end up waiting longer than necessary if the RateLimiter is reconfigured to allow a greater rate. The simplest thing to do is accept that any threads waiting under the previous RateLimiter configuration are left alone.
So it sounds like on the server side the accepted rate will be constant, but on the client side the response header will be constantly changing? Presumably this means we may need to constantly update the RateLimiter to react to what the server is reporting, or were you thinking the RateLimiter would only occasionally be updated?
Let's imagine the following scenario:

1. A RateLimiter is configured with 60 req/sec in burst mode.
2. The application is restarted mid-window, and the RateLimiter is started again with a configuration of 60 req/sec.

As you can see, once the application is restarted the RateLimiter configuration will be wrong because there are ~30 requests remaining in that window.
The response obtained by the target API is the one that has the source of truth.
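The restart problem above can be modeled with a toy counter (illustrative only, not Failsafe APIs): the server keeps counting across the window, while the client's view is recreated from scratch.

```java
public class RestartProblem {
    /** Toy model of a fixed rate-limit window tracking used permits. */
    static class Window {
        static final int LIMIT = 60;
        int used = 0;

        void consume(int n) { used += n; }
        int remaining() { return LIMIT - used; }
    }

    public static void main(String[] args) {
        Window server = new Window();
        Window client = new Window();

        // ~30 requests go through before the application restarts
        server.consume(30);
        client.consume(30);

        // the client restarts: its view of the window is rebuilt from scratch,
        // but the server keeps counting until the window actually resets
        client = new Window();

        System.out.println(server.remaining()); // 30
        System.out.println(client.remaining()); // 60 -- wrong until a response header corrects it
    }
}
```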
Maybe you could describe how you see it being used. For example, create the RateLimiter with an initial rate of 60 requests per second. Next, you might receive a response that says only 40 requests are remaining. We could update the rate limiter to 40 requests per second at that point, but that's not exactly accurate, since it's really 40 requests left in whatever time period the server is tracking until it resets. In this case does the server provide something like X-Rate-Limit-Reset to indicate how long until the limit resets back to 60? We could use that information to try and get the client side RateLimiter in sync with the server, except as you mention below, having multiple clients means we may never fully be in sync.
Yes, that's exactly how I see it being used: an interceptor that checks the response headers (max, used, and remaining requests) and updates the RateLimiter accordingly. This approach could also work when using multiple instances of the application against the same target API. There will be a race condition, for sure, but those requests will fail and the RateLimiter will be updated with the rate HTTP headers provided by the API response.
Definitely. With multiple clients our client side rate limiter is eventually consistent with the server at best. So there's always a chance we'll attempt a request that the server rejects due to rate limiting. In this case, if a server is already going to reject client side requests, then I wonder what the goal of a client side rate limiter is: to evenly spread out requests (smooth)? I'm not sure a bursty rate limiter is useful since the server already performs that job for us (though I haven't thought about it much yet).
Perhaps RateLimiter is not what I need, and I should be using a CircuitBreaker to achieve the desired result. Rather than updating a RateLimiter, I could update the state of the CircuitBreaker to open (reject requests) when the remaining requests have reached 0, and sleep for a custom Duration. Do you think this would be a better approach?
I tried a RateLimiter first because, at first glance, it sounds like the most natural option. However, the rate information is returned by the target API, and all my client needs to do is open/close the gate to allow more requests through.
The other thing that I forgot to mention is that a failed request due to the rate being exceeded (i.e. response code 429) also counts as a request. This means that if the Retry policy does not use, for instance, an incremental backoff, it will never recover and the API response will continue to report "0 remaining requests".
Perhaps RateLimiter is not what I need
That's what I'm wondering since the server is already doing the rate limiting for you. The client just needs to follow the response from the server.
Do you think this would be a better approach?
Yea, I think either a CircuitBreaker or a RetryPolicy by themselves might be better. A CircuitBreaker will reject all requests until the breaker is ready to allow executions again. A RetryPolicy would retry them, else fail eventually.
With either policy, you'd start by handling a 429 response:
handleResultIf(response -> response.getStatus() == 429)
If your server returns something like an X-Retry-After
header that tells you exactly how long until requests are accepted again, you can use that as the delay:
withDelayFn(ctx -> Duration.ofSeconds(Long.parseLong(ctx.getLastResult().getHeader("X-Retry-After"))))
Else you'd have to guess at some other delay. For a CircuitBreaker this could be some fraction of the rate limiting period:
withDelay(Duration.ofMillis(500))
For a RetryPolicy this could be a backoff:
withBackoff(Duration.ofMillis(10), Duration.ofSeconds(1))
When using a RetryPolicy, if the request rate is consistently higher than the server will accept, you could end up with too many queued retries, so you'd want to set withMaxRetries to fail a request at some point:
withMaxRetries(3)
Whether to use a CircuitBreaker or a RetryPolicy is up to you.
Hi Jonathan,
Apologies for the late reply, but life got in the way.
In the end, I opted for doing the following:

- A RetryPolicy and a CircuitBreaker.
- Both policies handle the TOO_MANY_REQUESTS (429) response code.

This approach seems to be working quite well for now.
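Assuming Failsafe 3.x and OkHttp, that combination might be wired up roughly like this (a configuration sketch, not a drop-in implementation; the delay and backoff values are placeholders):

```java
RetryPolicy<Response> retry = RetryPolicy.<Response>builder()
    .handleResultIf(r -> r.code() == 429) // TOO_MANY_REQUESTS
    .withBackoff(Duration.ofMillis(10), Duration.ofSeconds(1))
    .withMaxRetries(3)
    .build();

CircuitBreaker<Response> breaker = CircuitBreaker.<Response>builder()
    .handleResultIf(r -> r.code() == 429)
    .withDelay(Duration.ofMillis(500)) // a guess at a fraction of the rate window
    .build();

// Retries wrap the breaker: each retry attempt first passes the breaker's check
Response response = Failsafe.with(retry, breaker)
    .get(() -> client.newCall(request).execute());
```

Policy order matters here: listing the RetryPolicy first makes it the outer policy, so each retry re-consults the breaker.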
Thanks a lot for your input on this issue – much appreciated.
You can close the issue now.
Hi,
I recently started using this library. My apologies if this is somewhere in the docs and I missed it.
My situation:
Problem:
When the application starts, the "remaining" value is unknown until the first response is received. For this reason, I cannot configure a RateLimiter properly.
Question:
Is there a way to update the state of the rate limiter dynamically? I'd like to, on every request/response loop, check for the rate limiter headers and update the rate limiter accordingly.
Would the above be possible, or is there any other way to achieve the same result?
I was also considering making an initial request to the API to get the value before instantiating the RateLimiter.
Thanks!
Best, Jose.-