boinkor-net / governor

A rate-limiting library for Rust (f.k.a. ratelimit_meter)
https://github.com/boinkor-net/governor
MIT License

[QUESTION] Using governor for HTTP Rate Limiting weirdness #39

Open Unaidedsteak opened 4 years ago

Unaidedsteak commented 4 years ago

Hello!

Thanks for writing and providing this library, it's been a big help! I had a few questions about how it works and wondered if you had any suggestions for my problem.

  1. On first use of this rate limiter, it seems to allow more requests than the defined rate limit, e.g. with a limit of 10 rp/s, around 15ish rp/s make it through for the first few seconds of sending requests.

  2. One idea I had was to allow "hot configuration" of the rate limiter so I can change the rate-limit value on the fly. I'm currently doing this by creating a new RateLimiter with the new quota and using an RwLock to ensure all threads can safely use the new rate limiter. Similar to issue 1, when this happens a large number of extra requests make it through on top of the rate limit.

Extra Context:

I appreciate this could just be due to my implementation, I just wondered if you had any ideas. Thanks!

antifuchs commented 4 years ago

Hi! Thanks for the kind words, I'm glad you find this library useful (:

On first use of this rate limiter, it seems to allow more requests than the defined rate limit, e.g. with a limit of 10 rp/s, around 15ish rp/s make it through for the first few seconds of sending requests.

I suspect you're running into the default for burst capacity on an "empty" rate limiter. By default, rate limiting quotas are created with a burst size of however many units are allowed through per time unit (e.g., a limiter allowing 10 rps through has a burst size of 10). That means that a "fresh" rate limiter allows 10 requests through immediately, all without pause. That would explain the skewed average counts you report: Once burst size is used up, clients will have to wait; but by then there have been a possibly-unfair 10 clients who got their request served immediately.

If you want to limit the burst capacity & shape the traffic on un-exercised rate limiters, you can use the .allow_burst method on Quotas: By setting this to 1, you can effectively turn off burst capacity, ensuring a very smooth average.
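For illustration, here's a minimal sketch of the difference, using Quota::per_second and RateLimiter::direct (the unkeyed constructor); the 10 rp/s figure comes from the example above:

```rust
use std::num::NonZeroU32;
use governor::{Quota, RateLimiter};

fn main() {
    // Default: burst capacity equals the per-second rate, so a fresh
    // limiter admits 10 requests immediately, without pause.
    let bursty = RateLimiter::direct(Quota::per_second(NonZeroU32::new(10).unwrap()));

    // Capping the burst at 1 smooths the traffic: at most one request
    // is admitted per 100ms replenishment interval.
    let smooth = RateLimiter::direct(
        Quota::per_second(NonZeroU32::new(10).unwrap())
            .allow_burst(NonZeroU32::new(1).unwrap()),
    );

    assert!(bursty.check().is_ok());
    assert!(smooth.check().is_ok());
    // A second immediate check against the smoothed limiter is denied.
    assert!(smooth.check().is_err());
}
```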

In case you're wondering why burst capacities work the way they do with regards to check_n, https://github.com/antifuchs/governor/issues/16#issuecomment-589497803 and the follow-up comments are good places to read up on the history.

One idea I had was to allow "hot configuration" of the rate limiter so I can change the rate-limit value on the fly. I'm currently doing this by creating a new RateLimiter with the new quota and using an RwLock to ensure all threads can safely use the new rate limiter. Similar to issue 1, when this happens a large number of extra requests make it through on top of the rate limit.

Oooh, that's an interesting idea. I've been thinking about something along the same lines, but haven't gotten around to implementing a reconfiguration scheme yet - it would probably need a lock, but can likely be made pretty lightweight. The way you're doing it right now with an RwLock seems like the best approach for the moment, if you need the ability to reconfigure.
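To make the caveat concrete, here's a rough sketch of that RwLock swap pattern (assuming a direct, unkeyed limiter); note that the freshly built limiter starts "empty", which is where the extra burst of requests comes from:

```rust
use std::num::NonZeroU32;
use std::sync::{Arc, RwLock};
use governor::{Quota, RateLimiter};

fn main() {
    let quota = Quota::per_second(NonZeroU32::new(10).unwrap());
    // Shared, swappable limiter: every check takes the read lock.
    let limiter = Arc::new(RwLock::new(RateLimiter::direct(quota)));

    // Hot path: cheap shared access through the read lock.
    let allowed = limiter.read().unwrap().check().is_ok();
    println!("request allowed: {allowed}");

    // Reconfiguration: build a fresh limiter and swap it in wholesale.
    // The new limiter's full burst capacity is available immediately,
    // which produces the spike of extra requests described above.
    let new_quota = Quota::per_second(NonZeroU32::new(5).unwrap());
    *limiter.write().unwrap() = RateLimiter::direct(new_quota);
}
```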

Depending on your traffic shaping requirements, you might need a way to carry the last rate limiting state over: That's currently unimplemented, but could probably be done. Let me know if that's something your system needs (:

Hope that answers your questions!

Unaidedsteak commented 4 years ago

Hey! Thanks so much for getting back to me :)

I can confirm, using .allow_burst(NonZeroU32::new(1).unwrap()) seems to have totally fixed both of my issues!

Regarding your last point, I think that would be very useful, especially once I go down the route of using a keyed store to rate limit based on IP or a header. I assume the intention would be to use something similar to the into_state_store(self) method, which already exists, but then provide a way to construct a new rate limiter using that store?

antifuchs commented 4 years ago

I assume the intention would be to use something similar to the into_state_store(self) method, which already exists, but then provide a way to construct a new rate limiter using that store?

That's about right, yeah: The idea was to at least provide a mildly plausible way to surmount the 500-ish-year useful lifetime of a rate limiter, but I never fully followed through on this. I'll see if I can add something like this soon.
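A rough sketch of what that carry-over could look like, assuming the into_state_store method mentioned above (present in governor 0.3); the with_state_store constructor is invented here for illustration and does not exist in the library:

```rust
use std::num::NonZeroU32;
use governor::{Quota, RateLimiter};

fn main() {
    let old = RateLimiter::direct(Quota::per_second(NonZeroU32::new(10).unwrap()));

    // `into_state_store` consumes the limiter and yields its internal state...
    let store = old.into_state_store();

    // ...but there is no public constructor yet that accepts it back.
    // A hypothetical `with_state_store` would close the loop:
    //
    // let new = RateLimiter::with_state_store(
    //     Quota::per_second(NonZeroU32::new(20).unwrap()),
    //     store,
    // );
    let _ = store;
}
```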

If you don't mind my asking, can you give me details on your use case for adjusting the parameters of a rate limiter? How often do you expect them to change, and how long do you expect your processes to live?

Unaidedsteak commented 4 years ago

If you don't mind my asking, can you give me details on your use case for adjusting the parameters of a rate limiter? How often do you expect them to change, and how long do you expect your processes to live?

So I'm building a simple HTTP rate limiter. The idea is that you'd have multiple rate limiters (separate binaries in this case) spread across a bunch of machines that you can scale up and down; they report into a controller with an API/frontend. The use case is to let people dynamically update the rate limits and upstreams without (within reason) reconfiguring and restarting the individual binaries, basically giving you near-realtime traffic control. I'd like to be able to treat them like your average nginx load-balancer setup: a long-lived, constant set of rate limiters, with the ability to scale up and down as needed.

antifuchs commented 4 years ago

Wow, that's a super cool use case! I'll try and think of something. If you come up with any cool techniques that would help you, feel free to drop ideas here (:

lytefast commented 3 years ago

As someone else using this library in production services, I would love to have this feature as well. Dynamically changing rate limits is something we need for our service when issues arise.

Another feature I would love to have is the ability to "peek" at the current rate limits, e.g. to see the top keys that are being rate limited. This information is useful for debugging. The only way this data is exposed now is by consuming/destroying the limiter via the RateLimiter.into_state_store call and recreating it, as was mentioned. This is not ideal, and it requires more RW locking than necessary given that I really just need an estimate, so races aren't a large factor here.

Happy to open up PRs to help along these lines. It seems like exposing the internal StateStore (as an iterator) would be sufficient here. However, it would require the caller to understand the underlying implementation, which I believe len already requires: https://docs.rs/governor/0.3.2/governor/struct.RateLimiter.html#method.len
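For reference, a small illustration of the surface that exists today: a keyed limiter exposes len, but not which keys are currently being limited (the IpAddr key type here is just an example):

```rust
use std::net::IpAddr;
use std::num::NonZeroU32;
use governor::{Quota, RateLimiter};

fn main() {
    // Keyed limiter, e.g. keyed by client IP.
    let limiter = RateLimiter::keyed(Quota::per_second(NonZeroU32::new(10).unwrap()));

    let a: IpAddr = "10.0.0.1".parse().unwrap();
    let b: IpAddr = "10.0.0.2".parse().unwrap();
    let _ = limiter.check_key(&a);
    let _ = limiter.check_key(&b);

    // `len` reports how many keys the state store is tracking, but not
    // which of them are currently being rate limited.
    println!("tracked keys: {}", limiter.len());
}
```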

antifuchs commented 3 years ago

Sorry I took so long to reply to this - I've been thinking about it, though (I'm not finished thinking about it, but I think I can checkpoint where my thoughts are so far).

There are a few reasons why I'm reluctant to just go ahead & enable changing the parameters on a rate limiter.

The minor reason is that it's hard to come up with a way to do so that treats the existing state fairly under the GCRA algorithm. That is, even changing the limits to be more permissive means that anyone currently being rate-limited will continue to be until their next request is acceptable again. That might be fine, though.

The major reason this is difficult is that changing the parameters of a rate limiter is tricky to do in parallel with regular rate-limiter usage. The way it currently works, you can't change them, so we don't need any concurrency guards: no mutexes, not even atomic integers representing those parameters. We could add those, but then every rate-limiting access would require a lock, or another read of an atomic variable; I'm somewhat reluctant to unilaterally impose that on folks who don't need parameters to change.

Of course, as you noticed, the .into_state_store way is not super viable either: while that is running, you hold an RW lock and the limiter is unusable for concurrent access.

So, I guess we could add the definition of those parameters as a generic argument (defaulting to immutable, but with the possibility to exchange it for another implementation, e.g. mutex-guarded values, or even one that accesses a database with an in-memory cache). Would that be interesting to you all?
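To spell out the generic-argument idea, here is a hypothetical sketch (none of this is governor API today): the quota source becomes a trait, with an immutable, zero-overhead default and an opt-in lock-guarded variant.

```rust
use std::sync::RwLock;
use governor::Quota;

// Hypothetical trait: where a rate limiter reads its parameters from.
trait QuotaSource {
    fn quota(&self) -> Quota;
}

// Default: parameters are fixed at construction time, so reading them
// costs nothing in synchronization.
struct Immutable(Quota);

impl QuotaSource for Immutable {
    fn quota(&self) -> Quota {
        self.0.clone()
    }
}

// Opt-in: every read takes a lock, a cost paid only by users who need
// hot reconfiguration.
struct Mutable(RwLock<Quota>);

impl QuotaSource for Mutable {
    fn quota(&self) -> Quota {
        self.0.read().unwrap().clone()
    }
}
```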

Unaidedsteak commented 3 years ago

That sounds cool for sure! For what it's worth, my project was purely for fun and not used in the real world (as of yet 😉), so I wouldn't be too concerned about implementing anything if nobody else wants it or if it would impose costs on other users 😄