coveooss / spillway

A simple, distributed and flexible rate limiter
MIT License
34 stars 12 forks source link

Sliding Window storage #23

Closed Sytten closed 8 years ago

Sytten commented 8 years ago

To avoid big spikes when buckets expire, I wrote a new backend with sliding window bucket. The idea is to divide the limits in fixed-size slices and add them to calculate the total of the window. I must admit that the framework wasn't designed for it and I had to do some tricks. Still I think it is robust and should handle our needs.

The only thing missing is performance. I see a couple of bottlenecks in the InMemorySlidingStorage specially for the computation of the sliding window. As for the network, I reused most the async batch backend. The only difference is that I limited the number of sync slices to avoid to much network load. I decided to only sync the current slice (if the slice time is greater than the sync time) or the last slices (if the slice time is less than the sync time).

GuiSim commented 8 years ago

To avoid big spikes when buckets expire What causes the spike?

Is it because Redis doesn't like having multiple keys expiring at the same time? Or is it because we end up creating a bunch of new keys all at the same time?

Sytten commented 8 years ago

Actually no, the problem is more that all buckets expire at the same time. So clients doing while(1) all start to smash on the platform at the same time. With the sliding window a client hammering our infra will never be able to reach the platform since he will always be blocked before as the bucket are never really reset totally, they just decrease over time.

GuiSim commented 8 years ago

Actually no, the problem is more that all buckets expire at the same time.

Wouldn't a simpler be solution be to add a small offset to the bucket, based on the hash of the key for example?

malaporte commented 8 years ago

If I remember correctly we discussed this at the start of the project, and in the end we decided that it wasn't needed because you can protect against "spikes" using a limit on a short interval (ex: max X events / sec) combined with a limit on a larger bucket such as X events / hour.

I do think having a true sliding window does give a "smoother" behavior but I'm hard pressed in finding real use cases where it's really important. Thoughts?

Sytten commented 8 years ago

@GuiSim It could help to do that. It was another proposed solution. @malaporte Yes, but I tested it and it didn't work that well. I didn't found a way to allow normal usage and still have a small enough limit so that one computer can exceed the limit. Plus @pastjean needed it so he might talk more about it.

Sytten commented 8 years ago

We have a meeting this afternoon to see which implementation we really need and if we need both, how to integrate them better. I also agree that it is getting quite complex and it was not design for it...

Sytten commented 8 years ago

After thought I put this feature on hold for now. I will do some bugfix and release a V1.0.1. If we ever need it we will have the code.