coveooss / spillway

A simple, distributed and flexible rate limiter
MIT License

Have different ways to enforce "rate limiting" vs "quotas" #31

Open malaporte opened 7 years ago

malaporte commented 7 years ago

This is all up for discussion but I'm dropping the idea here so that we can, um, discuss it :)

This library can be used both for limiting the rate of requests based on certain attributes (for protection against misbehaving callers), and for enforcing quotas allocated to certain resources (think customers, individual users, etc.). Those are very similar, but rate limiting is often done on a per-second basis, whereas quotas can use much larger intervals like hours or days. I think we could benefit from using a different implementation for each.

In particular, for rate limiting we might be able to do a proper job without having to actively share data between nodes (using Redis, for example). If we can somehow have a mostly up-to-date value for the number of servers that are actively processing requests (this might prove tricky), we could use an in-memory storage that only remembers the metrics for the current second. Each node would only allow MAXQPS/NODECOUNT queries per second.
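A minimal sketch of that idea, assuming the node count is known: each node keeps only the current second's counters in memory and allows at most `maxQps / nodeCount` requests per second. All names here are illustrative, not spillway's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-node limiter: no shared storage, only the current
// second's counters are kept, capacity is divided across nodes.
public class PerNodeRateLimiter {
    private final int perNodeCapacity;
    private final Map<String, Integer> counters = new HashMap<>();
    private long currentSecond = -1;

    public PerNodeRateLimiter(int maxQps, int nodeCount) {
        this.perNodeCapacity = Math.max(1, maxQps / nodeCount);
    }

    public synchronized boolean tryAcquire(String key) {
        return tryAcquire(key, System.currentTimeMillis() / 1000);
    }

    // Overload with an explicit timestamp so the behavior is easy to test.
    public synchronized boolean tryAcquire(String key, long nowSecond) {
        if (nowSecond != currentSecond) {
            counters.clear(); // only the current second matters
            currentSecond = nowSecond;
        }
        int count = counters.merge(key, 1, Integer::sum);
        return count <= perNodeCapacity;
    }
}
```

The tricky part, as noted above, is keeping `nodeCount` accurate; the limiter itself stays trivial once that number is known.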

Then, since we'd be using shared storage only for quotas, we could space out the intervals where we push data to it, reducing the load on that central server.

Sytten commented 7 years ago

I like the idea. My main concern is whether we want to adapt the lib to distinguish quotas from limits, since we can already do both with limits. For example, if the user somehow determines the number of nodes (a count in Redis, for example) and then creates a per-1s limit in an async storage that syncs every 10s, the limits will never sync (except maybe once, when the cache syncs with Redis).

That being said, I've always thought it would be a good idea for the lib to know about its neighbor nodes so we could build predictive models, so that count could be a first step toward that.

malaporte commented 7 years ago

I know it kinda works with the current implementation, but maybe it's a bit on the heavy side just to limit a rate that is always expressed in "events per second"?

Sytten commented 7 years ago

I thought about it; if we have a good load balancer that would be fine. As for how to implement it, maybe each proxy should have a GUID and a key in Redis that expires after 2 min. Every minute, each proxy resets the timeout, looks for all keys starting with proxy.node.*, and stores the count in memory. Quotas could be a child class of Limit, but it would be the same under the hood, the only differences being that it would not attempt to share the limit and that the capacity would be divided by the number of neighbors when verifying the limit total. @malaporte @pastjean @francistb What do you think?
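The heartbeat scheme above can be sketched roughly as follows. A shared in-memory map stands in for Redis here: in a real deployment each proxy would write its own `proxy.node.<guid>` key with a 2-minute TTL (e.g. via SETEX) and scan for `proxy.node.*` to count neighbors, with Redis handling expiry for free. All names are illustrative, not spillway's API.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical node registry: each proxy refreshes its own key every
// minute; keys not refreshed within 2 minutes are considered expired.
public class NodeRegistry {
    private static final long TTL_MILLIS = 2 * 60 * 1000;
    private final Map<String, Long> sharedStore; // stand-in for Redis
    private final String nodeKey = "proxy.node." + UUID.randomUUID();

    public NodeRegistry(Map<String, Long> sharedStore) {
        this.sharedStore = sharedStore;
    }

    // Called every minute by each proxy to refresh its own key's "TTL".
    public void heartbeat(long nowMillis) {
        sharedStore.put(nodeKey, nowMillis);
    }

    // Counts nodes whose key has not expired (Redis expiry would do this).
    public int activeNodeCount(long nowMillis) {
        sharedStore.entrySet().removeIf(e -> nowMillis - e.getValue() > TTL_MILLIS);
        return sharedStore.size();
    }
}
```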

malaporte commented 7 years ago

The more I think of it the more I realize that getting a good count for the number of active nodes will be tricky. For example, how do you avoid counting nodes that aren't registered in the load balancer? Or the ones behind another load balancer, if using an environment with a hot swap? Etc...

Maybe it's not such a good idea after all :)

Sytten commented 7 years ago

Yeah, maybe we should let users worry about getting the right number of active nodes, since it's specific to each usage of the lib. Still, we could add a "rate" limit that would not be distributed when using an async storage.

GuiSim commented 7 years ago

If we can get the count to be reliable and scalable, I don't think we'll need two libraries.

You guys are testing it with some pretty heavy load at the moment, I think, so you're in a good spot to assess whether we need to split the two approaches or whether we can have a reliable count and thus reliable quotas.

malaporte commented 7 years ago

@Sytten has more details, but implementing reliable rate limiting (down to seconds) in a distributed fashion requires quick synchronization with the shared Redis storage, and this puts too much of a burden on the server. Hence my idea of doing this in a different fashion, without sharing data between nodes...

Sytten commented 7 years ago

Redis can take a lot, but since we use the same Redis for the whole platform it could easily become a point of failure / bottleneck. Our need was to limit the rate of requests, not just the overall count. We figured out that this count could not be distributed, since we want buckets in seconds and that's a no-go at scale. Hence the isDistributed flag you seemed to wonder about in the PR. We want to use the same storage, but not sync the rate limit.

GuiSim commented 7 years ago

Perhaps using a dedicated module like https://github.com/brandur/redis-cell could help?

Redis 4.0 is now available and makes using modules much more straightforward.
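For context, redis-cell exposes a single CL.THROTTLE command implementing the GCRA (a token-bucket-style algorithm) entirely server-side. Roughly, per its README (worth double-checking against the current version):

```
CL.THROTTLE <key> <max_burst> <count_per_period> <period> [<quantity>]

e.g. CL.THROTTLE user123 15 30 60
     allow 30 actions per 60 seconds for user123, with a burst of 15
```

The reply tells the caller whether the action is limited, how many tokens remain, and how long to wait before retrying, which would replace most of the bucket bookkeeping spillway currently does itself.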

malaporte commented 7 years ago

Indeed! I saw the announcement during the weekend, and this looks like a very promising module. It would tie us to Redis though (unless we want to keep on maintaining a separate in-memory implementation), but maybe that's fine.

Sytten commented 7 years ago

I'm not against it, but I'm not sure it's ready for production. Even the module's GitHub page says so. And I agree it would tie us to Redis, unless we use a flag like we currently do in my PR and change the interface for the Redis storage. For now I think we should stick to distributed-or-not and plan this for a next release.