Feat/custom rates support

sourabpramanik commented 7 months ago

This PR adds custom rates along with default rate limiting. Algorithms supporting custom rate limiting:

[x] Fixed window
[x] Sliding window
[x] Token bucket
[x] Cached fixed window
[x] Multi-region Fixed window
[x] Multi-region Sliding window

Issue: #75

vercel[bot] commented 7 months ago

@sourabpramanik is attempting to deploy a commit to the Upstash Team on Vercel.

A member of the Team first needs to authorize it.

sourabpramanik commented 6 months ago

@off99555 @ogzhanolguncu Updated as per the reviews, moving on to the other algorithms!

enesakar commented 6 months ago

will the old api continue to work?

limiter: Ratelimit.slidingWindow(10, "10 s")

sourabpramanik commented 6 months ago

Yes it will work the same way it does now

ogzhanolguncu commented 6 months ago

This is much better. I still have a couple concerns:

1- Not sure if we should call it payloadSize and payloadLimit I think the names should more closely reflect the problem they're intended to solve, as the current names are too technical. 2- Maybe we should start moving script files to their own folders. 3- Where are test files 😂

sourabpramanik commented 6 months ago

@ogzhanolguncu 1 - I will take a look at it again 2 - you mean to create a script file and read from it? 3- yes I have to write the tests sorry about that🥲

ogzhanolguncu commented 6 months ago

Good 👌🏻. Yeah, something like/lua-scripts/payload-limit-script.ts. I don't know it's just an idea, but I feel like scripts starting to bloat the file. Test are crucial we need to make sure we didn't break anything.

sourabpramanik commented 6 months ago

@ogzhanolguncu How about dataLimit or contentLimit?

ogzhanolguncu commented 6 months ago

@ogzhanolguncu How about dataLimit or contentLimit?

Let's move on for now with other implementations. We can think more about it once everything is done.

sourabpramanik commented 6 months ago

Sure thing 💯

fahreddinozcan commented 6 months ago

Hey @sourabpramanik, just wanted to check in again to make sure that we're on the same page about some function signatures. I think the good thing about this library is that it handles a very specific task very well, and it doesn't add any complexity to it.

In the current proposal, the way to define a new ratelimiter is Ratelimit.fixedWindow(10, "10 s", 400). When I define a ratelimiter for any purpose, it's necessary that I seperate that ratelimit from any other condition in the context. In this case, let's say I have 400 total custom rate tokens and I will be consuming them 10 by 10. Defining an extra rate here actually makes the job harder, because now I have to calculate the maximum request count as well, which is irrelevant in my case. To overcome this, I will make the token count something huge instead of 10.

When I define an algorithm, the token count can take any value, and with the custom rates I can use these resources(tokens) for any purpose with any rate I want. So If I want to ratelimit multiple things at the same time, I think it's better to create multiple ratelimit clients. This is because, ratelimiting both the request count and let's say the total token count in OpenAI completions is not the major case, and they can be seperated.

What I'd suggest is creating the window like Ratelimit.fixedWindow(400, "10s") and limit it with Ratelimiter.limit(identifier, {rate: 100}) for the custom rate limiter. I think the good thing about this library is that, it does a very specific thing very simple without involving any complexity.

I think we should be talking about this major thing explicitly. I'd love to hear your opinion and motivation on this, it feels like it'd be a fruitful discussion

sourabpramanik commented 6 months ago

Hey @sourabpramanik, just wanted to check in again to make sure that we're on the same page about some function signatures. I think the good thing about this library is that it handles a very specific task very well, and it doesn't add any complexity to it.

In the current proposal, the way to define a new ratelimiter is Ratelimit.fixedWindow(10, "10 s", 400). When I define a ratelimiter for any purpose, it's necessary that I seperate that ratelimit from any other condition in the context. In this case, let's say I have 400 total custom rate tokens and I will be consuming them 10 by 10. Defining an extra rate here actually makes the job harder, because now I have to calculate the maximum request count as well, which is irrelevant in my case. To overcome this, I will make the token count something huge instead of 10.

Hey @fahreddinozcan thanks for your review Please let me know if I am wrong, so based on the above quote, when you want to limit the requests based on some extra parameters in that case only you will need to specify the custom rate like this new RateLimit(10, '10s', 400) which means you now have two threshold point for your rate limiter. Whenever any one of the thresholds is met by any of the requests the upcoming requests will be dropped. And of course, if you do not want to rate limit with any custom rate then you do not have to specify anything when you call the client. Just a standard ratelimit

And can you please explain what "I have to calculate the maximum request count" implies here in this context?

sourabpramanik commented 6 months ago

When I define an algorithm, the token count can take any value, and with the custom rates I can use these resources(tokens) for any purpose with any rate I want. So If I want to ratelimit multiple things at the same time, I think it's better to create multiple ratelimit clients. This is because, ratelimiting both the request count and let's say the total token count in OpenAI completions is not the major case, and they can be separated.

I agree it might be easier if you can create two separate clients to handle their tasks and make the implications much clearer.

sourabpramanik commented 6 months ago

What I'd suggest is creating the window like Ratelimit.fixedWindow(400, "10s") and limit it with Ratelimiter.limit(identifier, {rate: 100}) for the custom rate limiter. I think the good thing about this library is that, it does a very specific thing very simple without involving any complexity.

But how are we going to decide that the limit is for 400 requests in a 10sec window or 400 payload size in a 10sec window?

sourabpramanik commented 6 months ago

I think we should be talking about this major thing explicitly. I'd love to hear your opinion and motivation on this, it feels like it'd be a fruitful discussion

The function signature in the current state is not rigid, meaning you can either opt in for a custom rate or opt-out only when you create an instance. The process remains the same for the standard rate limiting.

I appreciate that you took the time to review this whole proposal. Initially, I thought it would be a small feature but as I work on it every day and get a lot of feedback from you guys, it now has reached a whole different aspect for me and I have been researching and reading articles about how to design this new feature that it doesn't break the standard functionality of a rate limiter and reinforces the existing.

I also intend to build this new feature without adding any complexities and redundancy.

fahreddinozcan commented 6 months ago

Hey @sourabpramanik, just wanted to check in again to make sure that we're on the same page about some function signatures. I think the good thing about this library is that it handles a very specific task very well, and it doesn't add any complexity to it. In the current proposal, the way to define a new ratelimiter is Ratelimit.fixedWindow(10, "10 s", 400). When I define a ratelimiter for any purpose, it's necessary that I seperate that ratelimit from any other condition in the context. In this case, let's say I have 400 total custom rate tokens and I will be consuming them 10 by 10. Defining an extra rate here actually makes the job harder, because now I have to calculate the maximum request count as well, which is irrelevant in my case. To overcome this, I will make the token count something huge instead of 10.

Hey @fahreddinozcan thanks for your review Please let me know if I am wrong, so based on the above quote, when you want to limit the requests based on some extra parameters in that case only you will need to specify the custom rate like this new RateLimit(10, '10s', 400) which means you now have two threshold point for your rate limiter. Whenever any one of the thresholds is met by any of the requests the upcoming requests will be dropped. And of course, if you do not want to rate limit with any custom rate then you do not have to specify anything when you call the client. Just a standard ratelimit

And can you please explain what "I have to calculate the maximum request count" implies here in this context?

So let's say I have 1000 tokens, and I will consume them by a rate of 100-150 but it might even g o up to 300. I created the window as fixedWindow(10, "10s", 1000), but I don't really need to rate limit the request count by any means. In this case, in order to not get blocked by the request count I should increase the request count to meaningless high number like 200 etc. If someone wants to ratelimit something with a custom rate, then they should be able to apply the limit based on that custom limit. Even though one can only limit based on request count with a rate of 1, it's not possible to ratelimit purely based on the custom rate. If someone wants to limit based on both the request count and the custom rate, then they'd be free to do so with creating to client.

fahreddinozcan commented 6 months ago

What I'd suggest is creating the window like Ratelimit.fixedWindow(400, "10s") and limit it with Ratelimiter.limit(identifier, {rate: 100}) for the custom rate limiter. I think the good thing about this library is that, it does a very specific thing very simple without involving any complexity.

But how are we going to decide that the limit is for 400 requests in a 10sec window or 400 payload size in a 10sec window?

The thing is, we don't. The tokens are generic things. They don't need to be specific for requests or anything else. User is free to use them for any purpose in their own context. By keeping the token type to at 1 -as it currently is- we would protect this generic feature. If we introduce second token type, the logic goes a bit complex than this library intends to

sourabpramanik commented 6 months ago

I see what you mean now. So having both request rate limit and custom rate limit does not make any sense of having a request rate limit if the custom rate limit is very high. I agree it will be ignored in such a case.

Rather having a different client to only handle the custom rate limit will make more sense. And every client will be responsible for performing one task and one task only.

sourabpramanik commented 6 months ago

What I'd suggest is creating the window like Ratelimit.fixedWindow(400, "10s") and limit it with Ratelimiter.limit(identifier, {rate: 100}) for the custom rate limiter. I think the good thing about this library is that, it does a very specific thing very simple without involving any complexity.

But how are we going to decide that the limit is for 400 requests in a 10sec window or 400 payload size in a 10sec window?

The thing is, we don't. The tokens are generic things. They don't need to be specific for requests or anything else. User is free to use them for any purpose in their own context. By keeping the token type to at 1 -as it currently is- we would protect this generic feature. If we introduce second token type, the logic goes a bit complex than this library intends to

Interesting, so let's just say I have a rate limit of 400 tokens in 10sec window. Now unless I explicitly provide the rate at which these tokens should increment it will always increment by one. If the rate of increment is provided then this will override the default behaviour. Is that what you mean?

sourabpramanik commented 6 months ago

@ogzhanolguncu Can you please look at the test changes I have made?

I have updated the tests which cover both aspects with and without custom rates. Please let me know if that is the right approach or if you need any changes.

sourabpramanik commented 6 months ago

Hey guys, how can I test all the regions and algorithms at once? Because of the limitations on my end, I cannot test multi-region and single-region together and hit the command limits for the day in a single test run.

upstash / ratelimit-js

Feat/custom rates support #79