upstash / ratelimit-js

Rate limiting library for serverless runtimes
https://ratelimit-with-vercel-kv.vercel.app
MIT License

custom rates support #75

Closed enesakar closed 6 months ago

enesakar commented 9 months ago

In the ratelimit SDK, is there a way to consume at different rates? For example, with the OpenAI API I want to allow 100 words per hour, so I need to count the words in the prompt and consume that many. Something like: ratelimit.limit(identifier, wordCount)

github-actions[bot] commented 8 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 30 days.

ogzhanolguncu commented 8 months ago

bump

sourabpramanik commented 7 months ago

@enesakar @ogzhanolguncu I was able to build this feature. So far I have only done it for the fixed window algorithm; I tested it and it works fine. But I want to know how many such rates we can expect from a user: can a user add only one extra rate (like the one you mentioned), or can there be many of them?

What I am planning to do is add an object of custom rates, each with a unique name and a max value, so the instance will look like this:

const ratelimit = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.fixedWindow(10, "60s", {
    words: 100,
  }),
});

Within the given window, you can only send prompts totaling at most 100 words. To apply the limit, we can do something like this:

export default async function Home() {
  const ip = headers().get("x-forwarded-for");
  // Proposed third argument: how much to consume from each named custom rate.
  const { success, limit, remaining, reset } = await ratelimit.limit(
    ip ?? "anonymous",
    undefined,
    {
      words: 20,
    }
  );
  // ...
}

This will increment the word count by 20 on every request (in practice the value will be dynamic, based on the prompt).

What do you think? If there is a better approach, please let me know.

PS: Refactored the snippets and removed array args

ogzhanolguncu commented 7 months ago

I feel like I still don't understand the practicality of this. What problem are we exactly trying to solve? @sourabpramanik, can you walk me through your example case so I can have a better understanding of the situation?

enesakar commented 7 months ago

In AI APIs the quotas are per token, not per request. I need to limit the rate by token count, so each request has a different token count (weight). Currently this is not possible, because each request counts as one.
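To make the request concrete, here is a minimal in-memory sketch of a fixed window where each call consumes a caller-supplied weight instead of always 1. It is not the @upstash/ratelimit implementation: the class name `WeightedFixedWindow`, the `cost` parameter, and the `countWords` helper are all hypothetical and exist only for illustration.

```typescript
// Hypothetical in-memory sketch; NOT the @upstash/ratelimit implementation.
type LimitResult = { success: boolean; limit: number; remaining: number; reset: number };

class WeightedFixedWindow {
  private readonly maxTokens: number;
  private readonly windowMs: number;
  private readonly now: () => number;
  // identifier -> window index and tokens consumed in that window
  private readonly usage = new Map<string, { window: number; used: number }>();

  constructor(maxTokens: number, windowMs: number, now: () => number = Date.now) {
    this.maxTokens = maxTokens;
    this.windowMs = windowMs;
    this.now = now;
  }

  // `cost` is the weight of this request, e.g. its word or token count.
  limit(identifier: string, cost: number = 1): LimitResult {
    const window = Math.floor(this.now() / this.windowMs);
    const reset = (window + 1) * this.windowMs; // timestamp at which the window ends
    const entry = this.usage.get(identifier);
    const used = entry !== undefined && entry.window === window ? entry.used : 0;

    if (used + cost > this.maxTokens) {
      // Assumption of this sketch: a rejected request consumes nothing.
      return { success: false, limit: this.maxTokens, remaining: this.maxTokens - used, reset };
    }
    this.usage.set(identifier, { window, used: used + cost });
    return { success: true, limit: this.maxTokens, remaining: this.maxTokens - used - cost, reset };
  }
}

// Counts whitespace-separated words in a prompt.
function countWords(prompt: string): number {
  const trimmed = prompt.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Usage: allow 100 words per hour and charge each prompt by its word count.
const limiter = new WeightedFixedWindow(100, 60 * 60 * 1000);
const result = limiter.limit("user-1", countWords("write a haiku about rate limits"));
```

Whether a rejected request should still consume part of the budget is a design choice; the sketch leaves the counter untouched on rejection.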

sourabpramanik commented 7 months ago

Sure. We are already rate-limiting on a single factor: the number of requests. Now we want custom limiters as well, for example word counts. These custom limiters can be of any type, with different rates and limits, and a user can define as many as they want. On each request, every counter increases by its own amount, and whichever factor hits its limit first in the given window blocks further requests.

I have modified the fixed window algorithm for now so that it works for these cases. If you want, I can raise a PR so you can understand it better.
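The "whichever rate hits its limit first blocks the request" behavior described above can be sketched as follows. This is an in-memory illustration under stated assumptions, not the code from the PR: `MultiRateFixedWindow`, the `Rates` type, and the `exhausted` field are hypothetical names.

```typescript
// Hypothetical sketch of named custom rates on a fixed window, kept in memory.
type Rates = Record<string, number>;
type MultiRateResult = { success: boolean; exhausted: string[]; remaining: Rates };

class MultiRateFixedWindow {
  private readonly limits: Rates;
  private readonly windowMs: number;
  private readonly now: () => number;
  // identifier -> window index and per-rate consumption in that window
  private readonly usage = new Map<string, { window: number; used: Rates }>();

  constructor(limits: Rates, windowMs: number, now: () => number = Date.now) {
    this.limits = limits;
    this.windowMs = windowMs;
    this.now = now;
  }

  // `amounts` maps rate names to what this request consumes, e.g. { requests: 1, words: 20 }.
  limit(identifier: string, amounts: Rates): MultiRateResult {
    const window = Math.floor(this.now() / this.windowMs);
    const entry = this.usage.get(identifier);
    const used: Rates = entry !== undefined && entry.window === window ? { ...entry.used } : {};

    // First pass: collect every rate this request would push over its limit.
    const exhausted: string[] = [];
    for (const [name, max] of Object.entries(this.limits)) {
      if ((used[name] ?? 0) + (amounts[name] ?? 0) > max) exhausted.push(name);
    }

    // Second pass: consume only if no rate is exhausted, so a rejection has no side effects.
    if (exhausted.length === 0) {
      for (const name of Object.keys(this.limits)) {
        used[name] = (used[name] ?? 0) + (amounts[name] ?? 0);
      }
      this.usage.set(identifier, { window, used });
    }

    const remaining: Rates = {};
    for (const [name, max] of Object.entries(this.limits)) {
      remaining[name] = max - (used[name] ?? 0);
    }
    return { success: exhausted.length === 0, exhausted, remaining };
  }
}

// Usage: 10 requests and 100 words per 60 seconds; each call reports what it consumes.
const multi = new MultiRateFixedWindow({ requests: 10, words: 100 }, 60_000);
const outcome = multi.limit("anonymous", { requests: 1, words: 20 });
```

Checking all rates before consuming any of them keeps a rejected request from partially draining the other counters. A Redis-backed version would need to do both passes atomically, for instance in a single script.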

sourabpramanik commented 7 months ago

Sorry for the delay. @enesakar @ogzhanolguncu, please review the PR and give your feedback.

offchan42 commented 7 months ago

A good use case for this feature is subtracting credits from a user's account. Currently the limit function always subtracts 1 credit from the account. @sourabpramanik Looking at the API proposal above, I think it's a bit complicated. Is it possible for the API to simply ask for a number (e.g. credits) to subtract? That would be the cleanest API for me. You might want to subtract 1.2 credits sometimes, or sometimes 0.5, or sometimes 10. If floats are not allowed, ints are also fine.

sourabpramanik commented 7 months ago

Hey @off99555, by credits do you mean tokens? If so, you specify the max tokens when you initialize the Ratelimit, and each limit call subtracts tokens; the amount can be dynamic and does not have to be one token. I still have to check the case of fractional tokens; rounding may be the better option. I hope I understood you correctly.

Check out the example usage: https://github.com/sourabpramanik/upstash-ratelimit/blob/64dd2c6273c369bac431a6ab278b0626ad1127ff/examples/with-vercel-kv/app/page.tsx#L8C1-L26C1

offchan42 commented 7 months ago

@sourabpramanik Can the API be like this? I think we just need a number and no words.

const { success, limit, remaining, reset } = await ratelimit.limit(
  ip ?? "anonymous",
  undefined,
  20, // a custom deduction
);

Or instead of the number, it can also be an object {subtract: 20} to allow for future change in the API.
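One way to accept both forms suggested here, a bare number or an object, is to normalize them in a single helper. The sketch below is an illustration only: `Deduction` and `resolveDeduction` are hypothetical names, `subtract` is the field name proposed in this thread rather than a shipped option, and rounding fractional amounts up is an assumption.

```typescript
// Hypothetical option shape; `subtract` is the name suggested in the thread, not a shipped API.
type Deduction = number | { subtract: number };

// Normalizes either form to a positive whole number of tokens (default: 1).
function resolveDeduction(deduction: Deduction = 1): number {
  const amount = typeof deduction === "number" ? deduction : deduction.subtract;
  if (!Number.isFinite(amount) || amount <= 0) {
    throw new RangeError(`deduction must be a positive, finite number, got ${amount}`);
  }
  // Assumption: counters are integers, so fractional costs are rounded up.
  return Math.ceil(amount);
}

// Usage: both call styles resolve to the same deduction.
const fromNumber = resolveDeduction(20);
const fromObject = resolveDeduction({ subtract: 20 });
```

The object form leaves room for extra fields later without changing the function signature, which is the future-proofing argument made above.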

sourabpramanik commented 7 months ago

Yes, we can create two variants. One implements a single custom rate without any identifier, since there is only one custom rate, just as you mentioned. The other supports multiple custom rates, which need identifiers because we have to track when each rate's tokens are exhausted. What do you think?

sourabpramanik commented 7 months ago

I think having multiple custom rates is very expensive, because on every request we have to loop over the rates and execute the script for each one to deduct its tokens, until one of them is exhausted.

ogzhanolguncu commented 7 months ago

I believe we should avoid overcomplicating the API. The initial implementation proposal was fine. Also, let's add a more concrete example to ensure we are all on the same page, preferably one using OpenAI chat completions or embeddings creation.

offchan42 commented 7 months ago

Yes, the API should be simple. For my use case, I simply want a way to subtract different amounts in my AI image generation app. For example, if the user generates a big image, I want it to cost 2 tokens; if it's a small image, 1 token. Something like that. No need for named custom rates. I just need a way to send an arbitrary number of tokens to the limit function.

sourabpramanik commented 7 months ago

Yes, I agree. I will rework the API and let you all know.

ogzhanolguncu commented 7 months ago

@off99555 I like the idea. Yeah, users definitely should be able to pick their own subtraction rates.

sourabpramanik commented 7 months ago

@ogzhanolguncu @off99555 I have updated the API and the example app. Could you please check whether this works, so that I can then implement the same logic in the rest of the algorithms and the multi-region algorithms?

enesakar commented 7 months ago

@sourabpramanik, can you please share how the end user API will change? Perhaps you can commit the new README so we can easily see the new user API.

This is a must-have feature for AI applications (due to tokens), so your effort is greatly appreciated!

sourabpramanik commented 7 months ago

Yes, definitely, I will do that.

sourabpramanik commented 6 months ago

@enesakar As mentioned, I have added a small doc describing how the API will change.

enesakar commented 6 months ago

@sourabpramanik I added a comment.