BerriAI/litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: allow setting duration for parallel requests (e.g. per day, per hour, etc.) #1256

Open · opened 9 months ago by krrishdholakia

krrishdholakia commented 9 months ago

The Feature

Allow setting a time window (duration) for the max parallel requests limit, e.g. per hour or per day.

Motivation, pitch

A user built their own version of this, where the duration was set on a per-day (12 hr) basis and differed from user to user.

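For illustration, a minimal sketch of what per-user durations could look like; the `user_limits` shape, the `duration` key, and `parse_duration` are hypothetical names for this sketch, not existing litellm settings:

```python
# Hypothetical per-user limits; neither this shape nor the "duration" key
# is an existing litellm config option.
user_limits = {
    "user_a": {"max_parallel_requests": 50, "duration": "12h"},  # resets every 12 hours
    "user_b": {"max_parallel_requests": 10, "duration": "1d"},   # resets daily
}

def parse_duration(duration: str) -> int:
    """Convert shorthand like '12h' or '1d' into seconds."""
    units = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    return int(duration[:-1]) * units[duration[-1]]

assert parse_duration("12h") == 43200
```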

krrishdholakia commented 9 months ago

This could then be set as the TTL on the cache: https://github.com/BerriAI/litellm/blob/3026e5aa580c1a7431ffd35b4d80b300e771e29b/litellm/proxy/hooks/parallel_request_limiter.py#L44
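As a rough sketch of the idea (stdlib only, not litellm's actual cache API): keep a counter per user whose TTL is the configured window, so the request count resets when the window elapses. `TTLCounterCache` and `under_limit` are stand-in names for this sketch:

```python
import time

class TTLCounterCache:
    """Minimal in-memory counter where each key expires after its TTL.
    A stand-in for the proxy's cache, not litellm's actual cache class."""

    def __init__(self):
        self._store = {}  # key -> (count, window_expires_at)

    def increment(self, key: str, ttl_seconds: float) -> int:
        now = time.time()
        count, expires_at = self._store.get(key, (0, now + ttl_seconds))
        if now >= expires_at:
            # window elapsed: start a fresh count and a fresh TTL
            count, expires_at = 0, now + ttl_seconds
        count += 1
        self._store[key] = (count, expires_at)
        return count


cache = TTLCounterCache()

def under_limit(user_id: str, max_requests: int, window_seconds: int) -> bool:
    # the configured duration becomes the TTL on the cached counter
    return cache.increment(user_id, window_seconds) <= max_requests

# e.g. at most 100 requests per 12-hour window for this user
print(under_limit("user_a", max_requests=100, window_seconds=12 * 3600))
```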