denisergashbaev closed this issue 1 month ago
LiteLLM provides a rate-limit-aware routing strategy that routes each call to the deployment with the lowest tokens-per-minute (TPM) usage (see https://github.com/BerriAI/litellm/discussions/4510, https://docs.litellm.ai/docs/routing#advanced---routing-strategies-%EF%B8%8F).
How can we configure it in DSPy?
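To make the routing rule concrete, here is a minimal plain-Python sketch of "pick the deployment with the lowest TPM usage that still has headroom". It is an illustration only, not LiteLLM's implementation and not a DSPy API; `Deployment`, `pick_deployment`, and the deployment names and limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical record of one model deployment and its TPM budget."""
    name: str
    tpm_limit: int      # tokens allowed per minute
    tpm_used: int = 0   # tokens already consumed in the current minute


def pick_deployment(deployments, tokens_needed):
    """Return the eligible deployment with the lowest current TPM usage.

    A deployment is eligible if the request would not push it past its limit.
    """
    eligible = [
        d for d in deployments
        if d.tpm_used + tokens_needed <= d.tpm_limit
    ]
    if not eligible:
        raise RuntimeError("all deployments would exceed their TPM limit")
    return min(eligible, key=lambda d: d.tpm_used)


# Made-up deployments, purely for illustration.
deployments = [
    Deployment("azure-east", tpm_limit=10_000, tpm_used=9_500),
    Deployment("azure-west", tpm_limit=10_000, tpm_used=2_000),
    Deployment("openai", tpm_limit=5_000, tpm_used=4_900),
]

chosen = pick_deployment(deployments, tokens_needed=800)
chosen.tpm_used += 800  # record the usage so the next call sees it
print(chosen.name)
```

With these numbers only `azure-west` has room for an 800-token request, so it is selected. A real router would also have to reset the counters every minute and share them across processes, which this sketch leaves out.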
This is essentially a duplicate of https://github.com/stanfordnlp/dspy/issues/1570