Manouchehri opened 2 months ago
That's interesting - why not just have it be a pre-call check that filters out the deployments which violate the conditions? That way it would work across all routing strategies.
We do this today for context-window checks - https://docs.litellm.ai/docs/routing#pre-call-checks-context-window
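The pre-call check idea could be sketched roughly like this - filter the deployment list before routing so that any deployment violating a request-level condition is never considered. This is a hypothetical sketch, not LiteLLM's actual API; the `supports_response_format` flag and `pre_call_filter` helper are illustrative names:

```python
def pre_call_filter(deployments: list[dict], request: dict) -> list[dict]:
    """Return only the deployments that can serve this request.

    Hypothetical pre-call check: if the request asks for a structured
    response_format, drop deployments flagged as not supporting it.
    """
    eligible = []
    for d in deployments:
        if request.get("response_format") and not d.get("supports_response_format", True):
            continue  # this deployment violates the condition; skip it
        eligible.append(d)
    return eligible


# Example deployment list (flags are assumptions for illustration)
deployments = [
    {"model": "vertex_ai/gemini-1.5-pro", "supports_response_format": False},
    {"model": "gemini/gemini-1.5-pro", "supports_response_format": True},
]

request = {
    "messages": [{"role": "user", "content": "hi"}],
    "response_format": {"type": "json_object"},
}

print([d["model"] for d in pre_call_filter(deployments, request)])
# -> ['gemini/gemini-1.5-pro']
```

Because the filter runs before the routing strategy sees the candidates, the same check would compose with any strategy (least-busy, latency-based, etc.), which is the appeal of this approach over per-strategy rules.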
The Feature
For example: if one kind of request comes in, route it to Vertex AI; if another kind comes in, route it to Gemini (AI Studio).
Motivation, pitch
Right now, Vertex AI (not LiteLLM) is pretty broken when using JSON mode with Gemini 1.5 Pro - it throws 500s on the majority of requests. It would be nice if I could use Gemini (AI Studio) instead of Vertex AI only for the requests that use `response_format`.
Twitter / LinkedIn details
https://twitter.com/DaveManouchehri