Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
[Feature]: Support upstream batching on Vertex AI (for Google Gemini) #3734
Open
Manouchehri opened 6 months ago
The Feature
Basically the same as #3247, but for Vertex AI. Blocked by #3246 as well.
https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini
https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/batch-prediction-api
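For reference, a batch job against the API linked above is created by POSTing a job spec to the `batchPredictionJobs` endpoint. The sketch below builds that request body with the field names from the public docs; the helper name, project, bucket, and model values are placeholders, not anything LiteLLM currently exposes.

```python
import json

def build_batch_prediction_request(
    project: str,
    region: str,
    model: str,
    input_uri: str,
    output_uri_prefix: str,
) -> dict:
    """Hypothetical helper: build the POST body for
    https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/batchPredictionJobs
    per the Vertex AI batch prediction docs linked in this issue.
    """
    return {
        "displayName": "litellm-batch-job",
        # Fully-qualified publisher model resource name (e.g. a Gemini model).
        "model": f"projects/{project}/locations/{region}/publishers/google/models/{model}",
        # Input: one JSONL file of request objects in a GCS bucket.
        "inputConfig": {
            "instancesFormat": "jsonl",
            "gcsSource": {"uris": [input_uri]},
        },
        # Output: predictions written back to GCS as JSONL.
        "outputConfig": {
            "predictionsFormat": "jsonl",
            "gcsDestination": {"outputUriPrefix": output_uri_prefix},
        },
    }

if __name__ == "__main__":
    body = build_batch_prediction_request(
        "my-project", "us-central1", "gemini-1.0-pro",
        "gs://my-bucket/requests.jsonl", "gs://my-bucket/output/",
    )
    print(json.dumps(body, indent=2))
```

A LiteLLM integration would presumably accept OpenAI-format batch requests, translate each line into a Vertex AI instance, upload the JSONL to GCS, and submit a body like this, then poll the job until it completes.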
Motivation, pitch
Cost savings (Vertex AI prices batch prediction below online requests) and better effective rate limits, since batch jobs are queued rather than counted against online request quotas.
Twitter / LinkedIn details
https://twitter.com/DaveManouchehri