BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/
Other
14.07k stars 1.66k forks source link

fix latency issues on google ai studio #6852

Closed ishaan-jaff closed 6 hours ago

ishaan-jaff commented 16 hours ago

These fixes got google ai studio median latency to ~40-100ms

Pre PR latency was 900ms

Relevant issues

Type

๐Ÿ†• New Feature ๐Ÿ› Bug Fix ๐Ÿงน Refactoring ๐Ÿ“– Documentation ๐Ÿš„ Infrastructure โœ… Test

Changes

[REQUIRED] Testing - Attach a screenshot of any new tests passing locall

If UI changes, send a screenshot/GIF of working UI fixes

[!NOTE] I'm currently writing a description for your pull request. I should be done shortly (<1 minute). Please don't edit the description field until I'm finished, or we may overwrite each other. If I find nothing to write about, I'll delete this message.

vercel[bot] commented 16 hours ago

The latest updates on your projects. Learn more about Vercel for Git โ†—๏ธŽ

Name Status Preview Comments Updated (UTC)
litellm โœ… Ready (Inspect) Visit Preview ๐Ÿ’ฌ Add feedback Nov 21, 2024 4:35pm