BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Add flag to disable compression on upstream requests? #3964

Closed: Manouchehri closed this issue 4 months ago

Manouchehri commented 4 months ago

The Feature

Similar to #3958 and #3533, LiteLLM might see a performance boost from disabling gzip on upstream LLM requests, since the proxy would no longer spend CPU decompressing provider responses.

See https://github.com/encode/httpx/discussions/2220#discussion-4063893 for details on how to disable compression in httpx. tl;dr: set the Accept-Encoding: identity header on outgoing requests.
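
For illustration, a minimal sketch of the httpx side of this (the header override comes from the linked discussion; how such a client would be wired into LiteLLM's upstream calls is what this issue is asking about, with litellm.client_session as one plausible but unconfirmed hook):

```python
import httpx

# Override httpx's default Accept-Encoding (gzip, deflate, ...) so upstream
# servers return uncompressed bodies and no client-side CPU is spent inflating them.
client = httpx.Client(headers={"Accept-Encoding": "identity"})

# httpbin echoes request headers back, which confirms the override is actually sent.
resp = client.get("https://httpbin.org/headers")
print(resp.json()["headers"]["Accept-Encoding"])  # -> "identity"
```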

Motivation, pitch

https://medium.com/@juan.deaton/gzip-data-compression-has-a-critical-role-of-increasing-performance-and-smashing-the-monstrous-caae5b4a001c#:~:text=CPUs%20are%20Inefficient%20at%20Performing%20GZIP&text=With%20no%20GZIP%2C%20the%20web,and%20consumes%207x%20more%20energy.

Twitter / LinkedIn details

https://www.linkedin.com/in/davidmanouchehri/

krrishdholakia commented 4 months ago

@Manouchehri can we move these from being issues to discussions?

happy to move this to a feature request if you can confirm you're seeing a perf boost here
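
For anyone wanting to check that, a rough timing sketch (the URL is a placeholder for whichever upstream you care about; network jitter and payload size will dominate small responses, so treat it as a smoke test rather than a benchmark):

```python
import time

import httpx

URL = "https://example.com/v1/models"  # placeholder: substitute the real upstream endpoint

def mean_latency(headers: dict, runs: int = 20) -> float:
    # Fresh client per variant so the two cases don't share a connection pool.
    with httpx.Client(headers=headers) as client:
        start = time.perf_counter()
        for _ in range(runs):
            client.get(URL)
        return (time.perf_counter() - start) / runs

print("gzip:    ", mean_latency({"Accept-Encoding": "gzip"}))
print("identity:", mean_latency({"Accept-Encoding": "identity"}))
```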