BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Add flag to disable compression on upstream requests? #3964

Closed: Manouchehri closed this issue 4 months ago

Manouchehri commented 4 months ago

The Feature

Similar to #3958 and #3533, LiteLLM might see a performance boost from disabling gzip on upstream LLM requests, since the proxy would no longer spend CPU decompressing provider responses.

See https://github.com/encode/httpx/discussions/2220#discussion-4063893 for details on how to disable compression in httpx. tl;dr: set the Accept-Encoding: identity header on outgoing requests.
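
For illustration, a minimal sketch of the httpx side of this (the header override comes from the linked discussion; how such a client would be wired into LiteLLM's upstream calls is what this issue is asking about, with litellm.client_session as one plausible but unconfirmed hook):

```python
import httpx

# Override httpx's default Accept-Encoding (gzip, deflate, ...) so upstream
# servers return uncompressed bodies and no client-side CPU is spent inflating them.
client = httpx.Client(headers={"Accept-Encoding": "identity"})

# httpbin echoes request headers back, which confirms the override is actually sent.
resp = client.get("https://httpbin.org/headers")
print(resp.json()["headers"]["Accept-Encoding"])  # -> "identity"
```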

Motivation, pitch

https://medium.com/@juan.deaton/gzip-data-compression-has-a-critical-role-of-increasing-performance-and-smashing-the-monstrous-caae5b4a001c#:~:text=CPUs%20are%20Inefficient%20at%20Performing%20GZIP&text=With%20no%20GZIP%2C%20the%20web,and%20consumes%207x%20more%20energy.

Twitter / LinkedIn details

https://www.linkedin.com/in/davidmanouchehri/

krrishdholakia commented 4 months ago

@Manouchehri can we move these from being issues to discussions?

happy to move this to a feature request if you can confirm you're seeing a perf boost here
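
For anyone wanting to check that, a rough timing sketch (the URL is a placeholder for whichever upstream you care about; network jitter and payload size will dominate small responses, so treat it as a smoke test rather than a benchmark):

```python
import time

import httpx

URL = "https://example.com/v1/models"  # placeholder: substitute the real upstream endpoint

def mean_latency(headers: dict, runs: int = 20) -> float:
    # Fresh client per variant so the two cases don't share a connection pool.
    with httpx.Client(headers=headers) as client:
        start = time.perf_counter()
        for _ in range(runs):
            client.get(URL)
        return (time.perf_counter() - start) / runs

print("gzip:    ", mean_latency({"Accept-Encoding": "gzip"}))
print("identity:", mean_latency({"Accept-Encoding": "identity"}))
```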