BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: `supports_prompt_caching` property for LMs #5776

Closed · neubig closed this issue 1 week ago

neubig commented 1 week ago

The Feature

Currently, Claude and Gemini support prompt caching, but other LMs don't. It would be great to have a `supports_prompt_caching` property (like `supports_vision`, etc.) that tells us which models support prompt caching.
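A minimal sketch of how such a helper might be used, modeled on the existing `litellm.supports_vision` utility; `supports_prompt_caching` and the corresponding model-config flag are assumptions for illustration, not existing litellm API:

```python
import litellm

# Existing capability check in litellm today.
print(litellm.supports_vision(model="gpt-4o"))  # True or False

# Hypothetical helper this issue asks for, mirroring supports_vision.
# It would read a `supports_prompt_caching` flag from the model info map.
def supports_prompt_caching(model: str) -> bool:
    """Return True if the model's config declares prompt caching support."""
    model_info = litellm.get_model_info(model=model)
    return bool(model_info.get("supports_prompt_caching", False))

print(supports_prompt_caching("anthropic/claude-3-5-sonnet-20240620"))
```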

Motivation, pitch

In OpenHands we would like to turn prompt caching on by default, but we can't easily do so without this feature, because if we attempt to use prompt caching with an LM that doesn't support it, the call throws an error.

We tried to work around this by checking for `"claude"` in the model name, but that doesn't work because Claude on GCP apparently doesn't support prompt caching yet.
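For reference, a simplified sketch of that name-based heuristic and where it breaks; the model strings below are illustrative:

```python
def supports_prompt_caching_by_name(model: str) -> bool:
    # Brittle heuristic: assume every Claude model supports prompt caching.
    return "claude" in model.lower()

# Works for Claude via the Anthropic API...
supports_prompt_caching_by_name("anthropic/claude-3-5-sonnet-20240620")   # True

# ...but also reports True for Claude on GCP, where prompt caching is not
# yet available, so enabling it there still raises an error.
supports_prompt_caching_by_name("vertex_ai/claude-3-5-sonnet@20240620")   # True
```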

Twitter / LinkedIn details

No response