When you send a request with Prompt Caching enabled:

1. The system checks whether the prompt prefix is already cached from a recent query.
2. If it is, the cached version is used, reducing processing time and costs.
3. Otherwise, the full prompt is processed and the prefix is cached for future use.
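As a concrete illustration, here is a minimal sketch of a request against the Anthropic Messages API with a cache breakpoint on a large system block. The model id and prompt contents are placeholders; the first call writes the cache, and later calls with the same prefix read from it:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LARGE_CONTEXT = "<imagine thousands of tokens of docs or code here>"

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # example model id
    max_tokens=1024,
    system=[
        {"type": "text", "text": "You are a helpful coding assistant."},
        {
            "type": "text",
            "text": LARGE_CONTEXT,
            # Everything up to and including this block becomes the cached prefix.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Summarize the context above."}],
)

# The usage object reports how the prefix was billed:
# cache_creation_input_tokens on the first call (a cache write),
# cache_read_input_tokens on subsequent calls with the same prefix.
print(response.usage)
```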
- Cache write tokens are 25% more expensive than base input tokens.
- Cache read tokens are 90% cheaper than base input tokens.
- Regular input and output tokens are priced at standard rates.
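To see what those multipliers mean in dollars, here is a tiny cost helper. The $3-per-million base rate is only an illustrative figure (check current pricing for your model), but the 1.25x and 0.10x multipliers follow directly from the percentages above:

```python
# Illustrative base input price; substitute your model's actual rate.
BASE_INPUT_PER_MTOK = 3.00  # USD per million input tokens

def prompt_cost_usd(input_tokens, cache_write_tokens, cache_read_tokens):
    """Writes cost 1.25x base, reads cost 0.10x base, regular input costs 1x."""
    return (
        input_tokens * BASE_INPUT_PER_MTOK
        + cache_write_tokens * BASE_INPUT_PER_MTOK * 1.25
        + cache_read_tokens * BASE_INPUT_PER_MTOK * 0.10
    ) / 1_000_000

# A 100k-token prefix plus a 500-token question, three ways:
print(prompt_cost_usd(100_500, 0, 0))    # no caching:              $0.3015
print(prompt_cost_usd(500, 100_000, 0))  # first turn, cache write: $0.3765
print(prompt_cost_usd(500, 0, 100_000))  # later turns, cache read: $0.0315
```

So the first turn costs about 25% more than an uncached one, and every turn after that costs roughly a tenth as much.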
This is especially useful for:

- Prompts with many examples
- Large amounts of context or background information
- Repetitive tasks with consistent instructions
- Long multi-turn conversations (see the sketch after this list)
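For the multi-turn case, one hedged sketch (assuming the same Anthropic SDK as above, with a placeholder model id) is to move a single cache breakpoint to the newest user turn on every call, so the growing conversation prefix keeps getting cached incrementally:

```python
import anthropic

client = anthropic.Anthropic()
history: list[dict] = []  # full conversation so far, as content-block lists

def ask(question: str) -> str:
    # Only a handful of cache breakpoints are allowed per request, so clear
    # the one from the previous turn before setting a new one.
    for msg in history:
        for block in msg["content"]:
            block.pop("cache_control", None)

    # Mark the newest user turn; everything up to and including this block
    # becomes the cached prefix for the next request.
    history.append({
        "role": "user",
        "content": [{
            "type": "text",
            "text": question,
            "cache_control": {"type": "ephemeral"},
        }],
    })

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # example model id
        max_tokens=1024,
        messages=history,
    )
    reply = response.content[0].text
    history.append({
        "role": "assistant",
        "content": [{"type": "text", "text": reply}],
    })
    return reply
```

Each call then pays cache-write rates only for the newest turns, while the earlier turns are read from cache.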
This seems especially useful for large codebases where only a small section changes between requests: you can send the entire codebase every time, and the unchanged prefix is served from the cache.
https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#how-prompt-caching-works