karthink / gptel

A simple LLM client for Emacs

Any plans to add prompt caching for Anthropic? #365

Open tavurth opened 2 months ago

tavurth commented 2 months ago

Why?

When you send a request with Prompt Caching enabled:

The system checks if the prompt prefix is already cached from a recent query. If found, it uses the cached version, reducing processing time and costs. Otherwise, it processes the full prompt and caches the prefix for future use.
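
For reference, here's roughly what this looks like on the wire. This is only a sketch of the JSON body described in the Anthropic docs linked below, written as the kind of plist Emacs can hand to `json-encode`; the model ID and prompt text are placeholders, not anything from gptel's internals:

```elisp
;; Rough sketch only: an Anthropic request body with a cache
;; breakpoint, as a plist that `json-encode' turns into JSON.
;; Model ID and prompt text are placeholders, not gptel internals.
(require 'json)

(json-encode
 '(:model "claude-3-5-sonnet-20240620"
   :max_tokens 1024
   :system [(:type "text"
             :text "You are an expert Emacs Lisp assistant."
             ;; Everything up to and including this block becomes the
             ;; cached prefix; "ephemeral" is the only documented type.
             :cache_control (:type "ephemeral"))]
   :messages [(:role "user"
               :content "Refactor the function at point...")]))
;; While caching was in beta, requests also needed the header
;; "anthropic-beta: prompt-caching-2024-07-31".
```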

The pricing works out as follows:

  • Cache write tokens are 25% more expensive than base input tokens
  • Cache read tokens are 90% cheaper than base input tokens
  • Regular input and output tokens are priced at standard rates
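
To put rough numbers on that, assuming Claude 3.5 Sonnet's published base input price of $3 per million tokens (my assumption for the example, not a figure from this thread):

```elisp
;; Back-of-the-envelope cost of a 100k-token prompt prefix, assuming
;; a base input price of $3 per million tokens.
(let ((tokens 100000)
      (usd-per-token (/ 3.0 1e6)))
  (list :uncached    (* tokens usd-per-token)        ; $0.30 on every request
        :cache-write (* tokens usd-per-token 1.25)   ; $0.375 once, to write the cache
        :cache-read  (* tokens usd-per-token 0.10))) ; $0.03 on each later request
;; ≈ (:uncached 0.30 :cache-write 0.375 :cache-read 0.03)
```

So a large prefix costs 25% extra to cache once, and every subsequent request that hits it pays a tenth of the uncached price.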

This is especially useful for:

  • Prompts with many examples
  • Large amounts of context or background information
  • Repetitive tasks with consistent instructions
  • Long multi-turn conversations

Seems like it could be super useful for large codebases where only a small section needs to change each time: you can basically send the whole codebase over on every request and have it served from the cache.

https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#how-prompt-caching-works

CestDiego commented 2 months ago

This would also be perfect for building something similar to Anthropic's Projects feature.

karthink commented 2 months ago

Yup, will add it soon.