Describe the need of your request
Prompt caching is a powerful feature that optimizes API usage by letting requests resume from specific, previously processed prefixes in your prompts. This significantly reduces processing time and cost for repetitive tasks or prompts with consistent elements.
While cache write tokens are 25% more expensive than base input tokens, cache read tokens are 90% cheaper than base input tokens.
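For reference, a minimal sketch of what opting in looks like against the Anthropic Messages API via the Python SDK. The model name and beta header come from the docs linked below; the long system prompt is a placeholder standing in for whatever large, repeated context the plugin sends:

```python
import anthropic

# Placeholder for a large, stable prefix; in CodeGPT this could be the
# system prompt plus attached project context repeated across requests.
LONG_SYSTEM_PROMPT = "You are a coding assistant. <large repeated context here>"

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # cache_control writes this block to the cache on the first call;
            # identical prefixes on later calls are read back from the cache.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the attached context."}],
    # Beta header required while prompt caching is in beta.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)

# usage reports cache_creation_input_tokens and cache_read_input_tokens,
# which is where the write/read pricing above shows up.
print(response.usage)
```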
Proposed solution
Allow optional prompt caching for the Anthropic-provided models, and possibly also for the Anthropic models served through the CodeGPT provider.
Additional context
Source: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching