Describe the need of your request
Prompt caching is a powerful feature that optimizes API usage by letting requests resume from specific, previously processed prefixes in your prompts. This significantly reduces processing time and cost for repetitive tasks or prompts with consistent elements.
While cache write tokens are 25% more expensive than base input tokens, cache read tokens are 90% cheaper than base input tokens.
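For reference, a minimal sketch of what opting in looks like against the Anthropic Messages API via the Python SDK. The model name and beta header come from the docs linked below; the long system prompt is a placeholder standing in for whatever large, repeated context the plugin sends:

```python
import anthropic

# Placeholder for a large, stable prefix; in CodeGPT this could be the
# system prompt plus attached project context repeated across requests.
LONG_SYSTEM_PROMPT = "You are a coding assistant. <large repeated context here>"

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # cache_control writes this block to the cache on the first call;
            # identical prefixes on later calls are read back from the cache.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the attached context."}],
    # Beta header required while prompt caching is in beta.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)

# usage reports cache_creation_input_tokens and cache_read_input_tokens,
# which is where the write/read pricing above shows up.
print(response.usage)
```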
Proposed solution
Allow optional prompt caching for the Anthropic-provided models, and possibly also for the Anthropic models served through the CodeGPT provider.
Additional context
Source: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching