Open dkahale opened 5 hours ago
Thank you for filing this issue.
A few notes on prompt caching:
According to the Aider documentation, Aider organizes the chat history to try to cache:
Shouldn't this mean the /add files should also be cached?
I am not sure how clever aider is about prompt caching. It is only possible to create four different cache blocks, and if anything changes in a block, that block gets invalidated. So if you change even a single character in the edit files block, the whole block gets stored again (at a 25% premium).
This assumes aider is not trying to store every file in its own block, in which case only the first 4 would be stored (probably even fewer, as blocks get consumed for the system prompt and the repo map too).
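To make the invalidation cost concrete, here is a rough Python sketch, not aider's actual logic. It assumes every block is marked as a cache breakpoint, that cache writes are billed at 1.25x and cache reads at 0.1x the base input price (Anthropic's published multipliers), and that caching is prefix-based, so a changed block also invalidates every block after it. The function name and the example blocks are hypothetical.

```python
# Hypothetical cost model for prefix-based prompt caching.
# Multipliers follow Anthropic's published pricing relative to base input tokens.
CACHE_WRITE_MULTIPLIER = 1.25  # the "25% premium" for storing a block
CACHE_READ_MULTIPLIER = 0.10   # discounted price for a cache hit


def effective_input_cost(previous_blocks, current_blocks):
    """Return the input cost of a request in base-token equivalents.

    Each block is a (text, token_count) tuple. A block is a cache hit only
    if it and every block before it are unchanged since the previous request.
    """
    cost = 0.0
    prefix_intact = True
    for index, (text, tokens) in enumerate(current_blocks):
        unchanged = (
            index < len(previous_blocks) and previous_blocks[index][0] == text
        )
        if prefix_intact and unchanged:
            cost += tokens * CACHE_READ_MULTIPLIER
        else:
            # First mismatch breaks the prefix; this and later blocks are rewritten.
            prefix_intact = False
            cost += tokens * CACHE_WRITE_MULTIPLIER
    return cost


# Illustrative blocks only; token counts are made up.
first_request = [("system prompt", 1000), ("repo map", 1000), ("files v1", 4000)]
after_one_edit = [("system prompt", 1000), ("repo map", 1000), ("files v2", 4000)]

print("no change:", effective_input_cost(first_request, first_request))
print("files edited:", effective_input_cost(first_request, after_one_edit))
```

Under these assumptions, editing one character in the files block turns the largest block from a cheap read into a full-price-plus-premium write, which is why where the markers are placed matters.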
In fact, I'd look closely at what is actually happening in the LLM conversation (where those cache_control block markers get set): try starting aider with the additional argument --llm-history-file <file>
and analyze what happened after a while (when you are sure that some prompt caching should have happened).
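A quick way to check the history file is to count how often the marker appears. The Python sketch below is only a plain text search: it assumes the log written by --llm-history-file is a UTF-8 text file and that the cache_control markers show up in it verbatim, neither of which I have verified. The function names and the file name are hypothetical.

```python
from pathlib import Path


def count_cache_markers(text, marker="cache_control"):
    """Count literal occurrences of the marker string in the log text."""
    return text.count(marker)


def summarize_history(path):
    """Read a history file and report its size and marker count."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return {
        "lines": len(text.splitlines()),
        "cache_control_mentions": count_cache_markers(text),
    }


# Example usage (the file name is whatever you passed to --llm-history-file):
# print(summarize_history("llm-history.txt"))
```

If the count is zero after several turns, either no markers are being sent or the log does not record them, so the result is a hint, not proof.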
For more details on how prompt caching works, see https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
Issue
I'm attempting to get the prompt cache to work with OpenRouter using Anthropic's Claude Sonnet 3.5. I'm using the following command to start aider:
aider --model openrouter/anthropic/claude-3.5-sonnet --cache-prompts --cache-keepalive-pings 5 --no-stream
It doesn't seem to cache anything. I've tried adding files as read-only, adding them to the chat, and asking it simple questions about the repo. I still get no information regarding caching, and the token costs appear to be the same even when I check my OpenRouter credit usage.
Has anyone been able to get prompt caching to work with Sonnet 3.5 on OpenRouter yet?
Version and model info
Aider v0.56.0
Main model: openrouter/anthropic/claude-3.5-sonnet with diff edit format, prompt cache, infinite output
Weak model: openrouter/anthropic/claude-3-haiku-20240307
Git repo: .git with 3,794 files
Warning: For large repos, consider using --subtree-only and .aiderignore
See: https://aider.chat/docs/faq.html#can-i-use-aider-in-a-large-mono-repo
Repo-map: using 1024 tokens, files refresh