Closed andrewsgeller closed 1 month ago
OpenRouter does not currently support prompt caching, but this might change in the coming days (no guarantees).
I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.
Looks like it's supported now on openrouter: https://openrouter.ai/docs/prompt-caching
No official announcement yet, and no report that it actually works. Also, the LiteLLM database needs to be updated for aider to pick up this change.
Support is available in the main branch. You can get it by installing the latest version from github:
aider --install-main-branch
# or...
python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git
If you have a chance to try it, let me know if it works better for you.
Wow, this is kind of big news. The unlimited-tokens aspect of OpenRouter is really beneficial to those of us who hit the rate limit pretty quickly.
I'm confused, @paul-gauthier: did you add prompt caching for Claude in general, or did you make it work with OpenRouter? @andrewsgeller, do you have this working? I'm also confused about what you mean by "unlimited tokens" in this context.
@lgandecki
aider supports Anthropic prompt caching via OpenRouter since #1397
See also -> https://aider.chat/docs/usage/caching.html
And here -> https://openrouter.ai/docs/prompt-caching
The only limitation is that you won't see the cached token counts via OpenRouter, as this information does not get forwarded currently. But you can check on your activity page that the price for cached requests is significantly lower -> https://openrouter.ai/activity
Thanks, maybe it's broken on the client side then: Access to fetch at 'https://openrouter.ai/api/v1/messages?beta=prompt_caching' from origin 'app://obsidian.md' has been blocked by CORS policy: Request header field anthropic-beta is not allowed by Access-Control-Allow-Headers in preflight response.
This worked fine going through Anthropic directly, but I hit the daily rate limit, and it looks like switching to OpenRouter doesn't help. Thanks for the reply though! I might try to route the requests through a nodejs server.
For OpenRouter you do not need to send any Anthropic HTTP headers.
Yes, it's working. As fry said, you just won't see the token counts.
Oh nice, that's a helpful tip! But now I get: Request header field anthropic-dangerous-direct-browser-access is not allowed by Access-Control-Allow-Headers in preflight response.
And if I disable that, I get an error: Error: It looks like you're running in a browser-like environment.
I got around that too, but then it complained about sending 'anthropic-version'. I removed that as well, and then got POST https://openrouter.ai/api/v1/messages 405 (Method Not Allowed).
And here is where I hit the wall. I guess I will rewrite my code to use direct fetch instead of the Anthropic API for now. Thank you guys for the help!
(the details I add here for someone googling the same problem)
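For anyone else landing here: per OpenRouter's prompt-caching docs, OpenRouter exposes an OpenAI-style /api/v1/chat/completions endpoint rather than Anthropic's /v1/messages route (which is consistent with the 405 above), and Anthropic cache breakpoints go inside the message payload via cache_control, not in HTTP headers. Here's a minimal Python sketch of what a direct request could look like; the model name, system text, and helper name are illustrative, not from this thread:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_cached_request(api_key: str, big_context: str, question: str):
    """Build a plain HTTP request to OpenRouter that marks the large,
    stable part of the prompt as cacheable for an Anthropic model."""
    payload = {
        "model": "anthropic/claude-3.5-sonnet",
        "messages": [
            {
                "role": "system",
                "content": [
                    {"type": "text", "text": "You are a helpful assistant."},
                    # cache_control marks this block as an Anthropic
                    # cache breakpoint -- this goes in the body, not a header.
                    {
                        "type": "text",
                        "text": big_context,
                        "cache_control": {"type": "ephemeral"},
                    },
                ],
            },
            {"role": "user", "content": question},
        ],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Note: no anthropic-beta / anthropic-version headers --
            # OpenRouter talks to Anthropic on your behalf.
        },
        method="POST",
    )

# To actually send it (requires a real API key):
# with urllib.request.urlopen(build_cached_request(key, ctx, q)) as resp:
#     print(json.load(resp))
```

This avoids the CORS preflight problem entirely, since none of the Anthropic-specific headers are ever sent.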
Issue
Hi, does prompt caching still work when passing an Anthropic model through OpenRouter, like "--model openrouter/anthropic/claude-3.5-sonnet --cache-prompts"? When I've tested that previously, I don't get the little lightning bolt shown when prompt caching is active.
Thanks for all the amazing work on this project.
Also, is there any data on the relative performance of different weak model choices for repo map summarization?
Version and model info
0.50.1 with openrouter/anthropic/claude-3.5-sonnet