Aider-AI / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

Prompt caching with open router LLMs #1135

Closed andrewsgeller closed 1 month ago

andrewsgeller commented 1 month ago

Issue

Hi, does prompt caching still work when passing an Anthropic model through OpenRouter, like "--model openrouter/anthropic/claude-3.5-sonnet --cache-prompts"? When I've tested that previously I don't get the little lightning bolt that is shown when prompt caching is active.

Thanks for all the amazing work on this project.

Also, is there any data on the relative performance of different weak model choices for repo map summarization?

Version and model info

0.50.1 with openrouter/anthropic/claude-3.5-sonnet

fry69 commented 1 month ago

OpenRouter currently does not support prompt caching, but this might change in the coming days (no guarantees).

paul-gauthier commented 1 month ago

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.

avimar commented 4 weeks ago

Looks like it's supported now on OpenRouter: https://openrouter.ai/docs/prompt-caching

fry69 commented 4 weeks ago

> Looks like it's supported now on OpenRouter:

There has been no official announcement yet, and no report that it actually works. Also, the LiteLLM database needs to be updated before aider will pick up this change.

paul-gauthier commented 4 weeks ago

Support is available in the main branch. You can get it by installing the latest version from GitHub:

aider --install-main-branch

# or...

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

If you have a chance to try it, let me know if it works better for you.

andrewsgeller commented 4 weeks ago

Wow, this is kind of big news. The unlimited tokens aspect of OpenRouter is really beneficial to those of us who hit the rate limit pretty quickly.

lgandecki commented 1 week ago

I'm confused, @paul-gauthier: did you add prompt caching for Claude in general, or did you make it work with OpenRouter? @andrewsgeller, do you have this working? I'm also confused about what you mean by "unlimited tokens" in this context.

fry69 commented 1 week ago

@lgandecki

aider supports Anthropic prompt caching via OpenRouter since #1397

See also -> https://aider.chat/docs/usage/caching.html

And here -> https://openrouter.ai/docs/prompt-caching

The only limitation is that you won't see the cached token amount via OpenRouter, as this information does not get forwarded currently. But you can check on your activity page that the price for cached requests is significantly lower -> https://openrouter.ai/activity
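
A minimal sketch of enabling it, assuming your key is exported as OPENROUTER_API_KEY (the flags are the same ones from the original report):

export OPENROUTER_API_KEY=sk-or-...   # placeholder for your OpenRouter key
aider --model openrouter/anthropic/claude-3.5-sonnet --cache-prompts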

lgandecki commented 1 week ago

Thanks, maybe it's broken on the client side then: Access to fetch at 'https://openrouter.ai/api/v1/messages?beta=prompt_caching' from origin 'app://obsidian.md' has been blocked by CORS policy: Request header field anthropic-beta is not allowed by Access-Control-Allow-Headers in preflight response.

This worked fine through Anthropic directly, but I hit the daily rate limit, and it looks like switching to OpenRouter doesn't help. Thanks for the reply though! I might try routing the requests through a Node.js server.

fry69 commented 1 week ago

For OpenRouter you do not need to send any Anthropic HTTP headers.
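
For reference, a minimal direct request sketch based on the OpenRouter docs linked above (only the Authorization and Content-Type headers are needed, and the endpoint is OpenRouter's chat completions route, not the Anthropic-style /api/v1/messages path from the error above):

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Hello"}]
  }'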

andrewsgeller commented 1 week ago

Yes, it's working. As fry69 said, you just won't see the cached token counts.

lgandecki commented 1 week ago

oh nice!!!! that's a helpful tip. But now I get: Request header field anthropic-dangerous-direct-browser-access is not allowed by Access-Control-Allow-Headers in preflight response.

and if I disable that, I get an error: Error: It looks like you're running in a browser-like environment.

I got around that, but then it complained about sending 'anthropic-version'. I removed that too, but then I got: POST https://openrouter.ai/api/v1/messages 405 (Method Not Allowed)

And here is where I hit the wall. I guess I will rewrite my code to use a direct fetch instead of the Anthropic API for now. Thank you guys for the help!

(I'm adding these details here for anyone googling the same problem. See the sketch below.)
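
For anyone who lands here via search: per the OpenRouter prompt caching docs linked earlier, the cache breakpoint is not an HTTP header at all; it goes inside a multipart content block in the request body. A rough sketch (the system text is a placeholder for whatever large context you want cached):

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3.5-sonnet",
    "messages": [
      {"role": "system",
       "content": [
         {"type": "text",
          "text": "...large reusable context here...",
          "cache_control": {"type": "ephemeral"}}
       ]},
      {"role": "user", "content": "What changed in this file?"}
    ]
  }'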