Closed almson closed 5 months ago

I am experimenting with Opus, and my input token usage seems very high considering that I've only sent the main data once and then used "llm -c". Does this plugin keep a conversation going, or does it re-send the whole prompt each time?
It does, and it does so for each message in the chat. This is how all LLMs work: they need the whole conversation in the context in order to see it; they don't have any memory of their own.
Yes, it re-sends the whole prompt - as do most of the other LLM plugins.
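To make that concrete, here is a minimal sketch of a two-turn exchange against the Anthropic Messages API (the model name, prompts, and token counts are illustrative, not taken from this issue). Every request carries the full history in `messages`, so the codebase tokens are billed as input on each turn, not just the first:

```python
import anthropic  # pip install anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The API is stateless: there is no server-side session, so each call
# must include the entire conversation so far.
history = [{"role": "user", "content": "Here is my codebase: ..."}]

reply = client.messages.create(
    model="claude-3-opus-20240229",  # placeholder model name
    max_tokens=1024,
    messages=history,
)
history.append({"role": "assistant", "content": reply.content[0].text})

# The follow-up re-sends everything above, so input tokens are billed
# again for the whole history, not just the new question.
history.append({"role": "user", "content": "Now refactor the parser module."})
reply2 = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=history,
)
print(reply2.usage.input_tokens)  # grows with every turn
```

This is effectively what `llm -c` is doing: it replays the logged conversation into the next request.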
Yeah... that doesn't matter. It's about how the API is built and how billing is calculated. ChatGPT has a session-based API, but it seems Claude doesn't.
That's very unfortunate, especially when I'm trying to work with e.g. a 100k-token codebase. The expense becomes unmanageable. I'll have to stick to the web UI for now.
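For a back-of-the-envelope sense of scale (the per-token price below is a placeholder rate for an Opus-class model, not a quoted Anthropic price): re-sending a 100k-token codebase on every turn re-bills those tokens as input each time.

```python
# Hypothetical numbers: a 100k-token codebase prompt and an
# illustrative $15 per million input tokens.
CODEBASE_TOKENS = 100_000
PER_TURN_TOKENS = 500          # question + answer carried forward, roughly
PRICE_PER_M_INPUT = 15.0       # dollars per million input tokens

total_input = 0
for turn in range(1, 11):
    # Each request re-sends the codebase plus everything said so far.
    total_input += CODEBASE_TOKENS + PER_TURN_TOKENS * (turn - 1)

print(f"{total_input:,} input tokens over 10 turns")           # 1,022,500
print(f"~${total_input / 1_000_000 * PRICE_PER_M_INPUT:.2f}")  # ~$15.34
```

In other words, ten follow-up questions against the same codebase cost roughly ten times what the first one did.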
I'm sorry, I had a wrong idea about the OpenAI and Anthropic APIs. Unfortunately, it means that if I want to upload a codebase, it will be much cheaper to use Claude's web UI, which does not charge me to process the whole codebase on each request.