Open go-run-jump opened 3 weeks ago
v1.5 now has sliding-window context management that removes messages from index 2 to ~n/2. This keeps the most important message (the task plus the relevant details given to Claude in the first message) while discarding older messages. Removing several messages at once lets us take advantage of prompt caching; a more dynamic sliding window that removes one message at a time would keep breaking the cache.
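The idea above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual v1.5 code; the function name `trim_context` and the `max_messages` threshold are my own assumptions:

```python
def trim_context(messages, max_messages=40):
    """Drop the middle of the conversation in one large chunk.

    Keeps messages[0:2] (the task message and its reply) plus the most
    recent half of the history. Removing indices 2 .. n//2 in one shot
    invalidates the prompt-cache prefix once, instead of on every turn
    as a one-message-at-a-time sliding window would.
    NOTE: threshold and split point are illustrative assumptions.
    """
    n = len(messages)
    if n <= max_messages:
        return messages
    return messages[:2] + messages[n // 2:]
```

With this shape, the cached prefix (`messages[:2]` plus the surviving tail) stays byte-identical across many subsequent turns until the next trim.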
Keeping this issue open for the other suggestions though, I'll update here as I get to them.
I understand you can set up to four separate cache breakpoints. There is probably some optimum way to do this.
@CiberNin I use 1 breakpoint for the system prompt and 2 for the latest messages, so we have 1 left to use however we want. If you have suggestions for how to use it for context management, I'd love to hear them. I'll think on this and report back if I update it.
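For reference, the breakpoint layout described above looks roughly like this when building an Anthropic Messages API payload. This is a hedged sketch, not the project's actual code; the model name and the `build_request` helper are illustrative assumptions:

```python
def build_request(system_prompt, messages):
    """Build a Messages API payload with 3 of the 4 allowed cache breakpoints.

    Breakpoint 1: end of the system prompt (stable prefix).
    Breakpoints 2-3: the two most recent user turns, so the cached prefix
    advances as the conversation grows. The 4th breakpoint is left unused.
    """
    payload = {
        "model": "claude-3-5-sonnet-20241022",  # illustrative model choice
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [dict(m) for m in messages],
    }
    # Mark the last two user turns as cache breakpoints.
    user_idx = [i for i, m in enumerate(payload["messages"]) if m["role"] == "user"]
    for i in user_idx[-2:]:
        content = payload["messages"][i]["content"]
        if isinstance(content, str):  # normalize to a content-block list
            content = [{"type": "text", "text": content}]
        content[-1]["cache_control"] = {"type": "ephemeral"}
        payload["messages"][i]["content"] = content
    return payload
```

Because the breakpoints ride on the newest user turns, each request reuses the cache written by the previous one up to the second-to-last breakpoint.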
Background
When implementing a feature, I've noticed that as I get deeper into the task, the LLM starts to struggle. Each iteration doesn't necessarily move the problem forward, as the growing context seems to impair its problem-solving abilities.
Observations
Proposed Solutions
1. Context Management
Implement a system to manage the context of tasks. This could include:
2. Message Editing
Add the ability to edit previously sent messages that didn't result in a satisfactory response from the LLM. This could significantly improve efficiency in using the tool.
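One minimal way to implement the editing suggested above is to replace the offending turn and truncate everything after it, so the next request resends the corrected message for a fresh response. A sketch under that assumption; `edit_and_truncate` is a hypothetical helper, not an existing API:

```python
def edit_and_truncate(messages, index, new_content):
    """Replace the message at `index` and drop all later turns.

    Later turns are discarded because they were responses to the old,
    unsatisfactory wording; the edited turn becomes the new tail.
    """
    if not (0 <= index < len(messages)):
        raise IndexError(f"no message at index {index}")
    return messages[:index] + [{**messages[index], "content": new_content}]
```

Truncating at the edit point also plays well with prompt caching: everything before `index` is an unchanged prefix and stays cached.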
3. Context Addition Optimization
Investigate and potentially optimize how information is added to the context. I haven't investigated the exact logic behind this process, but there may be room for improvement.
Benefits
Next Steps
Related Issues
#106 (regarding the 200,000 token limit)