Open go-run-jump opened 3 weeks ago
v1.5 now has sliding-window context management that removes messages from index 2 to ~n/2. This keeps the most important message (the task plus the relevant details given to Claude in the first message) while discarding older messages. Removing several messages at once lets us take advantage of prompt caching; a more dynamic sliding window that removes one message at a time would keep breaking the cache.
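The idea above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual v1.5 code; the function name `trim_context` and the `max_messages` threshold are my own assumptions:

```python
def trim_context(messages, max_messages=40):
    """Drop the middle of the conversation in one large chunk.

    Keeps messages[0:2] (the task message and its reply) plus the most
    recent half of the history. Removing indices 2 .. n//2 in one shot
    invalidates the prompt-cache prefix once, instead of on every turn
    as a one-message-at-a-time sliding window would.
    NOTE: threshold and split point are illustrative assumptions.
    """
    n = len(messages)
    if n <= max_messages:
        return messages
    return messages[:2] + messages[n // 2:]
```

With this shape, the cached prefix (`messages[:2]` plus the surviving tail) stays byte-identical across many subsequent turns until the next trim.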
Keeping this issue open for the other suggestions though, I'll update here as I get to them.
I understand you can set up to four separate cache breakpoints. There is probably some optimum way to do this.
@CiberNin I use 1 breakpoint for the system prompt and 2 for the latest messages, so we have 1 left to use however we want. If you have suggestions for how to use it for context management, I'd love to hear them. I'll think on this and report back if I update it.
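For reference, the breakpoint layout described above looks roughly like this when building an Anthropic Messages API payload. This is a hedged sketch, not the project's actual code; the model name and the `build_request` helper are illustrative assumptions:

```python
def build_request(system_prompt, messages):
    """Build a Messages API payload with 3 of the 4 allowed cache breakpoints.

    Breakpoint 1: end of the system prompt (stable prefix).
    Breakpoints 2-3: the two most recent user turns, so the cached prefix
    advances as the conversation grows. The 4th breakpoint is left unused.
    """
    payload = {
        "model": "claude-3-5-sonnet-20241022",  # illustrative model choice
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [dict(m) for m in messages],
    }
    # Mark the last two user turns as cache breakpoints.
    user_idx = [i for i, m in enumerate(payload["messages"]) if m["role"] == "user"]
    for i in user_idx[-2:]:
        content = payload["messages"][i]["content"]
        if isinstance(content, str):  # normalize to a content-block list
            content = [{"type": "text", "text": content}]
        content[-1]["cache_control"] = {"type": "ephemeral"}
        payload["messages"][i]["content"] = content
    return payload
```

Because the breakpoints ride on the newest user turns, each request reuses the cache written by the previous one up to the second-to-last breakpoint.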
Background
When implementing a feature, I've noticed that as I get deeper into the task, the LLM starts to struggle. Each iteration doesn't necessarily move the problem forward, as the growing context seems to impair its problem-solving abilities.
Observations
Proposed Solutions
1. Context Management
Implement a system to manage the context of tasks. This could include:
2. Message Editing
Add the ability to edit previously sent messages that didn't result in a satisfactory response from the LLM. This could significantly improve efficiency in using the tool.
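One minimal way to implement the editing suggested above is to replace the offending turn and truncate everything after it, so the next request resends the corrected message for a fresh response. A sketch under that assumption; `edit_and_truncate` is a hypothetical helper, not an existing API:

```python
def edit_and_truncate(messages, index, new_content):
    """Replace the message at `index` and drop all later turns.

    Later turns are discarded because they were responses to the old,
    unsatisfactory wording; the edited turn becomes the new tail.
    """
    if not (0 <= index < len(messages)):
        raise IndexError(f"no message at index {index}")
    return messages[:index] + [{**messages[index], "content": new_content}]
```

Truncating at the edit point also plays well with prompt caching: everything before `index` is an unchanged prefix and stays cached.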
3. Context Addition Optimization
Investigate and potentially optimize how information is added to the context. I haven't investigated the exact logic behind this process, but there may be room for improvement.
Benefits
Next Steps
Related Issues
#106 (regarding the 200,000 token limit)