paul-gauthier / aider

aider is AI pair programming in your terminal
https://aider.chat/

Feature Request: Review Mode with Checklist for Aider #723

Open Yona544 opened 6 days ago

Yona544 commented 6 days ago

Context: When I ask Aider to review my code, it often identifies issues and suggests corrections. While some of these corrections are accurate, others are based on design choices that do not need to be changed. Currently, Aider goes ahead and implements all suggested changes and commits them, which sometimes results in unnecessary modifications that I need to undo, leading to diminishing returns.

Request: I propose adding a "Review Mode" switch in the Aider browser interface, with an accompanying checklist feature. This mode would allow users to control how Aider behaves when reviewing code. In Review Mode, Aider would:

  1. Identify and list potential issues and suggestions.
  2. Provide a clear distinction between critical issues and those that might be intentional design choices.
  3. Present a checklist of identified issues, allowing users to selectively check off the items they want to be fixed (a rough data-model sketch follows this list).
  4. Allow users to selectively apply changes based on the review feedback.
  5. Provide an option for users to add new issues to the checklist during the review process.
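
To make the critical-vs-design-choice distinction and the checklist concrete, here is a minimal sketch of how checklist items could be modeled. The `Severity` and `ReviewItem` names are hypothetical illustrations, not existing aider code:

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"        # likely a real bug; should probably be fixed
    SUGGESTION = "suggestion"    # may be an intentional design choice


@dataclass
class ReviewItem:
    description: str             # what was flagged and why
    severity: Severity
    file: str                    # where the issue was found
    selected: bool = False       # checked off by the user for fixing
    user_added: bool = False     # added manually during the review
```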

Workflow:

  1. Review Identification: Aider scans the code and lists potential issues and suggestions.
  2. User Interaction: Users review the checklist, check off the items they want to be addressed, and have the option to add new issues (see the interaction sketch after this list).
  3. Selective Fixes: Aider only applies fixes to the checked items, respecting the user's design choices.
  4. Iterative Review: Users can choose to re-evaluate the checklist after initial fixes to address more items if necessary.
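
Continuing the `ReviewItem` sketch above, the interaction loop might look roughly like this; `apply_fixes` is a stub standing in for the actual LLM edit step:

```python
def apply_fixes(selected: list[ReviewItem]) -> None:
    # Stub: a real integration would build an edit prompt for the LLM here.
    for item in selected:
        print(f"Would ask the LLM to fix: {item.description}")


def run_review(items: list[ReviewItem]) -> None:
    """Show the checklist, let the user toggle/add items, then fix only what's checked."""
    while True:
        for i, item in enumerate(items, 1):
            mark = "x" if item.selected else " "
            print(f"[{mark}] {i}. ({item.severity.value}) {item.file}: {item.description}")
        choice = input("Toggle item #, (a)dd issue, (f)ix checked, (q)uit: ").strip().lower()
        if choice == "q":
            return
        if choice == "a":
            desc = input("Describe the new issue: ")
            items.append(ReviewItem(desc, Severity.SUGGESTION, file="?", user_added=True))
        elif choice == "f":
            apply_fixes([it for it in items if it.selected])
        elif choice.isdigit() and 1 <= int(choice) <= len(items):
            idx = int(choice) - 1
            items[idx].selected = not items[idx].selected
```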

Benefits:

Implementing this feature would streamline the code review process, ensuring that changes made by Aider align more closely with the user's intentions and providing a more interactive and user-friendly experience.


Aider v0.40.0
Models: claude-3-5-sonnet-20240620 with diff edit format, weak model claude-3-haiku-20240307
Git repo: \\192.168.0.35\YonaVmDataShare\Projects\posexport.git with 28 files
Repo-map: using 1024 tokens
Restored previous conversation history.

unphased commented 5 days ago

I agree completely with this concept. A version of it already exists: aider prompts us before doing things like adding files that the LLM says it wants access to, or letting the LLM create new files. I have more applications of this concept to add to the discussion as well.

One big bottleneck that high-level LLM tools like aider address very well is repetitively copy-pasting code. We still have to repeat our code in the prompts a significant amount, because that is what keeps the LLM's understanding current with the changes we, as the primary programmer, are making. These updates can sometimes be compressed into descriptions or diffs, but, much as with math, an LLM is poor at tracking mutated state.

As we blow past this bottleneck, the tradeoff is API token consumption, which costs scale with directly. In the caveman copy-paste workflow we manipulate the text by hand and develop a deeply intuitive sense of token consumption. Granted, chatbot pricing isn't per-token, but we need to manage token consumption anyway to keep important information in context, since results drop off a cliff when we run afoul of the context limit.

What I'm getting at is that with an open-source tool like aider we have a unique opportunity to provide excellent UX that no AI vendor will ever offer: tooling designed to help the user control resource consumption.

So I would suggest expanding Review Mode's scope beyond an approval process for changes: apply the same approval step to prompts whose token consumption exceeds a certain threshold (estimated heuristically or otherwise), as one way to keep token-related costs in check.
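
As one illustration, such a gate could estimate the token count before sending and ask for approval above a threshold. This is a hypothetical sketch, not aider behavior; note that tiktoken implements OpenAI tokenizers, so counts for Claude models are only rough estimates:

```python
import tiktoken

TOKEN_THRESHOLD = 4000  # hypothetical per-request budget


def confirm_send(prompt: str, threshold: int = TOKEN_THRESHOLD) -> bool:
    """Ask for approval when the estimated token count exceeds the threshold."""
    enc = tiktoken.get_encoding("cl100k_base")  # OpenAI tokenizer; approximate for Claude
    n_tokens = len(enc.encode(prompt))
    if n_tokens <= threshold:
        return True
    reply = input(f"This request is ~{n_tokens} tokens (threshold {threshold}). Send anyway? [y/N] ")
    return reply.strip().lower() == "y"
```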

A lite version of this could be some sort of "more raw" debug log we can access to view all of the raw tokens sent to and received from the LLM. Even without a Review Mode approval interaction, a raw communication log would at least let us see how the tool's automation works, so we can learn to control it better.
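
The lite version really could be as simple as appending each raw exchange to a JSONL file. A hypothetical sketch (the path and function name are made up, not aider internals):

```python
import json
import time

LOG_PATH = "llm_raw_log.jsonl"  # hypothetical log location


def log_exchange(request_messages: list[dict], response_text: str) -> None:
    """Append one raw request/response pair as a JSON line for later inspection."""
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps({
            "ts": time.time(),
            "request": request_messages,  # exact messages sent to the LLM
            "response": response_text,    # raw completion received back
        }) + "\n")
```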

One thing I'm not clear on is whether aider might automatically respond with increased contextual detail if it somehow becomes obvious that not enough context was provided initially. If that happens, it could lead to greatly increased or even runaway token consumption.

Edit: OK, I just came across https://github.com/paul-gauthier/aider/issues/127#issuecomment-1641587378, which led me to discover the --verbose flag. It is very useful in this regard, doing exactly what my "lite version" concept above describes: letting me see exactly what's being sent and received so I can gauge whether I'm using tokens efficiently. The /tokens command is also super useful for predicting the cost of the next instruction we send.

I'm REALLY impressed with this tool right now. I might suggest two minor quality-of-life things related to verbose:

paul-gauthier commented 3 days ago

You might find these docs helpful:

https://aider.chat/docs/config/options.html#--llm-history-file-llm_history_file

If you run aider with --no-stream it will output costs after each message.
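
For example, combining both (the history filename here is arbitrary):

```
aider --no-stream --llm-history-file llm_history.txt
```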

unphased commented 3 days ago

Is there a way to get it to display the cost even when streaming?

paul-gauthier commented 16 hours ago

Unfortunately, no. The streaming API doesn't return cost info.