paul-gauthier / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

Explore cursor.sh's 'fast apply' techniques ("fully rewriting the full file outperforms aider-like diffs for files under 400 lines") #625

Closed: 0xdevalias closed this issue 2 months ago

0xdevalias commented 3 months ago

Issue

There was a recent thread/blog post about cursor.sh's 'fast apply' changes:

In the thread, they made this comment comparing aider-style diffs with speculative edits:

Version and model info

N/A

paul-gauthier commented 3 months ago

They are doing interesting work with this, yes.

I replied in their twitter thread:

Super interesting work! Sounds like you use a strong model (opus/gpt-4o) to generate code changes and a weak model to "apply" them to the file? I've played with this for a ~year, but had concerns:

  1. Adding a 2nd inference step adds latency. Your work helps here!
  2. Can only apply edits to files that fit in the weak model's context & output token limits.
  3. Do you reliably get working code at the end? Success now depends on 2 LLMs not goofing up.

(2) is the biggest concern, since you can't edit large files. Did you evaluate (3)?

Figuring out (2) is a blocker for adopting something like this in aider, where editing large files is a huge benefit. And I'd want to evaluate (3) similar to the way all of aider's editing backends get benchmarked.
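For concreteness, the two-stage flow under discussion looks roughly like this. This is a minimal sketch, not cursor.sh's or aider's actual implementation; the prompts and the `strong_llm`/`weak_llm` callables are all placeholders:

```python
# Minimal sketch of the two-stage "fast apply" flow under discussion.
# strong_llm / weak_llm are placeholder callables (prompt -> completion);
# the prompts here are illustrative, not anyone's real implementation.

def plan_edit(strong_llm, file_text: str, request: str) -> str:
    """Step 1: a strong model proposes the change as an ordinary chat response."""
    return strong_llm(
        f"File:\n{file_text}\n\n"
        f"Request: {request}\n"
        "Describe the edit, quoting only the code that should change."
    )

def apply_edit(weak_llm, file_text: str, edit: str) -> str:
    """Step 2: a fast, weak model rewrites the ENTIRE file with the edit merged in.

    Concern (2) above bites here: the whole file has to fit inside the weak
    model's context window *and* its output token limit.
    """
    return weak_llm(
        f"Original file:\n{file_text}\n\n"
        f"Proposed edit:\n{edit}\n\n"
        "Output the complete updated file and nothing else."
    )

def fast_apply(strong_llm, weak_llm, file_text: str, request: str) -> str:
    edit = plan_edit(strong_llm, file_text, request)
    return apply_edit(weak_llm, file_text, edit)
```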

Beyond that is the need to actually fine-tune and host such a model somewhere and make it available to aider users. Since aider is an open source tool, that carries a lot of operational overhead and costs that would need to get figured out and weighed against the benefits.

0xdevalias commented 3 months ago

Their reply on that thread (for context):

(2) is definitely a concern and we're working on solving this with long context extensions.

(3) Our evals certainly could use work bc it depends on an LLM grader, rather than running the code/tests (like in aider's benchmarks). We have a few ideas for improving things here

But we’ve found that letting the models output code in the format they know best (a standard chat response) works very well for planning single-file edits. Better than having the model directly make the change to the entire full file (and faster!). Then it’s just a question of making the apply model’s accuracy close to 100%, which isn’t terrible bc it is such a simple task

--

Figuring out (2) is a blocker for adopting something like this in aider, where editing large files is a huge benefit. And I'd want to evaluate (3) similar to the way all of aider's editing backends get benchmarked.

@paul-gauthier nods, yeah, that makes sense.

Beyond that is the need to actually fine-tune and host such a model somewhere and make it available to aider users. Since aider is an open source tool, that carries a lot of operational overhead and costs that would need to get figured out and weighed against the benefits.

@paul-gauthier nods, yeah, true. I think I originally didn't realise that it required a finetuned model/etc., as that seemed to just be for applying the diffs (which aider already handles in its own way), and thought that maybe the bulk of the benefits could be realised just through the 'speculative decoding' aspect. But to be fair, my knowledge in that space is super limited at best. A few resources I found on it:

paul-gauthier commented 2 months ago

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.

0xdevalias commented 2 months ago

and thought that maybe the bulk of the benefits could be realised just through the 'speculative decoding' aspect;

@paul-gauthier Just to confirm: in closing this, is it because you don't believe there are benefits to be gained from the speculative decoding aspect (regardless of finetunes), or because it doesn't seem a good fit with how aider is currently set up to work, etc.?

paul-gauthier commented 2 months ago

I believe speculative decoding would only be used by the actual code that is directly doing model inference. Aider calls out to other systems for inference, even with local models.
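For anyone following along, the reason it lives inside the inference engine is that the verification step consumes the model's raw next-token distributions, which hosted chat-completion APIs don't expose. Here is a greedy-variant sketch; `target_model.forward` is a placeholder, not a real API, and real speculative sampling uses a probabilistic accept/reject test rather than an argmax match:

```python
import numpy as np

def speculative_step(target_model, prefix: list[int], draft_tokens: list[int]) -> list[int]:
    """Verify cheaply-proposed draft tokens in ONE forward pass of the target model.

    For a 'fast apply' style edit, the draft can simply be tokens copied from
    the original file, so unchanged regions get accepted many tokens at a time.
    target_model.forward is a placeholder returning next-token logits per position.
    """
    # Score every draft position at once: logits[i] predicts token i+1.
    logits = target_model.forward(prefix + draft_tokens)

    accepted = []
    for i, tok in enumerate(draft_tokens):
        # The prediction for draft_tokens[i] lives at position len(prefix) + i - 1.
        pred = int(np.argmax(logits[len(prefix) + i - 1]))
        if pred == tok:
            accepted.append(tok)   # draft matches: accepted at no extra cost
        else:
            accepted.append(pred)  # first mismatch: take the model's token and stop
            break
    return accepted
```

None of this is possible over an OpenAI-style chat endpoint, which only returns finished text, so a tool like aider that delegates inference to other systems has no place to hook this in.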

0xdevalias commented 2 months ago

@paul-gauthier Yeah ok, that definitely makes sense. Thanks for clarifying :)