Open JoziGila opened 3 weeks ago
Would love to see this merged in ASAP
Thanks for the PR! I've tried a very similar solution before but did not get great results on complicated tasks, i.e. lots of syntax errors, or Claude would just not make the right changes. I also anticipate Anthropic's upcoming fast edit mode will help against lazy coding and give faster results. Until then I'll test this diff approach again and see if I can get it into a mostly good enough state, although I'll probably have users manually opt in to using it.
This is a nice PR since it can work with large files. I just tested it on my project: previously Claude Dev couldn't handle my GUI file, which is over 500 lines, and that resulted in multiple retry error messages. With this PR it can work on that file.
But as saoudrizwan said, it makes lots of syntax errors, which need some fixing.
@AlexanderHel What type of syntax errors have you been seeing? I use this approach with GPT4o and find it's excellent for reducing output size and the code quality seems fine. Still experimenting with Claude Sonnet.
Adding "(diff -u)" was a good idea. I got Claude to consistently output valid diffs this way (I only had a ~70% success rate without it), but I'm still running into some problems I've run into before with structured outputs. It can't do basic tasks. Cursor talks about this in a blog post, but basically it's trained on far fewer patch diffs than entire code files, so it just sucks at writing code through diffs. I couldn't even get it to make a snake game on the first try... I then tried giving it the option of using full contents OR diff format, and it almost always chose full contents no matter how small the changes were.
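One way to put a number on "valid diffs" is to check that every hunk header in the model's output agrees with the lines that follow it. The snippet below is only a minimal sketch of such a check, not what this PR or Claude Dev actually does; the function name `is_valid_unified_diff` is hypothetical, and the check is deliberately lenient (it ignores anything outside a hunk, such as `---`/`+++` headers or stray prose).

```python
import re

# Matches a unified-diff hunk header such as "@@ -12,7 +12,8 @@".
# A missing count (e.g. "@@ -3 +3 @@") means a count of 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def is_valid_unified_diff(diff_text: str) -> bool:
    """Return True if there is at least one hunk and every hunk's body
    contains exactly the number of old/new lines its header announces."""
    lines = diff_text.splitlines()
    i = 0
    hunks = 0
    while i < len(lines):
        m = HUNK_RE.match(lines[i])
        if not m:
            i += 1  # lenient: skip file headers and text outside hunks
            continue
        old_left = int(m.group(2)) if m.group(2) is not None else 1
        new_left = int(m.group(4)) if m.group(4) is not None else 1
        i += 1
        while i < len(lines) and (old_left > 0 or new_left > 0):
            line = lines[i]
            tag = line[:1]
            if tag == " " or line == "":
                # Context line (an empty line is treated as blank context).
                old_left -= 1
                new_left -= 1
            elif tag == "-":
                old_left -= 1
            elif tag == "+":
                new_left -= 1
            elif tag == "\\":
                pass  # "\ No newline at end of file" marker
            else:
                return False  # unexpected line inside a hunk
            i += 1
        if old_left != 0 or new_left != 0:
            return False  # body is shorter or longer than the header claims
        hunks += 1
    return hunks > 0
```

A check like this only says the diff is well-formed; it says nothing about whether the change itself is correct, which is the harder problem described above.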
I suspect adding a diff format is gonna lead to pretty bad results for a lot of people who rely on Claude Dev, so I'm hesitant to add it until I at least get the chance to try Anthropic's fast edit mode.
An interesting problem for sure, I'm still thinking about how to improve this. https://aider.chat/docs/leaderboards/#notes-on-benchmarking-results
IMO the key is to prompt with sufficiently neutral language so that Claude is not biased toward either diffs or full-file replacements, giving it options rather than restrictions.
Claude now responds with only the diff when editing a file and applies it, reducing token usage a lot. writeToFile is still there, but it's preferred when creating new files, while applyDiff is preferred when editing.
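To make the split between the two tools concrete, here is a rough sketch of how a diff could be applied to existing content, with full contents used only for new files. This is an illustration under assumptions, not the PR's actual code: `apply_unified_diff` and `edit_or_create` are hypothetical names, and the sketch handles a single-file unified diff and requires context lines to match exactly.

```python
import re
from typing import Optional

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(original: str, diff_text: str) -> str:
    """Apply a single-file unified diff; raise ValueError on any mismatch."""
    src = original.splitlines()
    out = []
    pos = 0  # index of the next line of src not yet copied or consumed
    diff_lines = diff_text.splitlines()
    i = 0
    while i < len(diff_lines):
        m = HUNK_RE.match(diff_lines[i])
        if not m:
            i += 1  # skip "---"/"+++" headers before the first hunk
            continue
        start = int(m.group(1))
        old_count = int(m.group(2)) if m.group(2) is not None else 1
        # Headers are 1-based; with an old count of 0 the number names the
        # line *after which* the new lines are inserted.
        hunk_start = start - 1 if old_count > 0 else start
        if hunk_start < pos or hunk_start > len(src):
            raise ValueError("hunk out of order or out of range")
        out.extend(src[pos:hunk_start])  # copy untouched lines before hunk
        pos = hunk_start
        i += 1
        while i < len(diff_lines) and not HUNK_RE.match(diff_lines[i]):
            tag, text = diff_lines[i][:1], diff_lines[i][1:]
            if tag in (" ", "-"):
                if pos >= len(src) or src[pos] != text:
                    raise ValueError(f"diff does not match file at line {pos + 1}")
                if tag == " ":
                    out.append(text)  # context line is kept
                pos += 1              # removed line is dropped
            elif tag == "+":
                out.append(text)
            elif tag != "\\":         # allow "\ No newline at end of file"
                raise ValueError("unexpected line inside hunk")
            i += 1
    out.extend(src[pos:])  # copy the rest of the file
    return "\n".join(out) + ("\n" if original.endswith("\n") else "")


def edit_or_create(existing: Optional[str], payload: str) -> str:
    """New file: payload is the full contents. Existing file: payload is a diff."""
    if existing is None:
        return payload
    return apply_unified_diff(existing, payload)
```

Raising on a mismatch rather than guessing gives the caller a clear point at which to retry or fall back to a full-file write, which matters given the failure rates discussed in this thread.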