Closed — carlos-dc closed this issue 9 months ago
@carlos-dc what LLM are you using? Just curious
I am using gpt-4-1106-preview
@carlos-dc thanks. Personally, I've noticed a drop in quality with the turbo preview, in addition to increased response time.
Thanks for trying aider and filing this issue. No editing format is going to work 100% reliably with any of the GPT models. I have been using a pair of extensive benchmarking suites to try and make informed, quantitative decisions when implementing and improving the editing formats. But even so, sometimes the LLM will mess up.
So please keep me posted on problems you are seeing. They might hint at possible ways to improve the editing format.
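For anyone curious how the SEARCH/REPLACE style of edit works conceptually: the model emits the exact original lines and the replacement lines, and applying the edit is essentially an exact-substring substitution that is rejected if the original text can't be found verbatim. A minimal sketch (a hypothetical helper for illustration, not aider's actual implementation):

```python
def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one SEARCH/REPLACE-style edit: the SEARCH text must appear
    verbatim in the file contents, otherwise the edit is rejected."""
    if search not in source:
        raise ValueError("SEARCH block not found in file; edit rejected")
    # Replace only the first occurrence, like a targeted patch.
    return source.replace(search, replace, 1)

original = "def greet():\n    print('hi')\n"
patched = apply_search_replace(original, "print('hi')", "print('hello')")
print(patched)
```

The strict exact-match requirement is what makes this format easy to validate: if the LLM hallucinates even one character of the original text, the edit fails loudly instead of corrupting the file.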
That's a great approach @paul-gauthier, appreciate the effort and that's probably the best one can do
Out of curiosity, how has the new turbo preview model compared to gpt-4 in your benchmarks?
Asking because I was using gpt-4-1106-preview when, all of a sudden, it got ~2x slower and followed instructions poorly enough that I had to switch back to the regular gpt-4.
On benchmarks gpt-4-1106-preview seems to do better, but my sense is that gpt-4-0613 might actually be more capable at complex coding.
Very interesting, thanks
@paul-gauthier:
(First, aider is awesome -- kudos to you for developing it! And thanks for releasing it! :-)
Second: is there a good way for us to provide data for you about this reliability problem?
For 0.18, using 4-turbo and the SEARCH/REPLACE edit format, aider worked great (Django, standalone Python, Laravel, HTML/CSS, HTMX, vanilla JS).
Since upgrading to 0.19, and now 0.20, using the unified diff edit format, I have had zero successful edits. I've tried on multiple Django projects and a Laravel project.
I'm going to downgrade to 0.18 for now, but if there is any sort of useful information or data that we can provide to you, please let me know. I'd love to help improve the app!
Thanks again!
@jimcraner thanks for the info on the problems you are having.
You can try the latest version of aider, v0.21.0, which has some improvements to the unified diff editing format. Alternatively, you can always run aider with `--model gpt-4-1106-preview --edit-format diff` to use the old SEARCH/REPLACE edit format with GPT-4 Turbo.
I would love any concrete examples you have of editing failures. To be most useful, I need your `.aider.chat.history.md` file.
I'm going to close this issue for now, but feel free to add a comment here and I will re-open it, or file a new issue, any time.
Hello team,
I just recently updated to aider v0.19.1 and I see that it no longer uses SEARCH/REPLACE blocks and instead implements something more closely resembling git diffs.
What I have noticed in the existing project I've been using aider on is that it fails to make a valid change basically 100% of the time: it always leaves broken pieces of the old code immediately after the new code it inserts. I have not had a single successful edit since yesterday. Here is an example of what I have been seeing:
This is a part of the diff generated for the above change:
You can see that the diff completely ignored the header lines that were already there, when it should have removed them with "-" lines.
This is happening on every single change that aider makes in my project.
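To make the failure mode concrete, here is a minimal hypothetical hunk (not the actual diff from my project, and the line contents are invented for illustration). The model emits `+` lines for the replacement code but leaves the superseded lines as plain context instead of marking them with `-`, so applying the hunk inserts the new code while the stale originals survive:

```diff
 def get_headers():
+    headers = {"Accept": "application/json"}
+    return headers
     headers = {"Accept": "text/html"}
     return headers
```

The last two lines should have been emitted as `-` removals; because they weren't, the patched file ends up with both the new and the old implementation back to back.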