Aider-AI / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

Can't handle rst #1803

Open boosh opened 6 days ago

boosh commented 6 days ago

Issue

The equals signs in .rst files mess up aider's search/replace tokens, so it's unable to apply edits, e.g.:

<<<<<<< SEARCH
=======
=======
New heading
=======
This is some new text
>>>>>>> REPLACE

Version and model info

Aider: v0.57.1

akaihola commented 5 days ago

Also, this seems to happen even if the number of equals signs in the heading underline doesn't match the length of aider's search/replace separator.

akaihola commented 5 days ago

@boosh, I made it simpler to change the divider strings in #1817. I guess the simplest fix for the problem with reStructuredText would be to extend the middle divider from just ======= to e.g. ======= <<SEARCH >>REPLACE, but I don't know how much it would increase the frequency of badly formatted code edit responses from LLMs.

I'm also wondering whether Anthropic's advice to "Use XML tags to structure your prompts" would also apply to requested response formats. Would Claude give higher quality responses when asked to use pseudo XML tags, similar to its supposedly better "understanding" of prompts structured with them?

akaihola commented 5 days ago

By the way, I believe the change in 7fa1620f58132ec085a7939a8015bbe7935827a2 ("feat: Allow flexible matching of 5-9 characters in SEARCH/REPLACE block prefixes" on Sep 20) made reStructuredText handling worse:

-HEAD = "<<<<<<< SEARCH"
-DIVIDER = "======="
-UPDATED = ">>>>>>> REPLACE"
+HEAD = r"<{5,9} SEARCH"
+DIVIDER = r"={5,9}"
+UPDATED = r">{5,9} REPLACE"
+    head_pattern = re.compile(HEAD)
+    divider_pattern = re.compile(DIVIDER)
+    updated_pattern = re.compile(UPDATED)

It seems that earlier the divider line had to match exactly, so there had to be no fewer and no more than seven = signs. But now with the regex, any line beginning with at least five = signs is treated as a middle divider.
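The difference can be reproduced in isolation. This is a minimal sketch, assuming the parser tests each stripped line with `re.match` (the call site is not shown in the diff above, so that part is an assumption); `is_divider_old` and `is_divider_new` are hypothetical helper names, not aider functions.

```python
import re

OLD_DIVIDER = "======="   # literal divider before the commit
NEW_DIVIDER = r"={5,9}"   # regex divider after the commit


def is_divider_old(line):
    # Assumed old behaviour: exact comparison with the literal divider.
    return line.strip() == OLD_DIVIDER


def is_divider_new(line):
    # Assumed new behaviour: re.match anchors only at the start of the line,
    # so anything beginning with five or more "=" is accepted.
    return re.match(NEW_DIVIDER, line.strip()) is not None


# Compare a 5-, 7- and 12-character reStructuredText underline.
for underline in ["=====", "=======", "============"]:
    print(len(underline), is_divider_old(underline), is_divider_new(underline))
```

Under these assumptions only the seven-character line counts as a divider in the old check, while the new check accepts all three.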

akaihola commented 5 days ago

The smallest change to minimize the reStructuredText problem would probably be this:

diff --git a/aider/coders/editblock_coder.py b/aider/coders/editblock_coder.py
index 118759e9..9305d405 100644
--- a/aider/coders/editblock_coder.py
+++ b/aider/coders/editblock_coder.py
@@ -412,7 +412,7 @@ def find_original_update_blocks(content, fence=DEFAULT_FENCE, valid_fnames=None)
     current_filename = None

     head_pattern = re.compile(HEAD)
-    divider_pattern = re.compile(DIVIDER)
+    divider_pattern = re.compile(fr"{DIVIDER}[ ]*$")
     updated_pattern = re.compile(UPDATED)

     while i < len(lines):

This fixes the issue for reStructuredText heading underlines with more than 9 = symbols. But an underline of 5 to 9 = symbols will still be mistaken for a divider, for example under this heading:

What?
=====

The code change above is quite similar to what the splitting pattern already does:

separators = "|".join([HEAD, DIVIDER, UPDATED])

split_re = re.compile(r"^((?:" + separators + r")[ ]*\n)", re.MULTILINE | re.DOTALL)
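The effect of the anchored divider pattern from the diff above can be checked on its own. This is a sketch under the assumption that the pattern is applied to each line with `re.match`; `is_divider` is a hypothetical helper used only for illustration.

```python
import re

DIVIDER = r"={5,9}"

# Anchored variant from the proposed diff: 5 to 9 "=" characters,
# optional trailing spaces, then end of line.
anchored_divider = re.compile(fr"{DIVIDER}[ ]*$")


def is_divider(line):
    # re.match anchors at the start and "$" at the end,
    # so the whole line must consist of the divider.
    return anchored_divider.match(line) is not None


print(is_divider("======="))        # standard seven-character divider
print(is_divider("============"))   # 12-character underline, now rejected
print(is_divider("====="))          # underline of "What?", still accepted
```

The last call shows the remaining gap: an underline of 5 to 9 characters is indistinguishable from a divider as long as the divider consists only of = signs.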

akaihola commented 5 days ago

Building on top of #1817, I experimented with pseudo XML dividers, and at least for my first manual test it worked beautifully:

Divider configuration in editblock_coder.py:

    edit_block_config = {
        "search_divider": "<AIDER-SEARCH>",
        "middle_divider": "</AIDER-SEARCH> <AIDER-REPLACE>",
        "replace_divider": "</AIDER-REPLACE>"
    }

Aider session:

> /add README.rst

Added /home/akaihola/prg/graylint/README.rst to the chat
────────────────────────────────────────────────────────────────────────────────
README.rst
> Under the "What?" heading, add a funny intro.

Certainly! I'll add a funny intro under the "What?" heading in the README.rst file. Here's the change:

README.rst

 <AIDER-SEARCH>
 What?
 =====

 This utility runs linters on Python source code files.
 </AIDER-SEARCH> <AIDER-REPLACE>
 What?
 =====

 Imagine a world where your code is so clean, even your rubber duck approves. Welcome to Graylint, the lint detector that's more thorough than your grandmother inspecting your room!

 This utility runs linters on Python source code files.
 </AIDER-REPLACE>

This change adds a humorous introduction to the "What?" section, setting a lighthearted tone for the utility's description.

Tokens: 23k sent, 153 received. Cost: $0.07 message, $0.07 session.
Applied edit to README.rst

Resulting Graylint README.rst excerpt:

What?
=====

Imagine a world where your code is so clean, even your rubber duck approves. Welcome to Graylint, the lint detector that's more thorough than your grandmother inspecting your room!

This utility runs linters on Python source code files.
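Why the pseudo XML dividers avoid the collision can be illustrated with a much simplified parser. This is only a sketch, not aider's actual parsing code: `parse_edit_blocks` is a hypothetical function, and it assumes each divider sits alone on its own line.

```python
# Divider strings taken from the experiment above.
SEARCH_OPEN = "<AIDER-SEARCH>"
MIDDLE = "</AIDER-SEARCH> <AIDER-REPLACE>"
REPLACE_CLOSE = "</AIDER-REPLACE>"


def parse_edit_blocks(text):
    """Return a list of (search, replace) pairs found in an LLM reply.

    Hypothetical helper for illustration only.
    """
    blocks = []
    search, replace, state = [], [], None
    for line in text.splitlines():
        stripped = line.strip()
        if state is None and stripped == SEARCH_OPEN:
            search, replace, state = [], [], "search"
        elif state == "search" and stripped == MIDDLE:
            state = "replace"
        elif state == "replace" and stripped == REPLACE_CLOSE:
            blocks.append(("\n".join(search), "\n".join(replace)))
            state = None
        elif state == "search":
            search.append(line)
        elif state == "replace":
            replace.append(line)
    return blocks


reply = """README.rst
<AIDER-SEARCH>
What?
=====
</AIDER-SEARCH> <AIDER-REPLACE>
What?
=====

A funny intro.
</AIDER-REPLACE>
"""

for search, replace in parse_edit_blocks(reply):
    print(repr(search))
    print(repr(replace))
```

Because none of the dividers consists solely of = signs, the `=====` underline stays part of the search and replace text instead of ending the search section early.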