princeton-nlp / SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
https://www.swebench.com
MIT License

Get error "error: corrupt patch at line 40" when using the gold patch of "django__django-15202" #145

Closed BoxiYu closed 2 days ago

BoxiYu commented 1 week ago

Describe the bug

I tried to apply the gold patch of "django__django-15202" by running git apply test.patch, where test.patch is a file containing the gold patch. The command fails with error: corrupt patch at line 40.

Steps/Code to Reproduce

diff --git a/django/core/validators.py b/django/core/validators.py
--- a/django/core/validators.py
+++ b/django/core/validators.py
@@ -108,15 +108,16 @@ def __call__(self, value):
             raise ValidationError(self.message, code=self.code, params={'value': value})

         # Then check full URL

Expected Results

The patch applies cleanly to the django repository.

Actual Results

error: corrupt patch at line 40

System Information

CodeSpace of swe-agent

BoxiYu commented 1 week ago

The gold patch is here:

diff --git a/django/core/validators.py b/django/core/validators.py
--- a/django/core/validators.py
+++ b/django/core/validators.py
@@ -108,15 +108,16 @@ def __call__(self, value):
             raise ValidationError(self.message, code=self.code, params={'value': value})

         # Then check full URL
+        try:
+            splitted_url = urlsplit(value)
+        except ValueError:
+            raise ValidationError(self.message, code=self.code, params={'value': value})
         try:
             super().__call__(value)
         except ValidationError as e:
             # Trivial case failed. Try for possible IDN domain
             if value:
-                try:
-                    scheme, netloc, path, query, fragment = urlsplit(value)
-                except ValueError:  # for example, "Invalid IPv6 URL"
-                    raise ValidationError(self.message, code=self.code, params={'value': value})
+                scheme, netloc, path, query, fragment = splitted_url
                 try:
                     netloc = punycode(netloc)  # IDN -> ACE
                 except UnicodeError:  # invalid domain part
@@ -127,7 +128,7 @@ def __call__(self, value):
                 raise
         else:
             # Now verify IPv6 in the netloc part
-            host_match = re.search(r'^\[(.+)\](?::\d{1,5})?$', urlsplit(value).netloc)
+            host_match = re.search(r'^\[(.+)\](?::\d{1,5})?$', splitted_url.netloc)
             if host_match:
                 potential_ip = host_match[1]
                 try:
@@ -139,7 +140,7 @@ def __call__(self, value):
         # section 3.1. It's defined to be 255 bytes or less, but this includes
         # one byte for the length of the name and one byte for the trailing dot
         # that's used to indicate absolute names in DNS.
-        if len(urlsplit(value).hostname) > 253:
+        if splitted_url.hostname is None or len(splitted_url.hostname) > 253:
             raise ValidationError(self.message, code=self.code, params={'value': value})

john-b-yang commented 2 days ago

Hi @BoxiYu I could not replicate this bug.

Code to write the test and gold patches to files:

from datasets import load_dataset

swebench = load_dataset('princeton-nlp/SWE-bench', split='test')
swebench = {x['instance_id']: x for x in swebench}

instance = swebench["django__django-15202"]
with open('fix.patch', 'w') as f:
    f.write(instance['patch'])
with open('test.patch', 'w') as f:
    f.write(instance['test_patch'])
print(instance['base_commit'])

I then cloned the repo via !git clone git@github.com:django/django.git

And checked out the base commit git checkout 4fd3044ca0135da903a70dfb66992293f529ecf1

I then cd-ed into the django/ folder and ran git apply test.patch and git apply fix.patch

They both applied successfully:

% git apply ../test.patch
% git apply ../fix.patch
% git status
HEAD detached at 4fd3044ca0
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
    modified:   django/core/validators.py
    modified:   tests/forms_tests/field_tests/test_urlfield.py

no changes added to commit (use "git add" and/or "git commit -a")

Can you try this and see if you still get an error? I'm quite certain it applies correctly, but perhaps I missed something.
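The manual check above can also be scripted. The following is a minimal sketch, assuming git is available on the PATH. `patch_applies` is a hypothetical helper name, not part of SWE-bench. It relies on `git apply --check`, which reports whether a patch would apply without touching the working tree.

```python
import subprocess
from pathlib import Path


def patch_applies(repo_dir: str, patch_path: str) -> bool:
    """Return True if `git apply --check` accepts the patch.

    Hypothetical helper: --check only tests applicability, so the
    working tree in repo_dir is left unmodified.
    """
    result = subprocess.run(
        ["git", "apply", "--check", str(Path(patch_path).resolve())],
        cwd=repo_dir,          # run inside the checked-out repository
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # git writes messages such as "error: corrupt patch at line N" to stderr
        print(result.stderr.strip())
    return result.returncode == 0
```

With the layout from the steps above, this would be called as `patch_applies("django", "test.patch")` and `patch_applies("django", "fix.patch")` after checking out the base commit.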

BoxiYu commented 2 days ago

@john-b-yang I found that I had missed a blank line. The problem is solved. Thank you!
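For anyone who hits the same error: a hunk header such as `@@ -108,15 +108,16 @@` declares how many old and new lines follow, so a blank context line lost while copying leaves the hunk shorter than declared. The sketch below is a rough checker for that mismatch. `check_hunks` is a hypothetical helper and not how git itself parses patches. It assumes a plain unified diff and treats a completely empty line as an empty context line.

```python
import re

# Matches "@@ -old_start[,old_count] +new_start[,new_count] @@"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def check_hunks(patch_text: str) -> list[str]:
    """Report hunks whose body does not match the counts in the header."""
    problems = []
    lines = patch_text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty string after the final newline
    i = 0
    while i < len(lines):
        m = HUNK_RE.match(lines[i])
        if not m:
            i += 1
            continue
        header_line = i + 1                 # 1-based, as in git's messages
        old_left = int(m.group(2) or 1)     # an omitted count means 1
        new_left = int(m.group(4) or 1)
        i += 1
        while i < len(lines) and (old_left > 0 or new_left > 0):
            line = lines[i]
            if line.startswith("\\"):       # "\ No newline at end of file"
                pass
            elif line.startswith("+"):
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith(" ") or line == "":
                old_left -= 1               # context counts on both sides
                new_left -= 1
            else:
                break                       # not a hunk line: hunk ended early
            i += 1
        if old_left != 0 or new_left != 0:
            problems.append(
                f"hunk at line {header_line}: {old_left} old / "
                f"{new_left} new line(s) unaccounted for"
            )
    return problems
```

Running `check_hunks(open("test.patch").read())` on a correctly copied patch should return an empty list. A non-empty result points at the hunk to compare against the original.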

john-b-yang commented 2 days ago

Sweet, no problem, glad to hear that!