princeton-nlp / SWE-bench

[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?
https://www.swebench.com
MIT License

Incorrect Issue Description for Instance 'astropy__astropy-14182' #219

Closed SmartManoj closed 3 weeks ago

SmartManoj commented 2 months ago

Describe the bug

The issue text for instance astropy__astropy-14182 was written against a version of astropy older than the instance's base_commit, so the line numbers in its traceback do not match the code at base_commit. This makes it harder to identify the exact location of the error. The issue text should have been generated at base_commit so that the traceback line numbers are accurate.

Steps/Code to Reproduce

from astropy.table import QTable
import astropy.units as u
import sys
tbl = QTable({'wave': [350,950]*u.nm, 'response': [0.7, 1.2]*u.count})
tbl.write(sys.stdout, format="ascii.rst")
tbl.write(sys.stdout, format="ascii.rst", header_rows=["name", "unit"])

Expected Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/astropy/table/connect.py", line 130, in __call__
    self.registry.write(instance, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/registry/core.py", line 385, in write
    return writer(data, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/connect.py", line 28, in io_write
    return write(table, filename, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/ui.py", line 975, in write
    writer = get_writer(Writer=Writer, fast_writer=fast_writer, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/ui.py", line 901, in get_writer
    writer = core._get_writer(Writer, fast_writer, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/core.py", line 1815, in _get_writer
    writer = Writer(**writer_kwargs)
TypeError: RST.__init__() got an unexpected keyword argument 'header_rows'

Actual Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/astropy/table/connect.py", line 129, in __call__
    self.registry.write(instance, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/registry/core.py", line 369, in write
    return writer(data, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/connect.py", line 26, in io_write
    return write(table, filename, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/ui.py", line 856, in write
    writer = get_writer(Writer=Writer, fast_writer=fast_writer, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/ui.py", line 800, in get_writer
    writer = core._get_writer(Writer, fast_writer, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/core.py", line 1719, in _get_writer
    writer = Writer(**writer_kwargs)
TypeError: RST.__init__() got an unexpected keyword argument 'header_rows'
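
To make the mismatch between the two tracebacks above concrete, here is a minimal sketch that parses the `File "...", line N, in func` frames from two traceback strings and reports the per-frame line shift. The helper names (`parse_frames`, `line_shifts`) are hypothetical and not part of SWE-bench or astropy; the two sample strings are excerpts of the tracebacks quoted in this report.

```python
import re
from typing import List, NamedTuple, Tuple

# Matches the standard CPython frame line: File "<path>", line <n>, in <func>
FRAME_RE = re.compile(
    r'^\s*File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)',
    re.MULTILINE,
)


class Frame(NamedTuple):
    path: str
    line: int
    func: str


def parse_frames(tb_text: str) -> List[Frame]:
    """Extract (path, line, func) for every frame line in a traceback string."""
    return [
        Frame(m["path"], int(m["line"]), m["func"])
        for m in FRAME_RE.finditer(tb_text)
    ]


def line_shifts(reported: str, current: str) -> List[Tuple[str, str, int, int, int]]:
    """Pair frames positionally and keep those with the same file and function.

    Returns (path, func, reported_line, current_line, shift) per matched frame.
    """
    shifts = []
    for r, c in zip(parse_frames(reported), parse_frames(current)):
        if (r.path, r.func) == (c.path, c.func):
            shifts.append((r.path, r.func, r.line, c.line, c.line - r.line))
    return shifts


# Excerpt of the traceback in the issue text (older version).
REPORTED = '''
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/ui.py", line 856, in write
    writer = get_writer(Writer=Writer, fast_writer=fast_writer, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/core.py", line 1719, in _get_writer
    writer = Writer(**writer_kwargs)
'''

# Excerpt of the traceback obtained at base_commit.
CURRENT = '''
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/ui.py", line 975, in write
    writer = get_writer(Writer=Writer, fast_writer=fast_writer, **kwargs)
  File "/usr/lib/python3/dist-packages/astropy/io/ascii/core.py", line 1815, in _get_writer
    writer = Writer(**writer_kwargs)
'''

shifts = line_shifts(REPORTED, CURRENT)
for path, func, old, new, delta in shifts:
    print(f"{path} in {func}: {old} -> {new} (shift {delta:+d})")
```

For these excerpts the shift is 119 lines in `ui.py` and 96 lines in `core.py`, while the file and function names are unchanged.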

System Information

N/A

john-b-yang commented 3 weeks ago

Hi @SmartManoj, thanks for the insight. This is a really interesting catch.

That said, I think small time gaps between when an issue is reported and the base commit of the pull request that resolves it are quite common. An issue and its pull request are rarely created at the same time, so it's realistic for an issue report to be a couple of commits old (and the fact that the issue is tied to the PR implies that it was still unresolved at the PR's base commit).

If anything, this is a small but meaningful challenge of SWE-bench that (1) reflects the realities of open source dev and (2) suggests the value of execution-based methods.
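
As one illustration of the execution-based idea, here is a minimal sketch that runs a reproduction callable and reads frame locations from the live traceback instead of relying on the line numbers quoted in an issue. The helper `live_frames` and the stand-in classes are hypothetical; the stand-in only mimics the "unexpected keyword argument" failure so the example runs without astropy installed.

```python
import traceback
from typing import Callable, List, Optional, Tuple


def live_frames(
    func: Callable, *args, **kwargs
) -> Tuple[Optional[BaseException], List[Tuple[str, int, str]]]:
    """Call func; if it raises, return the exception and its live frames.

    Each frame is (filename, lineno, function_name), taken from the code that
    is actually checked out rather than from a previously reported traceback.
    """
    try:
        func(*args, **kwargs)
    except Exception as exc:
        frames = [
            (f.filename, f.lineno, f.name)
            for f in traceback.extract_tb(exc.__traceback__)
        ]
        return exc, frames
    return None, []


# Hypothetical stand-in for a writer class that does not accept header_rows.
class StandInWriter:
    def __init__(self, delimiter: str = " "):
        self.delimiter = delimiter


def build_writer(**kwargs):
    # Mirrors the shape of the failing call: keyword arguments are forwarded blindly.
    return StandInWriter(**kwargs)


exc, frames = live_frames(build_writer, header_rows=["name", "unit"])
print(type(exc).__name__, exc)
for filename, lineno, name in frames:
    print(f"{filename}:{lineno} in {name}")

# With astropy checked out at base_commit, the reproduction from this report
# could be passed in the same way (not executed here):
#   live_frames(tbl.write, sys.stdout, format="ascii.rst", header_rows=["name", "unit"])
```

The line numbers printed this way always match the checked-out code, so a stale traceback in the problem statement only needs to supply file and function names.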

Since this discrepancy is a natural product of the SWE-bench collection scheme, and the actual results you include are the verbatim issue report, I believe there's nothing "wrong" with the issue text, and it's best to leave the problem_statement as is.

Thank you again for pointing out a really interesting niche challenge in SWE-bench!

SmartManoj commented 3 weeks ago

Yes, I understand your point, but there seems to be a misunderstanding regarding time gaps here. There are no time gaps in this case. The author did not use the latest version when reporting the issue. It’s crucial to ensure that the latest version is being used when reporting, as some issues might already be resolved in newer commits. This ensures that the report is accurate and reflective of the current state of the codebase.

john-b-yang commented 3 weeks ago

But isn't this a reality as well, that sometimes the user is not on the latest version?

In short, I think the "bug" you're pointing out is a natural challenge of open source maintenance - the maintainer who fixed the issue had to deal with the same problem. I think we're in agreement on the asynchronous nature of the user report. I'm just saying that "repairing" the problem statement is not necessary because this is a real challenge of SWE. It's not unfair.

Although it's generally true that a newer version may already have resolved an older issue, the fact that this issue is explicitly linked to this PR implies that it wasn't fixed in later commits.