Closed SmartManoj closed 3 weeks ago
Hi @SmartManoj, thanks for the insight. This is a really interesting catch, and it's very cool you caught this difference.
With that said, I think small time gaps between issue reports and the base commit of a pull request are quite common. The creation date of the issue and a pull request are usually not simultaneous, and it's realistic that an issue report will be a couple commits old (but the fact the issue is tied to the PR implies that the issue remained unresolved at the PR's base commit).
If anything, this is a small but meaningful challenge of SWE-bench that (1) reflects the realities of open source dev and (2) suggests the value of execution-based methods.
Since this discrepancy is a natural product of the SWE-bench collection scheme, and the actual results you include are the verbatim issue report, I believe there's nothing "wrong" about the issue text, and it's best to leave the problem_statement as is.
Thank you again for pointing out a really interesting niche challenge in SWE-bench!
Yes, I understand your point, but there seems to be a misunderstanding regarding time gaps here. There are no time gaps in this case. The author did not use the latest version when reporting the issue. It’s crucial to ensure that the latest version is being used when reporting, as some issues might already be resolved in newer commits. This ensures that the report is accurate and reflective of the current state of the codebase.
But isn't this a reality as well? That sometimes the user is not on the latest version.
In short, I think the "bug" you're pointing out is a natural challenge of open source maintenance - the maintainer who fixed the issue had to deal with the same problem. I think we're in agreement on the asynchronous nature of the user report. I'm just saying that "repairing" the problem statement is not necessary because this is a real challenge of SWE. It's not unfair.
Although it's generally true the latest version might've resolved an issue from before, the fact that this issue is explicitly linked with this PR naturally implies that it wasn't fixed in later commits.
Describe the bug
The issue for instance astropy__astropy-14182 was created using an old version, leading to mismatched traceback line numbers. This caused confusion when identifying the exact location of the error. The issue should have been generated using the base_commit version to ensure accurate line numbers in the traceback.
Steps/Code to Reproduce
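One way to check for this kind of mismatch is to parse the traceback frames out of the issue text and compare each quoted source line against the file contents at base_commit. The sketch below is only an illustration: `parse_frames` and `find_mismatches` are hypothetical helper names, not part of SWE-bench, and the sample traceback and file contents are made-up toy data rather than the real astropy instance. It assumes standard CPython traceback formatting and that the caller has already loaded the relevant files at base_commit.

```python
import re

# Matches a standard CPython traceback frame header and the source line under it.
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<lineno>\d+), in (?P<func>\S+)'
    r'\s*\n\s*(?P<code>[^\n]+)'
)


def parse_frames(text):
    """Return (path, line number, quoted source line) for each traceback frame."""
    return [
        (m["path"], int(m["lineno"]), m["code"].strip())
        for m in FRAME_RE.finditer(text)
    ]


def find_mismatches(issue_text, sources):
    """Compare traceback frames against file contents at base_commit.

    `sources` maps a file path to that file's lines at base_commit
    (hypothetical input; how the files are loaded is up to the caller).
    Frames that point at files missing from `sources` are skipped.
    """
    mismatches = []
    for path, lineno, code in parse_frames(issue_text):
        lines = sources.get(path)
        if lines is None:
            continue
        in_range = 0 < lineno <= len(lines)
        actual = lines[lineno - 1].strip() if in_range else None
        if actual != code:
            mismatches.append((path, lineno, code, actual))
    return mismatches


# Toy data for illustration only (not taken from astropy__astropy-14182).
issue_text = (
    "Traceback (most recent call last):\n"
    '  File "pkg/mod.py", line 2, in run\n'
    "    return helper(x)\n"
)
sources = {
    "pkg/mod.py": ["def run(x):", "    y = x + 1", "    return helper(x)"],
}

for path, lineno, reported, actual in find_mismatches(issue_text, sources):
    print(f"{path}:{lineno} reported {reported!r}, base_commit has {actual!r}")
```

A mismatch here only shows that the report was written against a different revision than base_commit. It does not say whether the underlying bug is still present.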
Expected Results
Actual Results
System Information
N/A