princeton-nlp / SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
https://www.swebench.com
MIT License

Fix newline outputs for django's log parser #166

Closed. xingyaoww closed this 3 months ago

xingyaoww commented 3 months ago

Reference Issues/PRs

Fixes https://github.com/princeton-nlp/SWE-bench/issues/165.

What does this implement/fix? Explain your changes.

Parses test case outputs that span multiple lines, like the one shown in https://github.com/princeton-nlp/SWE-bench/issues/165, which break the existing test case parsing.
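
For illustration only, here is a minimal sketch of the general idea, not the actual change in this PR. It assumes the verbose Django runner prints lines of the form `test_name (module.Class) ... status`, and that the status can end up on a later line when a test writes its own output in between. The function name `parse_django_log`, the helper `_classify`, and the status labels are hypothetical and are not taken from the SWE-bench codebase.

```python
from typing import Dict, Optional


def _classify(text: str) -> Optional[str]:
    """Map a status token to a label; return None if text is not a status.

    The labels are illustrative, not SWE-bench's own constants.
    """
    if text == "ok":
        return "PASSED"
    if text == "FAIL":
        return "FAILED"
    if text == "ERROR":
        return "ERROR"
    if text.startswith("skipped"):
        return "SKIPPED"
    return None


def parse_django_log(log: str) -> Dict[str, str]:
    """Parse verbose test output, tolerating a status on a later line."""
    results: Dict[str, str] = {}
    pending: Optional[str] = None  # test name still waiting for its status
    for raw in log.splitlines():
        line = raw.strip()
        if " ..." in line:
            # Start of a test entry: "<name> ... <status or other output>"
            name, _, tail = line.partition(" ...")
            status = _classify(tail.strip())
            if status is not None:
                results[name] = status
                pending = None
            else:
                # Status not on this line; look for it on following lines.
                pending = name
        elif pending is not None:
            status = _classify(line)
            if status is not None:
                results[pending] = status
                pending = None
    return results


sample = (
    "test_a (app.tests.T) ... ok\n"
    "test_b (app.tests.T) ... some output printed by the test\n"
    "ok\n"
    "test_c (app.tests.T) ... FAIL\n"
)
print(parse_django_log(sample))
```

In this sketch, `test_b` is still recorded even though its `ok` lands on the following line. A single-line parser that expects the status right after ` ... ` would miss it.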

Any other comments?

I confirmed it fixes the instance described in https://github.com/princeton-nlp/SWE-bench/issues/165. To reproduce, you can check https://github.com/OpenDevin/OpenDevin/pull/2728.

john-b-yang commented 3 months ago

Sweet, thanks a bunch @xingyaoww for this contribution!

Yeah, Django's test output parsing is a bit weird; this definitely catches a few cases that the existing parser doesn't cover. Hopefully there aren't any more!