Closed harrytormey closed 5 months ago
There is a final_report.json file for each swe-agent-replication. The "resolved" field in final_report.json represents the task instances in SWE-bench lite that were resolved. The other .traj files record all actions taken by SWE-agent, along with the conversation history with GPT-4. At the end of each .traj file there is an "info" field, which contains the generated patch (in the form of a git diff) if one exists.
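If it helps for the article, here is a rough sketch of how the layout described above could be turned into Devin-style pass/fail diff directories. It is not a script from the repository, and it rests on several assumptions that should be checked against the actual files: that each .traj file is plain JSON, that it is named after its task instance ID, that "resolved" is a list of instance IDs, and that the patch sits under a key called "submission" inside "info". The function names and the `patch_key` parameter are made up for illustration.

```python
import json
from pathlib import Path


def load_resolved_ids(report_path):
    """Return the instance IDs listed under "resolved" in final_report.json."""
    with open(report_path, encoding="utf-8") as f:
        report = json.load(f)
    # Assumption: "resolved" holds a list of instance ID strings.
    return set(report.get("resolved", []))


def extract_patch(traj_path, patch_key="submission"):
    """Return the git diff stored in the "info" field, or None if absent."""
    with open(traj_path, encoding="utf-8") as f:
        traj = json.load(f)  # Assumption: a .traj file is plain JSON.
    info = traj.get("info") or {}
    # Assumption: the patch key is "submission"; adjust if the files differ.
    patch = info.get(patch_key)
    return patch if patch else None


def split_diffs(run_dir, out_dir):
    """Write each patch to out_dir/pass or out_dir/fail and return the counts."""
    run_dir, out_dir = Path(run_dir), Path(out_dir)
    resolved = load_resolved_ids(run_dir / "final_report.json")
    counts = {"pass": 0, "fail": 0, "no_patch": 0}
    for traj_path in sorted(run_dir.rglob("*.traj")):
        # Assumption: the file name (minus .traj) is the task instance ID.
        instance_id = traj_path.stem
        patch = extract_patch(traj_path)
        if patch is None:
            counts["no_patch"] += 1
            continue
        bucket = "pass" if instance_id in resolved else "fail"
        target = out_dir / bucket / f"{instance_id}.diff"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(patch, encoding="utf-8")
        counts[bucket] += 1
    return counts
```

Calling `split_diffs("path/to/one/replication", "output_diffs")` would then leave one .diff file per instance under `output_diffs/pass` or `output_diffs/fail`. Instances without a patch are only counted, not written.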
Closing this, @harrytormey feel free to let us know if you have more questions.
I am planning on writing an article on Auto Code Rover and I was wondering if you could tell me about the format of the SWE-bench test results in: https://github.com/nus-apr/auto-code-rover/tree/main/results/swe-agent-results How am I to interpret the results in this directory? Specifically for Devin they formatted diffs for their SWE-bench run into separate pass/fail directories: https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs How is this done for your results? Thanks in advance and thanks for publishing your work.
-Harry