Update evaluation.md evaluation argument hints

princeton-nlp / SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?

https://www.swebench.com

MIT License

1.81k stars, 311 forks

Update evaluation.md evaluation argument hints #93

Closed: ssh-randy closed this 5 months ago

ssh-randy commented 5 months ago

What does this implement/fix? Explain your changes.

Quality-of-life update to the evaluation README: it now tells users that they can pass the string "test" or "dev" directly as the swe_bench_task argument, so they don't need to upload any files.
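
For readers unfamiliar with this kind of argument, here is a minimal sketch of how a single string argument could accept either a split name or a file path. It is not taken from the SWE-bench code base: the function `resolve_task_source`, the `KNOWN_SPLITS` set, and the returned tuple shape are all hypothetical names chosen for illustration, and the real evaluation script may handle the argument differently.

```python
from pathlib import Path

# Hypothetical: the split names mentioned in this PR description.
KNOWN_SPLITS = {"dev", "test"}


def resolve_task_source(swe_bench_task: str) -> tuple[str, str]:
    """Classify the argument as a named split or a local file (illustrative only).

    Returns ("split", name) when the value is a known split name,
    ("file", path) when it points at an existing file,
    and raises ValueError otherwise.
    """
    if swe_bench_task in KNOWN_SPLITS:
        # A bare split name: no local file is required.
        return ("split", swe_bench_task)

    path = Path(swe_bench_task)
    if path.is_file():
        # Fall back to treating the value as a path to a task file.
        return ("file", str(path))

    raise ValueError(
        f"Expected one of {sorted(KNOWN_SPLITS)} or a path to an existing file, "
        f"got {swe_bench_task!r}"
    )
```

Under these assumptions, `resolve_task_source("test")` is classified as a split without touching the filesystem, while any other value has to name an existing file.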