princeton-nlp / SWE-bench

[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
https://www.swebench.com
MIT License

Do not attempt to mutate dataset object #108

Closed waterson closed 5 months ago

waterson commented 5 months ago

It turns out that load_from_disk returns a Dataset object that looks like a list but isn't one. Each row you read from it is a freshly built dict, so mutating those dicts in place silently has no effect on the dataset.

Instead of trying to mutate the dataset in place (we copy it into a list anyway), just call a helper function that adjusts each dict as necessary while we enumerate the dataset.
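For anyone who wants to see the failure mode without the real library, here is a minimal sketch. Everything in it is an assumption made for illustration: FrozenRows is a hypothetical stand-in that only mimics the "fresh dict on every read" behavior described above, prepare_row is a hypothetical helper rather than the function added in this PR, and the instance_id/version fields are example values.

```python
import copy


class FrozenRows:
    """Hypothetical stand-in for a Dataset: every read builds a fresh dict."""

    def __init__(self, rows):
        self._rows = [dict(r) for r in rows]

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, index):
        # Return a copy, so callers never hold a reference to stored data.
        return copy.deepcopy(self._rows[index])

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]


def prepare_row(row):
    """Hypothetical helper: return an adjusted copy of one row."""
    fixed = dict(row)
    fixed["version"] = str(fixed["version"])
    return fixed


dataset = FrozenRows([
    {"instance_id": "a-1", "version": 1.0},
    {"instance_id": "b-2", "version": 2.1},
])

# In-place attempt: only the temporary dicts change, and they are discarded.
for row in dataset:
    row["version"] = str(row["version"])
unchanged = dataset[0]["version"]  # still the original float

# Helper approach: build the list we need while enumerating the dataset.
tasks = [prepare_row(row) for row in dataset]
```

The helper version never depends on whether the container hands out references or copies, which is why it behaves the same for a plain list and for a Dataset-like object.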

Fixes #107.