princeton-nlp / SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
https://www.swebench.com
MIT License

swe-bench eval stops running after a point #102

Closed: ssh-randy closed this issue 1 month ago

ssh-randy commented 2 months ago

Describe the issue

Hey! I'm trying to run SWE-bench remotely on a Google Compute Engine VM (an n2-highmem-4 using the common-core image). However, no matter how I configure it, it always seems to fail at around 170 test cases with this error:

subprocess.CalledProcessError: Command '. /workspace/GPT-4-Turbo/django/3.2/tmpu4_g7e7p/miniconda3/bin/activate djangodjango3.2 && echo 'activate successful' && pip install -r /workspace/GPT-4-Turbo/django/3.2/tmp91t3yumh/requirements.txt' returned non-zero exit status 1.

I'm running the latest version of SWE-bench, and I'm building it from the same image that the SWE-agent repository uses. Happy to post any logs if you'd like to try to debug this.
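The exit status alone does not say why `pip install` failed, so the captured output of the failing command is probably the most useful log to post. A minimal sketch of one way to collect it, assuming the command can be re-run by hand on the VM; `run_and_capture` is a hypothetical helper written for illustration and is not part of SWE-bench:

```python
import subprocess
from typing import Tuple


def run_and_capture(command: str) -> Tuple[int, str, str]:
    """Run a shell command and return (exit code, stdout, stderr).

    Hypothetical helper for debugging; not taken from SWE-bench.
    """
    try:
        result = subprocess.run(
            command,
            shell=True,           # the failing command chains steps with '&&'
            check=True,           # raise CalledProcessError on a non-zero exit
            capture_output=True,  # keep stdout and stderr instead of discarding them
            text=True,            # decode the output to str
        )
        return result.returncode, result.stdout, result.stderr
    except subprocess.CalledProcessError as err:
        # With capture_output=True the exception carries the command's output.
        return err.returncode, err.stdout, err.stderr


if __name__ == "__main__":
    # Stand-in command; substitute the exact command from the traceback.
    code, out, err = run_and_capture("echo 'activate successful' && exit 1")
    print("exit status:", code)
    print("stdout:", out)
    print("stderr:", err)
```

Running the command from the traceback through this helper should show the underlying pip error in `stderr` (for example a missing build dependency or a full disk) rather than only `returned non-zero exit status 1`.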

Suggest an improvement to documentation

No response

ssh-randy commented 1 month ago

Closing this issue and reopening it as a bug.