SZU-ZJW closed this issue 2 months ago
Is your error similar to the one in #77 or any of the issues mentioned on that thread? (Full disclaimer: I'm not a maintainer of swebench, just trying to get it to work successfully.)
Thank you for your reply. Regarding the second problem, the suggestion I found online is to reinstall miniconda, but that obviously doesn't work here because the miniconda installation is temporary. There is still no good solution to the first problem.
@SZU-ZJW What's the actual error for the first problem? You've only included the stack trace.
subprocess.CalledProcessError: Command '. /home/zjw/SWE-Bench/main/evaluation/testbed/SWE-Llama-7B/sphinx-doc__sphinx/7.1/tmp0si9huh/miniconda3/bin/activate sphinx-doc__sphinx__7.1 && conda install gxx_linux-64 gcc_linux-64 make -y' returned non-zero exit status 2.
This is the reported error, and I don't know how to correct it. It should be at line 389 of SWE-bench/swebench/harness/context_manager.py. Faced with this problem, I am a little at a loss.
@SZU-ZJW Yeah, that's the command that failed. From the Python code, it tried to create a subprocess with this command:
. /home/zjw/SWE-Bench/main/evaluation/testbed/SWE-Llama-7B/sphinx-doc__sphinx/7.1/tmp0si9huh/miniconda3/bin/activate sphinx-doc__sphinx__7.1 && conda install gxx_linux-64 gcc_linux-64 make -y
That command failed (i.e., it returned non-zero exit status 2), but there should be an actual error in the output saying why it failed. Can you include the full log of that run?
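For anyone debugging the same kind of failure, here is a minimal sketch of how the underlying error text can be surfaced from a `subprocess.CalledProcessError`. The helper name `explain_failure` and the stand-in shell command are hypothetical and are not part of the SWE-bench harness; the real harness may log output differently.

```python
import subprocess


def explain_failure(cmd: str):
    """Run a shell command; return (exit status, captured stderr).

    Hypothetical helper for illustration only, not SWE-bench code.
    """
    try:
        # check=True raises CalledProcessError on a non-zero exit status;
        # capture_output=True keeps stderr so the reason is not lost.
        subprocess.run(cmd, shell=True, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        return err.returncode, err.stderr
    return 0, ""


# Stand-in for the failing conda command: writes to stderr and exits with 2.
status, stderr_text = explain_failure("echo 'install failed: reason here' >&2; exit 2")
print(f"exit status: {status}")
print(f"stderr: {stderr_text.strip()}")
```

The exit status alone only says that the command failed; the captured stderr (or the run's log file) is where the actual reason usually shows up.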
Of course, here is the log file for this problem: testbed_sphinx_7.1.log
Hi @brombaut @SZU-ZJW, we just released a report on the fixes we've been working on to get SWE-bench evaluation to work reliably; you can read about it here.
Based on what you've detailed in this issue, I think this is likely related to failure mode 2. It seems to me that you're running on an ARM machine, given that this command is failing:
conda install gxx_linux-64 gcc_linux-64 make -y
You could potentially try commenting out this line to see if the evaluation can still work without having to install the arch_specific_packages we specified.
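As a rough sketch of the same idea in code, the snippet below checks the machine architecture before deciding whether to request those packages. The function name `packages_for_machine` is hypothetical and is not the harness's actual logic; the package names are simply copied from the failing command above, and the assumption that they only apply to x86-64 hosts is mine.

```python
import platform

# Package names copied from the failing command in this thread.
# Assumption: they are only meaningful on x86-64 Linux hosts.
X86_64_PACKAGES = ["gxx_linux-64", "gcc_linux-64", "make"]


def packages_for_machine(machine: str) -> list:
    """Return the packages to install for a machine string, or [] to skip.

    Hypothetical helper for illustration only, not SWE-bench code.
    """
    if machine.lower() in ("x86_64", "amd64"):
        return list(X86_64_PACKAGES)
    # e.g. "aarch64" or "arm64": skip the x86-64 specific packages.
    return []


machine = platform.machine()
print(machine, packages_for_machine(machine))
```

Running this on the evaluation machine shows whether it reports an ARM architecture, which would be consistent with the conda install failing there.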
Thank you for your answer, it really solved my problem!
Describe the bug
When I use run_evaluation.py to evaluate the results, I get an error. The dataset is SWE-bench-BM25-13K and the model is SWE-Llama-7B.
I look forward to a reply; any reply would be a huge help to me and much appreciated.
Steps/Code to Reproduce
Run the code in harness/run_evaluation.py.
Expected Results
This error should not occur, and the command should execute correctly. Miniconda shouldn't have a problem either.
Actual Results
An unexpected error occurred
System Information
Linux, Python 3.9