Closed pi-null-mezon closed 6 months ago
I have found the solution:
Step 1
cd ./inference
python run_api.py --dataset_name_or_path princeton-nlp/SWE-bench_Lite_oracle --model_name_or_path claude-3-haiku-20240307 --output_dir ../_datasets_/output/lite/haiku
Step 2
cd ./swebench/harness
python run_evaluation.py --predictions_path ../../_datasets_/output/lite/haiku/claude-3-haiku-20240307__SWE-bench_Lite_oracle__test.jsonl --swe_bench_tasks ../../_datasets_/swe-bench.json --log_dir ../../_datasets_/logs --testbed ../../_datasets_/workdir --verbose
Though at Step 2, errors related to miniconda appear:
2024-04-04 15:38:09,122 - testbed_context_manager - INFO - [Testbed] Using conda path /home/alex/Programming//SWE-bench/_datasets_/workdir/claude-3-haiku-20240307/psf__requests/2.3/tmpcmt6e7pt/miniconda3
Error: Command '['/home/alex/Programming/SWE-bench/_datasets_/workdir/claude-3-haiku-20240307/psf__requests/2.3/tmpcmt6e7pt/miniconda3/bin/conda', 'env', 'list']' returned non-zero exit status 1.
Error stdout:
Error stderr: Traceback (most recent call last):
File "/home/alex/Programming//SWE-bench/_datasets_/workdir/claude-3-haiku-20240307/psf__requests/2.3/tmpcmt6e7pt/miniconda3/bin/conda", line 12, in <module>
from conda.cli import main
ModuleNotFoundError: No module named 'conda'
Hi, I had the same question. @pi-null-mezon, how did you find the swe-bench.json file for swe-bench-lite?
You do not need this JSON file. Just run the evaluation with princeton-nlp/SWE-bench_Lite_oracle. It will be downloaded from Hugging Face.
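To make that concrete, here is a rough sketch of how the Step 2 command might look with the dataset name passed in place of the local JSON path. The helper name build_eval_command is made up for illustration. The flags are the ones from the command earlier in this thread. That --swe_bench_tasks accepts a Hugging Face dataset name is taken from the reply above and has not been checked against the harness source.

```python
import shlex


def build_eval_command(
    predictions_path,
    tasks="princeton-nlp/SWE-bench_Lite_oracle",  # dataset name instead of swe-bench.json (per the reply above)
    log_dir="../../_datasets_/logs",
    testbed="../../_datasets_/workdir",
):
    """Assemble the run_evaluation.py argument list (hypothetical helper).

    Only flags that appear in the thread's Step 2 command are used.
    """
    return [
        "python", "run_evaluation.py",
        "--predictions_path", predictions_path,
        "--swe_bench_tasks", tasks,
        "--log_dir", log_dir,
        "--testbed", testbed,
        "--verbose",
    ]


if __name__ == "__main__":
    cmd = build_eval_command(
        "../../_datasets_/output/lite/haiku/"
        "claude-3-haiku-20240307__SWE-bench_Lite_oracle__test.jsonl"
    )
    # Print a copy-pasteable shell command; run it from ./swebench/harness.
    print(shlex.join(cmd))
```

The printed line can be pasted into a shell from the swebench/harness directory, or the list can be handed to subprocess.run directly.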
Okay got it, thank you!
@pi-null-mezon apologies that this wasn't clearer in the documentation, and thanks so much for providing the solution.
I will update the documentation now to indicate that Hugging Face SWE-bench datasets can be provided as an argument.
Feel free to re-open if there are any follow up questions!
I have managed to run the evaluation according to the tutorial with swe-bench.json. But with the lite version of swe-bench it does not work, because the benchmark files for the lite version are stored in *.arrow format instead of *.json. So, how do I run the evaluation for swe-bench_lite?