logic-star-ai / swt-bench

[NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation
https://openreview.net/forum?id=9Y8zUO11EQ&noteId=9Y8zUO11EQ
MIT License

Pin versions for docker images #13

Open nielstron opened 4 days ago

nielstron commented 4 days ago

Describe the bug

Some Docker images fail to build because their dependency versions are not pinned. This is a running list of affected SWE-Bench cases, updated as fixes are migrated.

Steps/Code to Reproduce

python -m src.main --predictions_path gold --max_workers 1 --run_id validate-gold --instance_ids <affected_instance>

Expected Results

The Docker image for the instance should build successfully.

Actual Results

The build may fail because dependency versions are not pinned.
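
To illustrate what "not pinned" means here, the following is a minimal sketch of a check that flags requirement lines without an exact `==` version. The helper name `unpinned` and the sample requirement list are hypothetical and not part of swt-bench; the actual failing dependencies are not listed in this issue.

```python
import re

# Hypothetical helper, not part of swt-bench: a requirement counts as
# "pinned" only if it names a package followed by an exact "==" version.
_PIN = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned(requirements):
    """Return the requirement lines that lack an exact '==' pin."""
    result = []
    for line in requirements:
        spec = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not spec or spec.startswith("-"):  # skip blanks and pip options
            continue
        if not _PIN.match(spec):
            result.append(spec)
    return result


# Illustrative input only; these are assumed example requirements.
example = [
    "numpy==1.26.4",
    "pytest>=7",
    "cython",
    "# a comment line",
    "setuptools==68.0.0 ; python_version < '3.12'",
]
print(unpinned(example))
```

A range such as `pytest>=7` or a bare name such as `cython` resolves to whatever version is newest at build time, which is why an image that built earlier may stop building later.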

System Information

No response