Closed. aorwall closed 3 months ago
When I rebuilt the sphinx images to test all SWE-Bench Verified instances, all newly built images started to fail. This seems to be caused by a new version of `tox`. This PR pins `tox` to the version used in the previously built images, which seems to work.
Compare this benchmark run with failing sphinx instances: https://eval.moatless.ai/evaluations/fcbb473f957149e7a5a45baeaa11800c
With this fix it looks like this: https://eval.moatless.ai/evaluations/ab7a61b78e334a759fff3b344d30bd70
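As a rough illustration of the pinning idea described above, here is a minimal sketch. The spec layout, the helper names, and the version number `4.16.0` are hypothetical placeholders, not the values changed in this PR.

```python
# Hypothetical install spec for a sphinx image. The key name and the
# version number are illustrative assumptions, not taken from this PR.
SPHINX_SPEC = {
    "pip_packages": ["tox==4.16.0"],  # pinned, so rebuilds get the same tox
}


def pip_install_command(spec):
    """Build a single pip install command from the spec's package list."""
    return "python -m pip install " + " ".join(spec["pip_packages"])


def unpinned(packages):
    """Return the requirements that lack an exact '==' version pin."""
    return [p for p in packages if "==" not in p]


if __name__ == "__main__":
    print(pip_install_command(SPHINX_SPEC))
    print(unpinned(SPHINX_SPEC["pip_packages"]))
```

A check like `unpinned` could run before an image rebuild to flag packages that might drift to a newer release.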
I reproduced this and your changes work! Thanks so much!
Thanks Albert!