Closed: waterson closed this 3 months ago
Hi @waterson, thanks for the suggestion and contribution, and for your patience!
We have just published a major release, `swebench==2.0.0`, today, which should take care of this issue. I totally agree: the motivation behind this fix for the original code makes a lot of sense.
I left a more extensive reply to @aorwall at #109. In a nutshell, with `swebench.harness.run_evaluation`, it is now possible to cache [base, env, instance] images.
Put simply, if SWE-bench evaluation is run two or more times and the `instance` tier of images is cached, then environments don't need to be rebuilt at all.
Also, the `env` layer of images represents conda environments that multiple instances use, and this intermediate layer is what makes creating `instance` tier images a lot more efficient.
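To make the tiering concrete, here is a minimal toy sketch of how a base → env → instance cache avoids rebuilds. It is not swebench code: the `TieredImageCache` class, its methods, and the instance and env identifiers are all hypothetical, and "building" an image is modeled as recording a key in a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class TieredImageCache:
    """Toy model of base -> env -> instance image tiers (hypothetical, not swebench's API)."""

    built: dict = field(default_factory=dict)      # image key -> parent key
    build_log: list = field(default_factory=list)  # keys in the order they were built

    def _ensure(self, key, parent=None):
        # "Build" the image only if it is not already cached.
        if key not in self.built:
            self.built[key] = parent
            self.build_log.append(key)
        return key

    def get_instance_image(self, instance_id, env_id):
        inst_key = f"instance:{instance_id}"
        if inst_key in self.built:
            return inst_key  # instance tier cached: nothing is rebuilt
        base = self._ensure("base")
        env = self._ensure(f"env:{env_id}", parent=base)  # shared by many instances
        return self._ensure(inst_key, parent=env)


cache = TieredImageCache()
# Two instances that share one env: the env layer is built only once.
cache.get_instance_image("repo__issue-1", env_id="py39-deps-a")
cache.get_instance_image("repo__issue-2", env_id="py39-deps-a")
first_run_builds = list(cache.build_log)
# A repeated evaluation of the same instance triggers no new builds.
cache.get_instance_image("repo__issue-1", env_id="py39-deps-a")
```

In this model, the second instance only adds its own top layer, and the repeated run adds nothing, which is the saving described above.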
Thanks again for the issue - it provided the confirmation we needed to move forward with incorporating this feature, I really appreciate it a lot!
This change provides a `path_conda` to use for the eval in the testbed directory that will be reused across evaluations, and modifies the context manager's behavior so that a non-existent `path_conda` will be initialized and populated in the same way that a temporary context would be.

I realize that it might make some sense to have a bit of discussion about making this behavior optional, but I wanted to just get something out there to talk about. :)
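As a rough illustration of the behavior described above (not the actual diff), a context manager along these lines would reuse a persistent `path_conda` and populate it only when it does not exist yet. The names `conda_context` and `populate_conda` are hypothetical, and the real conda installation is replaced by writing a marker file.

```python
import contextlib
import os
import shutil
import tempfile


def populate_conda(path):
    """Hypothetical stand-in for installing conda into `path`."""
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "INSTALLED"), "w") as f:
        f.write("ok")


@contextlib.contextmanager
def conda_context(path_conda=None):
    """Yield a populated conda directory.

    With no path, use a temporary directory that is removed on exit.
    With a path, populate it only if it is missing and keep it for reuse.
    """
    if path_conda is None:
        tmp = tempfile.mkdtemp(prefix="conda-")
        try:
            populate_conda(tmp)
            yield tmp
        finally:
            shutil.rmtree(tmp, ignore_errors=True)
    else:
        if not os.path.exists(path_conda):
            populate_conda(path_conda)  # same setup a temporary context would get
        yield path_conda  # left on disk for later evaluations
```

Calling `conda_context("<testbed>/conda")` twice would populate the directory on the first call and simply reuse it on the second, while `conda_context()` keeps the original throwaway behavior.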
Fixes #104.