Fixes to not have to reinstall testbeds and conda envs

@aorwall thanks so much for the original contribution and patience.

While we didn't merge the original code, we accounted for this feature in the new swebench=2.0.0 release.

The docker image caching mechanism takes care of this:

Control how the harness caches images between runs
You can cache three tiers of images: base, environment, and instance. When running on the full SWE-bench test set, the number of images in each tier is as follows:
- base: 1 image, from which all instances are built
- environment: 60 images with conda environments that together cover every environment used by the instances
- instance: 2294 images (one per instance), each of which is the environment image plus an installation of the repository at the instance's base_commit
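
To get a rough feel for the storage trade-off, here is a minimal sketch that counts how many images stay on disk at each cache level, using the tier sizes quoted above. The level names and the function `cached_image_count` are hypothetical, not part of the swebench API, and the sketch assumes that caching a tier also keeps every tier below it.

```python
# Image counts per tier for the full SWE-bench test set, as quoted above.
TIER_COUNTS = {"base": 1, "environment": 60, "instance": 2294}

# Hypothetical level names, ordered from least to most caching.
# Assumption: each level also keeps the tiers that come before it.
CACHE_LEVELS = ["none", "base", "environment", "instance"]


def cached_image_count(cache_level: str) -> int:
    """Return how many images remain cached between runs at the given level."""
    if cache_level not in CACHE_LEVELS:
        raise ValueError(f"unknown cache level: {cache_level!r}")
    # Skip "none" (index 0) and keep every tier up to and including cache_level.
    kept_tiers = CACHE_LEVELS[1 : CACHE_LEVELS.index(cache_level) + 1]
    return sum(TIER_COUNTS[tier] for tier in kept_tiers)


if __name__ == "__main__":
    for level in CACHE_LEVELS:
        print(f"{level}: {cached_image_count(level)} images kept")
```

Under these assumptions, caching through the environment tier keeps 61 images, while caching through the instance tier keeps 2355, which is why the instance level needs much more storage.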

The report has more advice on how to appropriately set the cache level.

In a nutshell, this feature has been incorporated in swebench>=2.0.0. With enough storage, instance-specific images can now be cached, so the second and any later evaluation runs of SWE-bench complete very quickly.

princeton-nlp / SWE-bench

Fixes to not have to reinstall testbeds and conda envs #109
