DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

Fix bad caching to prevent already devirtualized files getting lost #4965

Closed: stxue1 closed this 3 months ago

stxue1 commented 3 months ago

Closes #4959

We were caching files that were already devirtualized in the virtualized-to-devirtualized cache. This caused conflicts when a task/job cleaned up its files and the workflow then tried to access them again through the "virtualized" URL.

This separates the caching logic so that an entry is added to the cache only when there is actually a virtualized file to devirtualize. It also lets the devirtualization step use the cache, which it was not doing before.
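
For illustration, here is a minimal sketch of the intended cache behavior. It is not the code from this PR: `VIRTUAL_PREFIX`, `is_virtualized`, `devirtualize`, and the `download` callback are hypothetical names, and the sketch assumes virtualized files can be recognized by a URI prefix.

```python
from typing import Callable, Dict

# Assumption for this sketch: virtualized files are marked by this URI prefix.
VIRTUAL_PREFIX = "toilfile:"


def is_virtualized(name: str) -> bool:
    """Return True if the name looks like a virtualized URI (hypothetical check)."""
    return name.startswith(VIRTUAL_PREFIX)


def devirtualize(
    name: str,
    cache: Dict[str, str],
    download: Callable[[str], str],
) -> str:
    """Map a file name to a local path, caching only real devirtualizations.

    `download` is a hypothetical callback that fetches a virtualized file
    and returns the local path it was written to.
    """
    if not is_virtualized(name):
        # Already a local path: return it unchanged and do NOT cache it,
        # so a later job cleanup cannot leave a stale entry behind.
        return name
    if name in cache:
        # The devirtualization step consults the cache before downloading.
        return cache[name]
    local_path = download(name)
    cache[name] = local_path
    return local_path
```

With this shape, a path that is already local never enters the cache, and a repeated request for the same virtualized URI reuses the cached local path instead of downloading again.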
