DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

Symlink passthrough on WDL is broken again #5031

Open adamnovak opened 1 month ago

adamnovak commented 1 month ago

I think that as of #4994, somewhere in the diff between b3a016c3fd03ee23712145c4b381d34db1f0d407 and fb3b6304932f82df272bb0fc315235bf9e836eac, we introduced a regression: #4850 is back, and WDL tasks can no longer output symlinks to their inputs. Test 72 of our conformance tests, which covers this, started failing again, but we didn't notice in CI because the test was never marked as passing.

We should fix the symlink passthrough again.

We should also maybe set up CI so that if we have a WDL conformance test marked as failing, it's an error if it doesn't fail, to ensure we keep the list up to date when we make fixes.
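A minimal sketch of what that check could look like, assuming the conformance results are available as a mapping from test ID to pass/fail. The function name `check_expected_failures` and the data shapes are hypothetical and not part of Toil's existing test harness.

```python
from typing import Dict, Iterable, List


def check_expected_failures(
    results: Dict[str, bool], expected_failing: Iterable[str]
) -> List[str]:
    """Compare actual results against the expected-failure list.

    `results` maps a test ID to True (passed) or False (failed).
    Returns a list of error messages; an empty list means the
    expected-failure list is up to date.
    """
    expected = set(expected_failing)
    errors = []
    for test_id, passed in sorted(results.items()):
        if passed and test_id in expected:
            # The stale entry is the case this check exists to catch.
            errors.append(
                f"{test_id}: marked as failing but passed; "
                "remove it from the expected-failure list"
            )
        elif not passed and test_id not in expected:
            errors.append(f"{test_id}: failed unexpectedly")
    # An expected failure that never ran cannot be verified either way.
    for test_id in sorted(expected - set(results)):
        errors.append(f"{test_id}: marked as failing but was not run")
    return errors


if __name__ == "__main__":
    # Hypothetical run: test 72 is still listed as failing but now passes.
    for message in check_expected_failures({"71": True, "72": True}, ["72"]):
        print(message)
```

CI would then fail the job whenever the returned list is non-empty, so a fix has to be accompanied by removing the test from the expected-failure list.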

┆Issue is synchronized with this Jira Story ┆Issue Number: TOIL-1620

stxue1 commented 1 month ago

My guess as to why this broke is some weird trickery in #4994 while dealing with the sentinel value.

#5028 overhauls the way files are virtualized in toil-wdl-runner, which may make the sentinel value obsolete (I still have to figure out what merging #4994 from master into #5028 did behaviorally). The fix will probably be to effectively revert to ae49ee5d11d971703cbad1c7f4afdfe00eea3e5f (replacing the sentinel value? except at task boundaries?) while implementing a new solution to #4988, likely before the virtualize_files call and the outputs, which is where the optional coerced files will likely be used.