stjudecloud / workflows

Bioinformatics workflows developed for and used on the St. Jude Cloud project.
MIT License

Add intentional pre and post checks for file emptiness #177

Open a-frantz opened 3 months ago

a-frantz commented 3 months ago

Many of the bioinformatics tools we wrap have no special handling for empty inputs or outputs (including files that contain a header but no content or alignments). These "empty" files can currently be passed between many of our tools, resulting in a chain of no-ops and a hard-to-track class of errors.

We should have lightweight checks before and after the "main computation" of our tasks to ensure that only "populated" files are being processed. We'll need a short investigation to find out which tools need this wrapping, since it would likely be redundant for tools that already error, as one would expect, on empty inputs or outputs.
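As a rough illustration of what such pre and post checks could look like, here is a minimal Python sketch. Everything in it is an assumption for discussion rather than existing code in this repo: the names `is_populated` and `run_with_checks` are hypothetical, and the default header prefixes (`@`, `#`) only suit SAM- or VCF-like text files. FASTQ records also start with `@`, so they would need different prefixes, and binary formats such as BAM would need a format-aware tool instead.

```python
import gzip
from pathlib import Path


def is_populated(path, header_prefixes=("@", "#")):
    """Return True if the file has at least one non-blank, non-header line.

    Hypothetical helper: the header prefixes are an assumption that fits
    SAM/VCF-like text files, not every format our tasks handle.
    """
    p = Path(path)
    # A missing or zero-byte file is trivially "empty".
    if not p.is_file() or p.stat().st_size == 0:
        return False
    # Read gzip-compressed text transparently, based on the file suffix.
    opener = gzip.open if p.suffix == ".gz" else open
    with opener(p, "rt") as handle:
        for line in handle:
            # Stop at the first line that carries actual content.
            if line.strip() and not line.startswith(header_prefixes):
                return True
    return False


def run_with_checks(main_computation, input_path, output_path):
    """Wrap a task's main computation with pre and post emptiness checks."""
    if not is_populated(input_path):
        raise ValueError(f"input is empty or header-only: {input_path}")
    main_computation(input_path, output_path)
    if not is_populated(output_path):
        raise ValueError(f"output is empty or header-only: {output_path}")
```

Failing loudly at the task boundary means a header-only file stops the workflow at the step that produced it, rather than surfacing several tasks downstream.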

This is related to #172, and closing this issue should probably close #172 as well.