Open mrgazzara opened 2 years ago
Hi @mrgazzara, good point. One quick fix could be to run several samples at once, e.g. by using a sample table that contains all Mayr samples. That way makeTFfasta should only be executed once. Or is this how you have been running it so far? Alternatively, I could imagine restructuring the Nextflow workflow so that the makeTFfasta output is treated as a parameter and is only created when it does not already exist. This is similar to what @faricazjj did in IsoSCM, see here: https://github.com/iRNA-COSI/APAeval/blob/0152d1dbe0ab8176f75d3a995d1f2a3b80b50019/execution_workflows/IsoSCM/conf/modules.config#L28
Yes, the latter is exactly what I had in mind. We also do something very similar in the QAPA EWF, where you can choose to build a new 3'UTR annotation file or re-use an existing one. Something like that would be a huge time saver for users of our Labrat EWF.
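For illustration, the "only build when not existing" behaviour discussed above could look roughly like the Python sketch below. This is not the Labrat EWF code or the IsoSCM config: the function name `resolve_tf_fasta`, its parameters, and the `build` callback (standing in for the real makeTFfasta step) are all hypothetical, and an actual fix would live in the Nextflow workflow itself.

```python
from pathlib import Path
from typing import Callable, Optional


def _usable(path: Path) -> bool:
    """A file counts as reusable if it exists and is not empty."""
    return path.is_file() and path.stat().st_size > 0


def resolve_tf_fasta(
    existing: Optional[str],
    default_output: str,
    build: Callable[[Path], None],
) -> Path:
    """Return a TF fasta path, running the expensive build only if needed.

    existing       -- hypothetical user parameter: fasta from a previous run
    default_output -- where the workflow would normally write the fasta
    build          -- hypothetical callback wrapping the makeTFfasta step
    """
    # 1. A user-supplied file takes priority; fail loudly if it is unusable
    #    rather than silently starting a multi-day rebuild.
    if existing:
        candidate = Path(existing)
        if _usable(candidate):
            return candidate
        raise FileNotFoundError(f"supplied TF fasta not usable: {candidate}")

    # 2. Otherwise reuse the output of an earlier run, if present.
    output = Path(default_output)
    if _usable(output):
        return output

    # 3. Only now pay for the expensive step.
    output.parent.mkdir(parents=True, exist_ok=True)
    build(output)
    return output
```

Raising an error for a missing user-supplied file, instead of falling back to a rebuild, is a design choice in this sketch: with a step that takes days, an unexpected rebuild is probably worse than an early failure.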
Labrat takes a very long time to run in the current implementation of its execution workflow. This is because the `makeTFfasta` step takes several days (~3.5 days when I ran it), whereas the subsequent Salmon steps take next to no time. It would be useful to let users pass the result of the `makeTFfasta` step from a previous execution to the execution workflow to speed things up. The `makeTFfasta` step should only have to be run once per annotation version.

Any thoughts on this possibility @yuukiiwa or @dominikburri?