Open unode opened 5 years ago
In order to keep compatibility with the current behavior (no action when finished), I'm wondering if this should be implemented through a `--only-collect` command-line option.
Effectively, we have to skip all actions (`preprocess`, `map`, `fastq`, `paired`, ...) except `collect`, but we still need a sample name for `collect` to act upon.
Over the long term, I would prefer an approach where, whenever `ngless` runs, it will create any missing outputs. The whole `lock1`/`collect` business is a bit of a hack now. This is probably for NGLess 2, though.
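To make the "create any missing outputs" idea concrete, here is a minimal sketch in Python (not NGLess's own code). The function name `ensure_merged_output`, the one-`.part`-file-per-sample layout, and the plain concatenation used as the merge step are all assumptions made for illustration; they do not reflect NGLess's actual on-disk format or API.

```python
from pathlib import Path


def ensure_merged_output(samples, partials_dir, merged_path):
    """Create the merged output if every listed sample has a partial result.

    Hypothetical layout: each finished sample leaves `<name>.part` in
    `partials_dir`. Returns True if the merged file exists on return.
    """
    partials_dir = Path(partials_dir)
    merged_path = Path(merged_path)

    if merged_path.exists():
        return True  # nothing to do; running again is harmless

    partials = [partials_dir / f"{name}.part" for name in samples]
    if not all(p.exists() for p in partials):
        return False  # at least one listed sample has not finished

    # Write to a temporary file first, then rename it into place, so that
    # readers never observe a half-written merged file.
    tmp_path = merged_path.with_name(merged_path.name + ".tmp")
    with tmp_path.open("w") as out:
        for p in partials:
            out.write(p.read_text())
    tmp_path.replace(merged_path)
    return True
```

Because the check depends only on the current sample list and on what is already on disk, any run can perform it, including a re-run after failed samples have been removed from the sample file.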
Scenario:
1) 12 samples are being processed using the `parallel` machinery, `lock1()` and `collect()`.
2) 10 samples complete and 2 fail.
3) The 2 failing samples are considered bad and are excluded from the sample file.

At this point, re-running ngless has no effect since all work is complete; however, the merged output from `collect()` was never generated.

`collect()` can also fail to occur in rare cases where the last two samples finish almost simultaneously, or where filesystem lag prevents the last two processes from seeing all samples as complete.
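The race in the last case can be illustrated with a toy model in Python. This is only a sketch of the failure mode described above, not NGLess's implementation: `should_collect` and the sample names are hypothetical, and each "view" is a set standing in for the finished samples that one process can see on the filesystem.

```python
def should_collect(visible_finished, all_samples):
    """Return True if, from this process's view, every sample is finished."""
    return set(all_samples) <= set(visible_finished)


all_samples = [f"sample{i}" for i in range(12)]
already_done = set(all_samples[:10])

# Each of the last two processes sees its own sample as finished, but
# (because of lag) not yet the one finished by the other process.
view_of_11th = already_done | {"sample10"}
view_of_12th = already_done | {"sample11"}

collected_by_11th = should_collect(view_of_11th, all_samples)
collected_by_12th = should_collect(view_of_12th, all_samples)
```

In this model both `collected_by_11th` and `collected_by_12th` are `False`: each process concludes that another sample is still pending, so neither produces the merged output, even though a later check against the combined state would succeed.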