Enable running --str (or other subcomponents of the pipeline) more modularly

Is your feature related to a problem?

Yes: keeping analyses consistent within a reasonable timeframe.

Describe the solution you'd like

I ran the full pipeline on my samples but forgot to set the --sex parameter for the --str analysis. I would like to rerun only the --str part and have it reuse the inputs the pipeline has already generated. However, rerunning the pipeline with only --str specified starts the --snv analysis and the haplotagging of the BAM file again, which takes a long time. It would be great if the pipeline could locate the files it needs in the 'output' folder and run only the requested analysis.
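
To illustrate the behaviour I have in mind, here is a minimal Python sketch. It is not how the workflow is implemented. The file names in `REQUIRED_INPUTS` and the helper names `missing_inputs` and `can_reuse` are hypothetical placeholders, since I don't know which outputs the --str step actually consumes.

```python
from pathlib import Path

# Hypothetical mapping from an analysis to the files it would need from a
# previous run. The real workflow's output names are assumptions here.
REQUIRED_INPUTS = {
    "str": ["sample.haplotagged.bam", "sample.haplotagged.bam.bai"],
}


def missing_inputs(out_dir, analysis, required=REQUIRED_INPUTS):
    """Return the required files for `analysis` that are absent from out_dir."""
    out = Path(out_dir)
    return [name for name in required[analysis] if not (out / name).is_file()]


def can_reuse(out_dir, analysis, required=REQUIRED_INPUTS):
    """True if every required input already exists, so upstream steps could be skipped."""
    return not missing_inputs(out_dir, analysis, required)
```

With a check like this, a rerun with only --str could skip the upstream steps when `can_reuse("output", "str")` is true, and fall back to the full run (or report the missing files) otherwise.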

Describe alternatives you've considered

I've run straglr independently. It takes less than 30 seconds for a human genome at 30x coverage, plus another 15 minutes for phasing with longphase and annotation with stranger. This works, but it is less streamlined and doesn't produce the nice reports that epi2me does. Only some of the samples need to be repeated, so keeping the analysis uniform across samples is important. Another alternative would be to make the reporting tools available from the command line.

Additional context

I run epi2me from the Nextflow command line on a PromethION 24 machine. Since the snv analysis is a precursor to the other types of analysis, maybe there could be an option to indicate that the snv analysis has already been run and to supply its result files, so that the pipeline can continue with the additional analyses. For example, a routine could look like this: first run only --snv, check the results, and then run again with --sv. (Phasing could be run at the end to connect everything up.)

epi2me-labs / wf-human-variation