Open ilivyatan opened 7 months ago
Hi @ilivyatan, this should be intrinsically possible with Nextflow if you keep your work directory and pass -resume. You can run the analysis again, adding or changing parameters as you wish, and the workflow should recognise what has already been run (in this case the --snp analysis) and execute only the new steps. However, if you change the parameters for the --snp analysis, the workflow will have to repeat some or all of those steps as well.
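As a rough illustration of why resuming behaves this way, here is a toy model in Python. It is not Nextflow's actual implementation: it only assumes that each step is cached under a key derived from its name, its parameters, and the keys of the steps it depends on. The step names (snp, haplotag, str), the dependency chain, the min_qual parameter, and the "XY" value are all hypothetical.

```python
import hashlib
import json


def task_key(name, params, upstream_keys):
    """Derive a cache key from the step name, its parameters, and upstream keys."""
    payload = json.dumps([name, params, sorted(upstream_keys)], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run(steps, cache):
    """Run steps in order, skipping any whose key is already in the cache.

    `steps` is a list of (name, params, dependency_names) tuples.
    Returns the names of the steps that were actually executed.
    """
    keys, executed = {}, []
    for name, params, deps in steps:
        key = task_key(name, params, [keys[d] for d in deps])
        keys[name] = key
        if key not in cache:
            cache.add(key)  # stand-in for doing the real work
            executed.append(name)
    return executed


cache = set()  # plays the role of the kept work directory

# First run: SNP calling and haplotagging only.
first = run([("snp", {}, []), ("haplotag", {}, ["snp"])], cache)

# Second run adds STR calling: the upstream steps are reused.
second = run(
    [("snp", {}, []), ("haplotag", {}, ["snp"]), ("str", {"sex": "XY"}, ["haplotag"])],
    cache,
)

# Third run changes a (hypothetical) SNP parameter: everything downstream repeats.
third = run(
    [
        ("snp", {"min_qual": 10}, []),
        ("haplotag", {}, ["snp"]),
        ("str", {"sex": "XY"}, ["haplotag"]),
    ],
    cache,
)

print(first, second, third)
```

In this model, adding a new analysis only runs the new step, while changing an upstream parameter changes every key that depends on it, so those steps run again.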
Hi @RenzoTale88, is it possible to run --str alone without -resume? I ran version v1.2 earlier for SNP, SV, and mod calling, and now I want to do STR calling. I specified only --str, but according to the logs the sample is running SNP calling as well.
Is your feature related to a problem?
Yes. Maintaining consistency of analyses in a reasonable timeframe.
Describe the solution you'd like
I've run the full pipeline on my samples, but forgot to set the --sex parameter for the --str analysis. So I want to run just the --str part again and have it use the necessary inputs that it has already generated. However, running the pipeline again with only --str starts rerunning the --snp analysis and haplotagging the BAM file, which takes a long time. It would be great if it could locate the files it needs in the 'output' folder and run only the specific analysis.
Describe alternatives you've considered
I've run straglr independently. It takes less than 30 seconds for a human genome at 30x coverage, plus another 15 minutes for phasing with longphase and annotation with stranger. This works, but it is less streamlined and doesn't produce the nice reports that epi2me does. Only some of the samples need to be repeated, so keeping the analysis uniform is important. Another alternative could be to make the reporting tools available on the command line.
Additional context
I run epi2me via the nextflow command line on a PromethION 24 machine. Since SNV analysis is a precursor to the other types of analysis, maybe there could be an option to declare that SNV analysis has already been run and to supply the result files, so that the pipeline can continue with the additional analyses. For example, a routine could look like this: first run only --snp, check the results, and then run again with --sv. (Phasing can be run at the end to connect everything up.)
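A staged routine like this can be approximated today with -resume, as suggested in the reply above. The sketch below only composes and prints the two command lines and does not run anything. The workflow name is an assumption (it is not named in this thread), build_command is a hypothetical helper, and the required input options (reads, reference, output directory) are deliberately left out and would need to be added, identical in both runs.

```python
import shlex

# Assumption: the workflow name is not stated in this thread.
WORKFLOW = "epi2me-labs/wf-human-variation"


def build_command(analyses, resume=True):
    """Compose a nextflow command line for the given analysis flags.

    Input options are omitted here; they must be the same in every
    stage so that cached steps can be reused.
    """
    argv = ["nextflow", "run", WORKFLOW]
    argv += [f"--{name}" for name in analyses]
    if resume:
        argv.append("-resume")  # reuse results kept in the work directory
    return argv


# Stage 1: SNP calling only, then inspect the results.
stage1 = build_command(["snp"], resume=False)

# Stage 2: same options plus SV calling, resuming from the kept work directory.
stage2 = build_command(["snp", "sv"])

print(shlex.join(stage1))
print(shlex.join(stage2))
```

Keeping --snp in the second command, with unchanged parameters, is what should let the workflow recognise the SNP steps as already done instead of repeating them.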