Closed biopaw closed 1 year ago
+1 Completely agree, this would be really valuable. The inability to perform raw data QC as you normally would is something that holds my group back from using the nf-core/rnaseq pipeline when working at scale.
I love this idea. I asked a question about this in the nf-core Slack, and it was pointed out that the taxprofiler pipeline can do QC only, without any alignment. It would be really nice to have this functionality in the rnaseq pipeline as well.
Description of feature
Use Case:
It appears that, at the moment, the workflow must perform some form of alignment: if `--skip_alignment` is true, then `--pseudo_aligner` must be set to `salmon`.

The most common use case for RNA-seq count generation, especially for larger sets of samples, is to (1) perform a QC-only run first, (2) evaluate the quality control reports, (3) adjust the sample manifest and/or pass additional trimming parameters, and (4) run a final count generation run (with QC, for the final QC reports). Running alignment before quality control assessment can waste a lot of time and resources, because two complete end-to-end runs of the workflow would then need to be performed.
Enhancement:
A very simple enhancement would be to add a flag for skipping pseudoalignment, so that passing `--skip_alignment true` and `--skip_pseudoalignment true` together ensures that only the quality control steps that have been specified are run. If I were a little further along with Nextflow development, I would offer to help with this now; for the moment, it is preferable for someone more experienced to do this.
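To make the requested behaviour concrete, here is a minimal sketch of the flag logic in Python. It is not the pipeline's actual code (nf-core/rnaseq is written in Nextflow), and the `Params` class and `planned_stages` function are hypothetical names made up for illustration. `skip_pseudoalignment` is the flag proposed in this issue, and the "current" behaviour modelled here is only what is described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Params:
    """Hypothetical stand-in for the relevant pipeline parameters."""
    skip_alignment: bool = False
    pseudo_aligner: Optional[str] = None
    skip_pseudoalignment: bool = False  # proposed flag, does not exist yet


def planned_stages(p: Params) -> List[str]:
    """Return the stages that would run for a given parameter set."""
    stages = ["qc"]  # QC steps run in every case in this sketch

    if not p.skip_alignment:
        stages.append("alignment")

    if p.skip_pseudoalignment:
        # Proposed behaviour: pseudoalignment is skipped outright,
        # so a QC-only run becomes possible.
        return stages

    if p.pseudo_aligner:
        stages.append("pseudoalignment")
    elif p.skip_alignment:
        # Behaviour as described in this issue: skipping alignment
        # requires a pseudo-aligner to be set.
        raise ValueError("--pseudo_aligner must be set when --skip_alignment is true")

    return stages


# QC-only run, as requested in this issue
print(planned_stages(Params(skip_alignment=True, skip_pseudoalignment=True)))
```

With both skip flags set, only the QC stage remains, which is the first run in the two-run use case described above.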
ETA
Can someone add this wee feature relatively soon? It is needed for several large datasets I need to process.