AndersenLab / alignment-nf

A nextflow pipeline for genome sequences alignment
MIT License

processes in module triples submission #14

Closed danrlu closed 4 years ago

danrlu commented 4 years ago

Regarding the use of modules: in trim-fq-nf, each file runs its own md5sum process in parallel. So for 274 libraries there were 274 pre-md5 processes, 274 post-md5 processes, and 274 trimming processes, which triples the number of job submissions. (Note: no need to change the trimming pipeline; I already ran all the files.)

In alignment-nf, we have more than 1,500 strain-level libraries and more than 10 QC processes for each of them. That comes to over 15,000 job submissions to SLURM for the QC processes alone. It may be more efficient to run the QC steps inside the alignment/merge process to avoid the extra job submissions.
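
The counts above can be checked with a quick back-of-the-envelope sketch. It assumes each process is submitted once per library; the helper name `submissions` is hypothetical, and the alignment-nf figures are the lower bounds quoted above (1,500 libraries, 10 QC processes), not measured values.

```python
def submissions(n_libraries: int, processes_per_library: int) -> int:
    """Scheduler jobs when every process is submitted once per library."""
    return n_libraries * processes_per_library


# trim-fq-nf: pre-md5, trimming, and post-md5 for each of the 274 libraries
trim_jobs = submissions(274, 3)

# alignment-nf, lower bound: 1,500 libraries with 10 separate QC processes each
qc_jobs_separate = submissions(1500, 10)

# Assumed alternative: QC runs inside the alignment/merge job,
# so it adds no extra submissions of its own
qc_jobs_folded = submissions(1500, 0)

print(trim_jobs, qc_jobs_separate, qc_jobs_folded)  # 822 15000 0
```

This only counts submissions; it says nothing about runtime, since folding QC into the alignment/merge job makes each of those jobs longer.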

danielecook commented 4 years ago

This is related to trim-fq-nf, not alignment-nf.

danielecook commented 4 years ago

The md5sum process generates a digest for both files:

https://github.com/AndersenLab/trim-fq-nf/blob/master/md5.module.nf

So the fact that there are 274 is expected.