nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License

Proposal: Seqkit stats when demultiplexing #611

Open a4000 opened 1 year ago

a4000 commented 1 year ago

Description of feature

I know Ampliseq already uses fastqc, but I like using Seqkit stats when demultiplexing to track the number of reads that did and didn't get assigned to samples. There is another issue proposing a demultiplex step that I hope to resolve, so I figured I'd propose Seqkit stats for when the user chooses to demultiplex.

d4straub commented 1 year ago

In that case I propose a demultiplexing subworkflow that contains cutadapt & seqkit to bundle the tools. I agree that it would be good to track stats for demultiplexing, probably even with a warning when a large number of reads wasn't assigned to any sample.
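As a rough illustration of the stats tracking and warning idea above, here is a minimal Python sketch. It is not part of the pipeline. It assumes the table comes from `seqkit stats -T` (tab-delimited, with `file` and `num_seqs` columns) and that unassigned reads end up in a file named `unknown.fastq.gz`. The file names, the function names, and the 20% threshold are all hypothetical choices for the example.

```python
import csv
import io
import warnings

# Hypothetical example table in the tab-delimited layout of `seqkit stats -T`.
EXAMPLE_TSV = (
    "file\tformat\ttype\tnum_seqs\tsum_len\tmin_len\tavg_len\tmax_len\n"
    "sampleA.fastq.gz\tFASTQ\tDNA\t600\t150000\t250\t250.0\t250\n"
    "sampleB.fastq.gz\tFASTQ\tDNA\t300\t75000\t250\t250.0\t250\n"
    "unknown.fastq.gz\tFASTQ\tDNA\t100\t25000\t250\t250.0\t250\n"
)


def unassigned_fraction(stats_tsv, unassigned_name="unknown.fastq.gz"):
    """Return the fraction of reads found in the unassigned file.

    `unassigned_name` is an assumption about how the demultiplexing
    step names its output for reads that matched no barcode.
    """
    total = 0
    unassigned = 0
    for row in csv.DictReader(io.StringIO(stats_tsv), delimiter="\t"):
        # Strip thousands separators in case the table was pretty-printed.
        n = int(row["num_seqs"].replace(",", ""))
        total += n
        if row["file"].endswith(unassigned_name):
            unassigned += n
    if total == 0:
        raise ValueError("no reads found in the stats table")
    return unassigned / total


def check_demultiplexing(stats_tsv, max_unassigned=0.2):
    """Warn when the unassigned fraction exceeds `max_unassigned`.

    The 0.2 default is an arbitrary illustrative threshold.
    """
    frac = unassigned_fraction(stats_tsv)
    if frac > max_unassigned:
        warnings.warn(
            f"{frac:.1%} of reads were not assigned to any sample "
            f"(threshold {max_unassigned:.1%})"
        )
    return frac
```

Calling `check_demultiplexing(EXAMPLE_TSV)` on the example table stays silent, while a stricter threshold such as `max_unassigned=0.05` would trigger the warning.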

a4000 commented 1 year ago

I agree on the subworkflow. There are other modules I might need to add to the subworkflow to get around a problem I've identified with demultiplexing. I'll open another issue to go into more detail about that problem and my proposed solution.