bokulich-lab / nf-ducken

Workflow to process amplicon meta-analysis data, from NCBI accession IDs to taxonomic diversity metrics.

Update Cutadapt to take multiple primers #97

Closed lina-kim closed 7 months ago

lina-kim commented 7 months ago

Closes #83.

Previously, when run with $n$ samples and $m$ primers, this workflow launched $n \times m$ Cutadapt processes, one for every possible sample-primer pair. Now, the workflow passes all primers as a single, space-delimited input to one Cutadapt process per artifact, so the number of processes depends only on how many artifacts the samples are split into. As it turns out, native Cutadapt allows for this and selects the best trimmed output; this is unclear in the q2-cutadapt docs.
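To make the change in process count concrete, here is a minimal Python sketch. It is not code from this workflow: the function names are hypothetical, and the primer sequences in the usage note are illustrative placeholders only.

```python
from typing import List


def pairwise_process_count(n_samples: int, n_primers: int) -> int:
    """Old behaviour: one Cutadapt process per sample-primer pair."""
    return n_samples * n_primers


def per_artifact_process_count(n_artifacts: int) -> int:
    """New behaviour: one Cutadapt process per artifact, regardless of primer count."""
    return n_artifacts


def join_primers(primers: List[str]) -> str:
    """Build the single space-delimited primer string, skipping blank entries."""
    return " ".join(p.strip() for p in primers if p.strip())
```

For example, with 100 samples and 4 primers, `pairwise_process_count(100, 4)` gives 400 processes, whereas running the samples as one artifact gives `per_artifact_process_count(1)`, a single process. `join_primers(["GTGCCAGCMGCCGCGGTAA", "GGACTACHVGGGTWTCTAAT"])` yields the two sequences separated by one space.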

SPLIT_FASTQ_MANIFEST is brought back so that users can run a sample set either as a single FASTQ artifact or as multiple artifacts, whichever gives the better processing speed.
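The trade-off between one artifact and several can be sketched in Python as follows. This only illustrates the idea of chunking a sample list; it is not the SPLIT_FASTQ_MANIFEST implementation, and the function name and the `samples_per_artifact` parameter are assumptions made for the example.

```python
from typing import List


def split_manifest(sample_ids: List[str], samples_per_artifact: int) -> List[List[str]]:
    """Chunk sample IDs into groups; each group would become one FASTQ artifact."""
    if samples_per_artifact < 1:
        raise ValueError("samples_per_artifact must be at least 1")
    return [
        sample_ids[i:i + samples_per_artifact]
        for i in range(0, len(sample_ids), samples_per_artifact)
    ]
```

With five samples and `samples_per_artifact=2`, this gives three groups, so three Cutadapt processes could run in parallel; setting `samples_per_artifact` to the total number of samples gives a single artifact and a single process.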