nhoffman / dada2-nf

A Nextflow pipeline for processing 16S rRNA sequences using dada2

Support for ITS using vsearch alignments #54

Closed crosenth closed 2 years ago

crosenth commented 2 years ago

Includes test files

nhoffman commented 2 years ago

@crosenth - this all looks good to me based on the diffs - let's get input from @dhoogest

dhoogest commented 2 years ago

Looks good to me too on a read-through. I do wonder if we'll need to adapt data/dada_params_250.json to suit the ITS amplicons. See also data/dada_params_ngs16s.json.

Note @mwohl - we'll want to open an issue in the NGS16S project to accommodate the renamed target_f/r columns in counts.csv

dhoogest commented 2 years ago

@crosenth I split params-single.json into params-single-vsearch.json and params-single-cmsearch.json, specifying either the model or the library in each. The CI script has been updated as well to run both of these - fingers crossed it passes. I think the only other addition would be a check that compares the counts.csv output in output-cmsearch and output-vsearch (or similar). I did this locally to confirm equivalence, but we could in theory add a diff or similar to the CI as a formal test. /cc @nhoffman

crosenth commented 2 years ago

@dhoogest - This will only pass if the files are exactly the same: https://github.com/nhoffman/dada2-nf/pull/54/commits/33851a1118fa7d282e3576e236e1d793269c3af7
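For illustration, a minimal sketch of what an exact-match check like this amounts to, written in Python. This is not the code from the linked commit: the function name `counts_match` and the sample rows are hypothetical, and the directory names simply follow the output-cmsearch and output-vsearch names mentioned above. It shows why such a check is strict: the same rows in a different order still count as a mismatch.

```python
import tempfile
from pathlib import Path


def counts_match(dir_a, dir_b, name="counts.csv"):
    """Return True only if both directories hold a byte-identical `name`.

    Hypothetical helper: mirrors what a plain `diff` exit status reports.
    """
    bytes_a = (Path(dir_a) / name).read_bytes()
    bytes_b = (Path(dir_b) / name).read_bytes()
    return bytes_a == bytes_b


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cmsearch = Path(tmp) / "output-cmsearch"
        vsearch = Path(tmp) / "output-vsearch"
        cmsearch.mkdir()
        vsearch.mkdir()

        # Made-up rows; the real counts.csv layout is not assumed here.
        rows = "id,count\nsample1,10\nsample2,7\n"
        (cmsearch / "counts.csv").write_text(rows)
        (vsearch / "counts.csv").write_text(rows)
        print("identical files:", counts_match(cmsearch, vsearch))

        # Same rows, different order: an exact comparison treats this as a failure.
        reordered = "id,count\nsample2,7\nsample1,10\n"
        (vsearch / "counts.csv").write_text(reordered)
        print("reordered rows:", counts_match(cmsearch, vsearch))
```

If row order or formatting ever differs between the two alignment paths, a check of this kind would fail even when the counts agree, so the files would need to be sorted or normalized before comparing.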

dhoogest commented 2 years ago

@nhoffman I'm gonna go ahead and merge this. Think it's got everything we discussed!