Open kifeonu opened 4 years ago
I think this issue can be closed now! To process 454 data, add the flags
--dadaOpt.HOMOPOLYMER_GAP_PENALTY -1 --dadaOpt.BAND_SIZE 32
to the Nextflow command line. The pipeline will then run dada as
dada(..., HOMOPOLYMER_GAP_PENALTY=-1, BAND_SIZE=32)
which is what the tutorial recommends.
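To make the flag-to-argument mapping concrete, here is a minimal Python sketch of how `--dadaOpt.NAME VALUE` pairs line up with the named arguments of the `dada` call quoted above. It is only an illustration: the helper names `parse_dada_opts` and `render_dada_call` are hypothetical, and this is not how the pipeline itself does the forwarding.

```python
def parse_dada_opts(argv):
    """Collect `--dadaOpt.NAME VALUE` pairs from a list of flags into a dict.

    Hypothetical helper for illustration; flags without the `--dadaOpt.`
    prefix are skipped.
    """
    prefix = "--dadaOpt."
    opts = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith(prefix) and i + 1 < len(argv):
            # Strip the prefix to get the dada argument name.
            opts[arg[len(prefix):]] = argv[i + 1]
            i += 2
        else:
            i += 1
    return opts


def render_dada_call(opts):
    """Render the collected options as the text of the equivalent R call."""
    args = ", ".join(f"{name}={value}" for name, value in opts.items())
    return f"dada(..., {args})" if args else "dada(...)"


flags = ["--dadaOpt.HOMOPOLYMER_GAP_PENALTY", "-1", "--dadaOpt.BAND_SIZE", "32"]
call = render_dada_call(parse_dada_opts(flags))
print(call)
```

Each flag name after the `dadaOpt.` prefix becomes an argument name, and the value that follows it becomes the argument value.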
@wbazant any ideas on test data sets for this one? We could add it to CI testing (which will be critical to have in place for DSL2 work)
Right, since TADA doesn't do single-end now, the added dadaOpt.XXX feature only adds support for hypothetical paired-end 454 data, which is not even a thing in 454 technology!
For single-end 454, SRS607719 is a stool sample containing mostly E. coli; we have it under https://microbiomedb.org/mbio/app/record/sample/MBSMPL0020-7-1 .
It weighs about 1MB, and it's available from ftp.sra.ebi.ac.uk/vol1/fastq/SRR128/009/SRR1288519/SRR1288519.fastq.gz
@wbazant I added some prelim single-end read support, including via a sample sheet. It also supports PacBio (which we can set using the --platform
parameter). So this should feasibly support 454 out of the box, though we may want to add some presets for 454 and PacBio at some point.
Make provision to use this pipeline to process 454 data