claraqin / neonMicrobe

Processing NEON soil microbe marker gene sequence data into ASV tables.
GNU Lesser General Public License v3.0

Mixed orientation of reads in R1 and R2 files #5

Closed claraqin closed 4 years ago

claraqin commented 4 years ago

Hi everyone,

It seems that some of the reads in the R1 files are actually reverse reads, and some of the reads in the R2 files are actually forward reads. The table below counts the reads in which each primer was found, by orientation, in the R1 and R2 files for runB69PP:

                   Forward Complement Reverse RevComp
FWDPrimer.R1.reads     423          0       0       3
FWDPrimer.R2.reads      26          0       0      13
REVPrimer.R1.reads     647          0       0     298
REVPrimer.R2.reads     737          0       0       4

This is an issue common to Illumina sequencing known as mixed orientation. DADA2 does not have a built-in way to handle this, but see https://github.com/benjjneb/dada2/issues/938
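For anyone who wants to see how a table like the one above can be tallied, here is a minimal Python sketch. It is an illustration only, not the code that produced those numbers: the primer and reads are made-up toy data, the function names are hypothetical, and it assumes exact matches with no IUPAC ambiguity codes (real primers often contain them).

```python
# Complement table for plain DNA bases; IUPAC ambiguity codes are NOT handled here.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def orientations(primer):
    """Return the primer in the four orientations used as table columns."""
    comp = primer.translate(COMPLEMENT)
    return {
        "Forward": primer,
        "Complement": comp,
        "Reverse": primer[::-1],
        "RevComp": comp[::-1],
    }


def count_primer_hits(primer, reads):
    """Count how many reads contain the primer in each orientation."""
    return {
        name: sum(seq in read for read in reads)
        for name, seq in orientations(primer).items()
    }


# Hypothetical toy data, not taken from runB69PP.
fwd_primer = "ACGGT"
r1_reads = ["ACGGTTTTT", "GGGACCGTA", "TTTTTTTTT"]
print(count_primer_hits(fwd_primer, r1_reads))
```

Running the same count for each primer against the R1 and the R2 reads would give the four rows of the table.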

claraqin commented 4 years ago

Potentially useful discussion in a DADA2 GitHub Issue thread: https://github.com/benjjneb/dada2/issues/671

> There isn't a full-fledged re-orientation solution in the dada2 package at this point. So if you can do that before (or after) demultiplexing, but before you start the dada2 pipeline, it would probably be best.
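A rough sketch of what re-orienting before the dada2 pipeline could look like, in Python. This is not part of dada2 or neonMicrobe. The function name `reorient_pair` is hypothetical, and the sketch assumes that primers sit at the very start of each read and match exactly, which real data will not always satisfy. Pairs without a recognizable primer are reported as "unknown" rather than guessed.

```python
def reorient_pair(r1, r2, fwd_primer, rev_primer):
    """Return (forward_read, reverse_read, status) for one read pair.

    Assumes (hypothetically) that primers are exact prefixes of the reads.
    """
    if r1.startswith(fwd_primer) or r2.startswith(rev_primer):
        return r1, r2, "kept"       # pair is already in the expected orientation
    if r2.startswith(fwd_primer) or r1.startswith(rev_primer):
        return r2, r1, "swapped"    # forward read was found in the R2 position
    return r1, r2, "unknown"        # no primer found, orientation cannot be inferred


# Hypothetical toy primers and read pairs.
FWD, REV = "ACGGT", "TTGCA"
pairs = [
    ("ACGGTAAAA", "TTGCACCCC"),  # correct orientation
    ("TTGCACCCC", "ACGGTAAAA"),  # mixed orientation
    ("GGGGGGGGG", "CCCCCCCCC"),  # no primers present
]
for r1, r2 in pairs:
    print(reorient_pair(r1, r2, FWD, REV)[2])
```

With real FASTQ files the quality strings and read IDs would need to be swapped together with the sequences so that the two output files stay in sync.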

claraqin commented 4 years ago

Since not all reads contain primers or anything else that could hint at their orientation, we might only be able to address this at the taxonomy assignment stage, by setting the tryRC argument: assignTaxonomy(..., tryRC=TRUE). Closing this issue for now.
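To illustrate the idea behind tryRC: as I understand the dada2 documentation, the reverse complement of a sequence is used for classification when it matches the reference better than the sequence as given. The Python toy below mimics that decision with a simple shared k-mer count. It is not dada2's classifier, and `best_orientation`, the reference, and the queries are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a plain DNA string (no IUPAC codes)."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k=4):
    """Set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_orientation(query, reference, k=4):
    """Pick the orientation of query that shares more k-mers with reference."""
    ref_kmers = kmers(reference, k)
    fwd_score = len(kmers(query, k) & ref_kmers)
    rc_score = len(kmers(revcomp(query), k) & ref_kmers)
    if rc_score > fwd_score:
        return revcomp(query), "revcomp"
    return query, "forward"


# Hypothetical toy reference and queries.
reference = "GATTACAGGCTTAC"
print(best_orientation("TTACAGGC", reference))  # already matches the reference
print(best_orientation("GCCTGTAA", reference))  # matches only after reverse complementing
```

The point of the toy is that the orientation is decided per sequence from the reference match itself, so no primer needs to be present in the read.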