benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

MaxEE filtering bias? #1125

Closed: Ateuchus closed this issue 3 years ago

Ateuchus commented 4 years ago

Hello,

I have just read the paper 'Comparing bioinformatic pipelines for microbial 16S rRNA amplicon sequencing' (Prodan et al., 2020). The authors recommend not using maxEE filtering for 16S V4 data, as it may introduce a bias (under-estimating the relative abundance of ASVs due to higher base call error rates than would be expected after upstream base patterns such as 'GGC' triplets or inverted repeats more than 8 bases long). This makes sense to me, and based on this paper I'm thinking it's best to follow suit for my 16S V4 data (as using maxEE=2 didn't improve the specificity of their output). However, I am not sure whether doing the same thing with the 18S would also be a good idea. I am assuming that the same thing could happen here too, but I haven't found anything testing this and am admittedly not familiar enough with the 18S V9 region.

Any advice would be greatly appreciated, thank you!

link to paper: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0227434

benjjneb commented 4 years ago

> the authors recommend not using maxEE filtering for 16S V4 data as it may introduce a bias (under-estimating the relative abundance of ASVs due to higher base call error rates than would be expected after upstream base patterns such as 'GGC' triplets or inverted repeats more than 8 bases long)

Let me start off by saying I disagree with this. The potential to introduce bias via these factors in normal situations is low, much lower than the bias that already exists, and maxEE filtering remains preferable to other filtering styles like minQ.
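For readers unfamiliar with the two filtering styles being compared, here is a minimal Python sketch (not dada2's own code, which is R). It assumes the usual definition of expected errors, EE = sum(10^(-Q/10)) over the Phred quality scores of a read. The function names, the example reads, and the minQ threshold of 10 are hypothetical and chosen only for illustration; maxEE=2 is the value mentioned in this thread.

```python
def expected_errors(quals):
    """Expected number of errors in a read: EE = sum(10^(-Q/10))."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_max_ee(quals, max_ee=2.0):
    """maxEE-style filter: keep the read if its expected errors do not exceed max_ee."""
    return expected_errors(quals) <= max_ee


def passes_min_q(quals, min_q=10):
    """minQ-style filter: keep the read only if no base falls below min_q (hypothetical threshold)."""
    return min(quals) >= min_q


# Hypothetical read A: high quality overall, with a single very poor base call.
read_a = [38] * 249 + [2]
# Hypothetical read B: uniformly mediocre quality, no single base below the minQ threshold.
read_b = [15] * 250

for name, read in [("A", read_a), ("B", read_b)]:
    print(name, round(expected_errors(read), 2), passes_max_ee(read), passes_min_q(read))
```

In this sketch, read A is kept by the maxEE filter but discarded by the minQ filter, while read B is kept by minQ even though it is expected to contain several errors. That is one way to see how the two criteria differ; it is not a test of the bias discussed in the paper.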

> However, I am not sure whether doing the same thing with the 18S would also be a good idea. I am assuming that the same thing could happen here too, but I haven't found anything testing this and am admittedly not familiar enough with the 18S V9 region.

Sure, the same thing could happen in 18S, don't see why not.

Ateuchus commented 4 years ago

I see, I'll use the maxEE filtering then. Thank you for your reply!