Closed. Ateuchus closed this 3 years ago.
> the authors recommend not using maxEE filtering for 16S V4 data as it may introduce a bias (under-estimating the relative abundance of ASVs due to higher base call error rates than would be expected for upstream base patterns such as 'GGC' triplets or inverted repeats more than 8 bases long)
Let me start off by saying I disagree with this. The potential to introduce bias via these factors in normal situations is low, much lower than the bias that already exists, and maxEE filtering is preferable to other filtering styles such as minQ.
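For readers less familiar with the two filtering styles, here is a minimal sketch of the difference. It assumes Phred+33 quality strings and the nominal expected-error definition, EE = sum(10^(-Q/10)). The function names, the thresholds `max_ee=2.0` and `min_q=20`, and the example read are all hypothetical illustrations, not taken from any pipeline's source code.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def expected_errors(scores):
    """Sum the per-base error probabilities: EE = sum(10^(-Q/10))."""
    return sum(10 ** (-q / 10) for q in scores)


def passes_max_ee(scores, max_ee=2.0):
    """maxEE-style filter: keep the read if its expected errors do not exceed max_ee."""
    return expected_errors(scores) <= max_ee


def passes_min_q(scores, min_q=20):
    """minQ-style filter: keep the read only if every base has quality >= min_q."""
    return min(scores) >= min_q


# Hypothetical read: 99 bases at Q38 ('G') and a single base at Q10 ('+').
quality = "G" * 99 + "+"
scores = phred_scores(quality)

print("expected errors:", round(expected_errors(scores), 3))
print("passes maxEE=2:", passes_max_ee(scores))
print("passes minQ=20:", passes_min_q(scores))
```

In this example the read carries roughly 0.12 expected errors in total, so it passes the maxEE filter, but its single Q10 base causes the minQ filter to discard it. That is the sense in which maxEE judges the read as a whole while minQ rejects on the single worst base.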
> however, I am not sure whether doing the same thing with the 18S would also be a good idea? I am assuming that the same thing could happen here too, but haven't found anything testing this and am admittedly not familiar enough with the 18S V9 region.
Sure, the same thing could happen in 18S; I don't see why not.
I see, I'll use maxEE filtering then. Thank you for your reply!
Hello,
I have just read the paper 'Comparing bioinformatic pipelines for microbial 16S rRNA amplicon sequencing' (Prodan et al., 2020). The authors recommend not using maxEE filtering for 16S V4 data as it may introduce a bias (under-estimating the relative abundance of ASVs due to higher base call error rates than would be expected for upstream base patterns such as 'GGC' triplets or inverted repeats more than 8 bases long). This makes sense to me, and based on this paper I'm thinking it's best to follow suit for my 16S V4 data (as using maxEE=2 didn't improve the specificity of their output). However, I am not sure whether doing the same thing with the 18S data would also be a good idea. I am assuming that the same thing could happen here too, but I haven't found anything testing this and am admittedly not familiar enough with the 18S V9 region.
Any advice would be greatly appreciated, thank you!
Link to paper: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0227434