Removing chimeras from Sequel full-length 16S rRNA data using DADA2

yuharasatoshi commented 3 years ago

We are now analyzing our data following the workflow from your paper: https://benjjneb.github.io/LRASManuscript/LRASms_Zymo.html

However, the majority of the data was lost after removing chimeras.

        ccs primers filtered denoised
[1,] 222830  197310   189307   185756

bim <- isBimeraDenovo(dd, minFoldParentOverAbundance=3.5)
table(bim)

bim
FALSE  TRUE
   34   162

Any advice or comments would be appreciated.

benjjneb commented 3 years ago

What fraction of the reads were lost?

sum(dd$denoised[bim])/sum(dd$denoised)

yuharasatoshi commented 3 years ago

Thank you for your reply.

sum(dd$denoised[bim])/sum(dd$denoised)
0.03910054

benjjneb commented 3 years ago

4% of reads being identified as chimeras is totally normal. Not a cause for concern.

yuharasatoshi commented 3 years ago

Thank you for your comment.

According to the analysis log, 196 variants were obtained after denoising.

189,307 reads in 38,141 unique seqs.
196 seq variants were inferred from 38,141 input unique seqs.

165 out of 196 variants were removed after removeBimeraDenovo. Is that also normal?

benjjneb commented 3 years ago

Many ASVs but few reads being chimeric is totally normal, especially in low diversity samples like a mock community.

Lots of very low abundance chimeras can be produced by PCR. That's what you are seeing.
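To make the distinction concrete, here is a minimal base-R sketch with made-up numbers (not the data from this thread). The `denoised` vector and the `bim` flags are hypothetical stand-ins for `dd$denoised` and the output of `isBimeraDenovo()`; the only point is that the fraction of variants flagged and the fraction of reads flagged can differ greatly when the flagged variants are rare.

```r
# Toy abundances (hypothetical, for illustration only):
# 8 abundant true variants plus 40 rare chimeric variants.
denoised <- c(rep(5000L, 8), rep(25L, 40))
names(denoised) <- paste0("ASV", seq_along(denoised))

# Hypothetical chimera flags standing in for isBimeraDenovo() output:
# here every low-abundance variant is simply marked as chimeric.
bim <- denoised < 100L

# Fraction of variants flagged vs. fraction of reads flagged.
frac_asvs  <- mean(bim)
frac_reads <- sum(denoised[bim]) / sum(denoised)

cat(sprintf("ASVs flagged:  %d of %d (%.1f%%)\n",
            sum(bim), length(bim), 100 * frac_asvs))
cat(sprintf("Reads flagged: %d of %d (%.1f%%)\n",
            sum(denoised[bim]), sum(denoised), 100 * frac_reads))
```

In this toy case about 83% of the variants are flagged, but they account for only about 2.4% of the reads, which is the same pattern as a mock community with many rare PCR chimeras.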
