benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0
459 stars 142 forks

% of chimera detected seems too high #1828

Closed: AnaMariaCabello closed this issue 3 months ago

AnaMariaCabello commented 11 months ago

Dear Ben,

I have been processing metabarcoding data from the V4 region of the 18S rDNA (96 environmental seawater samples from a time series). I found that chimeras represented 70% of total ASVs (n = 33400) and 5.29% of total reads. Although the amount of lost reads is not dramatically high, I don't think I have ever seen such a high amount of spurious ASVs. Do you think these numbers are problematic and, if so, is there a problem with the raw data? It may be related to an issue with the sequencing machine: the sequencing company mentioned that they had an issue with one of the indexing plates. In fact, one of the samples in this dataset initially got very few reads and they offered to sequence it again. I'm attaching the error plots in case they help.

errors_07_rev.pdf errors_07_fwd.pdf

Thank you in advance!

benjjneb commented 11 months ago

5% of reads being chimeric is well within the expected range.

Chimeras often appear in great diversity but with low abundances, hence they can account for a high fraction of total ASVs in some datasets. The exact amount depends on an interaction between how diverse the sampled communities are and how many chimeras are created. I have seen 70% of total ASVs being chimeras in the past. Given the reasonable fraction of reads in the chimeric ASVs, I don't see a red flag here.
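To make the arithmetic behind this concrete, here is a minimal Python sketch of how the ASV fraction and the read fraction can diverge. The abundance values and the `chimera_fractions` helper are hypothetical, made up for illustration. They are not taken from this dataset and are not part of dada2, where the same quantities are normally read off the sequence tables before and after `removeBimeraDenovo`.

```python
def chimera_fractions(abundances, is_chimera):
    """Return (fraction of ASVs flagged chimeric, fraction of reads in those ASVs).

    abundances: total read count per ASV (hypothetical toy input).
    is_chimera: one boolean per ASV, True if the ASV was flagged as chimeric.
    """
    if len(abundances) != len(is_chimera):
        raise ValueError("abundances and is_chimera must have the same length")
    total_reads = sum(abundances)
    if not abundances or total_reads == 0:
        raise ValueError("need at least one ASV with a non-zero read count")

    n_chimeric_asvs = sum(1 for flag in is_chimera if flag)
    chimeric_reads = sum(a for a, flag in zip(abundances, is_chimera) if flag)
    return n_chimeric_asvs / len(abundances), chimeric_reads / total_reads


# Hypothetical toy community: 3 abundant real ASVs and 7 rare chimeric ASVs.
abundances = [500, 300, 150] + [7] * 7
is_chimera = [False] * 3 + [True] * 7

asv_frac, read_frac = chimera_fractions(abundances, is_chimera)
print(f"{asv_frac:.0%} of ASVs, {read_frac:.1%} of reads")
# prints "70% of ASVs, 4.9% of reads"
```

In this toy case the seven chimeric ASVs make up most of the ASV list but carry only 49 of 999 reads, which is the same pattern as the 70% / 5% split reported above.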

AnaMariaCabello commented 11 months ago

Thank you so much Ben!
