On the vsearch forum, user Emily Van Syoc asked how to interpret the summary message produced by the --uchime_denovo command.
Here is a toy example with three unique sequences (or cluster representatives), representing 100 reads in total: parentA with 50 reads, parentB with 49 reads, and chimeraAB with 1 read. chimeraAB is a chimera of parentA and parentB:
Found 1 (33.3%) chimeras, 2 (66.7%) non-chimeras,
and 0 (0.0%) borderline sequences in 3 unique sequences.
Taking abundance information into account, this corresponds to
1 (1.0%) chimeras, 99 (99.0%) non-chimeras,
and 0 (0.0%) borderline sequences in 100 total sequences.
As expected, one of the three sequences is marked as a chimera (33.3%). When taking into account the number of reads each sequence represents, the percentage of the dataset marked as chimeric is only 1.0% (1 read out of 100). Discarding chimeras preserves 99.0% of the initial dataset (99 reads, represented by parentA and parentB).
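The arithmetic behind the two sets of percentages can be sketched in a few lines of Python. This is only an illustration of the counting, not how vsearch computes its summary internally; the `clusters` list and the `chimera_summary` function are hypothetical names introduced for this example, and the chimera flags are entered by hand to match the toy dataset.

```python
# Hypothetical toy data mirroring the example: (name, abundance, is_chimera).
# The chimera flags are set by hand here; they are not computed.
clusters = [
    ("parentA", 50, False),
    ("parentB", 49, False),
    ("chimeraAB", 1, True),
]


def chimera_summary(clusters):
    """Return chimera counts and percentages, per unique sequence and per read."""
    n_unique = len(clusters)
    n_reads = sum(abundance for _, abundance, _ in clusters)
    chimeric_unique = sum(1 for _, _, is_chimera in clusters if is_chimera)
    chimeric_reads = sum(
        abundance for _, abundance, is_chimera in clusters if is_chimera
    )
    return {
        "unique_total": n_unique,
        "unique_chimeras": chimeric_unique,
        "unique_chimera_pct": 100.0 * chimeric_unique / n_unique,
        "reads_total": n_reads,
        "reads_chimeras": chimeric_reads,
        "reads_chimera_pct": 100.0 * chimeric_reads / n_reads,
        "reads_kept": n_reads - chimeric_reads,
    }


summary = chimera_summary(clusters)
print(f"{summary['unique_chimeras']} ({summary['unique_chimera_pct']:.1f}%) "
      f"chimeras in {summary['unique_total']} unique sequences")
print(f"{summary['reads_chimeras']} ({summary['reads_chimera_pct']:.1f}%) "
      f"chimeras in {summary['reads_total']} total sequences")
```

The first percentage gives every unique sequence the same weight, while the second weights each sequence by its abundance, which is why one chimera among three unique sequences amounts to only 1 read in 100.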