lmrodriguezr / nonpareil

Estimate metagenomic coverage and sequence diversity
http://enve-omics.ce.gatech.edu/nonpareil/

Throat metagenomes #50

Closed: andrewjmc closed this issue 2 years ago

andrewjmc commented 2 years ago

Hello,

I have 150 bp paired-end reads from throat metagenomes. Human reads have been removed and the remaining reads have been quality trimmed. Depth is highly variable because the proportion of human reads varies.

I ran Nonpareil with overlap 100 and default -S (0.95). Approximately half of the samples give the following warning and then fatal error:

 [     35.6]  Evaluating consistency
 [     35.6]  WARNING: The curve reached near-saturation, hence coverage estimations could be unreliable
 [     35.6]  The overlap (-L) is currently set to the maximum, meaning that the actual coverage is probably above 100X
 [     35.6]  You could increase -S but values other than 0.95 are untested
Fatal error:
Sequencing depth above detection limit.

I don't fully understand what it means for depth to be "above" the detection limit, or the sense (or risk) of increasing -S. I am also confused that average coverage is reported as a percentage in the Nonpareil curves, but as a fold value in this error.

I am also unsure whether I should treat this as just a warning or, since it is reported as a fatal error, disregard the output entirely.

Thanks,

Andrew

lmrodriguezr commented 2 years ago

Hello @andrewjmc, this is a confusing message that we should fix in future releases.

It simply means that your coverage is so high (above 95%) that the actual value can be inaccurate. For example, the estimated coverage could be 98% when the real coverage is 99.9% (or the other way around). For values very near saturation, that might be an important caveat, but in most cases what we need to know is (1) whether the coverage is above a certain threshold (e.g., >60% or >80%) and (2) what the sequence diversity index of the community is. For both questions the results are reliable, and you can simply ignore this warning.
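As an illustration only, the threshold reasoning above can be sketched in a few lines of Python. This is not part of Nonpareil: the function name `interpret_coverage`, the constant `NEAR_SATURATION`, and the sample values are all hypothetical, and the coverage estimates are assumed to have already been extracted from the Nonpareil output as fractions between 0 and 1.

```python
# Hypothetical helper, not part of Nonpareil: it only mirrors the reasoning
# that a near-saturation estimate is still usable for threshold questions.

NEAR_SATURATION = 0.95  # assumed cutoff: above this, the exact value may be inaccurate


def interpret_coverage(estimated_coverage: float, threshold: float = 0.80) -> dict:
    """Classify an estimated coverage (fraction in [0, 1]) against a threshold."""
    if not 0.0 <= estimated_coverage <= 1.0:
        raise ValueError("estimated_coverage must be a fraction between 0 and 1")
    return {
        # Question (1): is the coverage above the threshold of interest?
        "above_threshold": estimated_coverage > threshold,
        # Flag estimates whose exact value should not be over-interpreted.
        "near_saturation": estimated_coverage > NEAR_SATURATION,
    }


# Made-up example values, for illustration only.
samples = {"sample_A": 0.98, "sample_B": 0.72}
for name, coverage in samples.items():
    print(name, interpret_coverage(coverage))
```

A sample flagged as `near_saturation` still answers the threshold question; only its exact coverage value should be treated with caution.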

Best, Miguel.

andrewjmc commented 2 years ago

Really helpful response, thanks!