wwood / singlem

Novelty-inclusive microbial community profiling of shotgun metagenomes
http://wwood.github.io/singlem/
GNU General Public License v3.0

raw reads & common OTUs #38

Closed: piwling closed this issue 4 years ago

piwling commented 5 years ago

Hi, sorry to bother you again. (1) If the raw reads differ in length (150 bp, 250 bp and so on), will this affect the results? I ask because the raw reads come from different sequencing platforms. (2) After clustering (0.8999) and rarefying (number to choose: 100), I found that there are very few common OTUs between samples. Is there a way to increase the number of common OTUs between samples? Thank you! Pi Weiling

wwood commented 5 years ago

Hi, (1) Yes, different read lengths will affect things, because it is easier to detect homology to the ribosomal proteins on longer reads. SingleM looks specifically for regions that are easy to detect in the windowing operation, but the issue does still remain. (2) Do you imagine that the lack of clustering is a bug in the software, or just reality? In my experience, 16S methods tend to oversimplify the community compared to SingleM, but it's hard for me to know whether what you are seeing is unexpected, because I know little about your samples. If you directly query a rarefied sample against another instead of clustering, is there much more overlap?
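
To make the "rarefy, then compare directly" check concrete, here is a minimal Python sketch. It is not SingleM's implementation: the function names (`rarefy`, `shared_otus`) and the toy count tables are hypothetical, and it assumes each sample is available as a mapping from OTU sequence to read count. It subsamples a fixed number of reads without replacement and then counts OTU sequences that match exactly between two samples.

```python
import random
from collections import Counter


def rarefy(otu_counts, number_to_choose, seed=0):
    """Subsample `number_to_choose` reads without replacement.

    `otu_counts` maps OTU sequence -> read count (an assumed input layout).
    Returns a dict of subsampled counts, or None if the sample holds
    fewer reads than `number_to_choose`.
    """
    # Expand the table into one entry per read; sorting keeps runs reproducible.
    pool = [seq for seq, count in sorted(otu_counts.items()) for _ in range(count)]
    if len(pool) < number_to_choose:
        return None
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, number_to_choose)))


def shared_otus(sample_a, sample_b):
    """Return OTU sequences present (exact match) in both samples."""
    return set(sample_a) & set(sample_b)


# Toy data, invented purely for illustration.
sample_a = {"ACGT": 60, "AAAA": 50, "CCCC": 5}
sample_b = {"ACGT": 80, "GGGG": 40}

rarefied_a = rarefy(sample_a, 100)
rarefied_b = rarefy(sample_b, 100)
print(shared_otus(rarefied_a, rarefied_b))
```

Rare OTUs can drop out of a sample when it is rarefied to a small number of reads, so comparing the overlap at a few different depths may help separate a sampling effect from a genuine lack of shared OTUs.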

piwling commented 5 years ago

Hi, (1) My samples are all marine sediment metagenomes downloaded from NCBI, so the read lengths vary greatly. (2) I want to find more common OTUs between these samples. When I directly query a rarefied sample against another instead of clustering, there is much less overlap.

wwood commented 5 years ago

If querying with the same divergence as the clustering uses, then unfortunately that's just the reality of the situation - your samples do not share many OTUs, at least at that level of rarefaction. This doesn't appear to be a bug in singlem then, which is happy for me, but less happy for you I guess.
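
As a rough illustration of what querying "with the same divergence as the clustering uses" means, the sketch below counts query OTUs that have at least one sufficiently similar OTU in another sample. It is an assumption-laden toy, not SingleM's query code: the helper names (`identity`, `otus_with_match`) are hypothetical, the sequences are invented, and it assumes all OTU sequences are ungapped and of equal length so that identity can be computed position by position.

```python
def identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def otus_with_match(query_otus, subject_otus, min_identity):
    """Return query OTUs with at least one subject OTU at >= min_identity."""
    return {
        q
        for q in query_otus
        if any(identity(q, s) >= min_identity for s in subject_otus)
    }


# Toy 10 bp sequences, invented purely for illustration.
query = ["ACGTACGTAC", "TTTTTTTTTT"]
subject = ["ACGTACGTAA", "GGGGGGGGGG"]

# Same threshold as the clustering step mentioned in this thread.
print(otus_with_match(query, subject, min_identity=0.8999))
# Exact matches only, for comparison.
print(otus_with_match(query, subject, min_identity=1.0))
```

If the count of matched OTUs stays low even at the relaxed threshold, that is consistent with the samples simply sharing few lineages at this rarefaction depth rather than with a software problem.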

piwling commented 5 years ago

Hi, if my samples do not share many OTUs, that's certainly a pity for me. Thank you all the same.