Closed by piwling 4 years ago
Hi, (1) Yes, different read lengths will affect things, because it is easier to detect homology to the ribosomal proteins on longer reads. SingleM looks specifically for regions that are easy to detect in the windowing operation, but the issue does still remain. (2) Do you imagine that the lack of clustering is a bug in the software, or just reality? In my experience 16S methods tend to oversimplify the community compared to SingleM, but it's hard for me to know whether what you are seeing is unexpected, because I know little about your samples. If you directly query a rarefied sample against another instead of clustering, is there much more overlap?
Hi, (1) My samples are all marine sediment metagenomes downloaded from NCBI, so the read lengths vary greatly. (2) I want to find more common OTUs between these samples. When I directly query a rarefied sample against another instead of clustering, there is much less overlap.
If you are querying with the same divergence as the clustering uses, then unfortunately that's just the reality of the situation: your samples do not share many OTUs, at least at that level of rarefaction. So this doesn't appear to be a bug in SingleM, which is good news for me, but less so for you I guess.
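To make the effect of rarefaction on shared OTUs concrete, here is a minimal sketch in plain Python. It is not SingleM code and does not use SingleM's file formats: the count tables, the `rarefy` and `shared_otus` helpers, and the toy numbers are all hypothetical, and OTUs are matched by exact identity rather than by sequence divergence as a real query would do.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Subsample `depth` sequences without replacement from an OTU count table.

    `otu_counts` maps an OTU identifier to its sequence count (hypothetical format).
    """
    # Expand the table into one entry per sequence; sorting keeps the run reproducible.
    pool = [otu for otu, n in sorted(otu_counts.items()) for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the rarefaction depth")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


def shared_otus(a, b):
    """Return the OTU identifiers present in both tables (exact match only)."""
    return set(a) & set(b)


# Hypothetical toy samples: one dominant OTU plus a few rare ones.
sample_a = {"otu1": 300, "otu2": 5, "otu3": 1}
sample_b = {"otu1": 250, "otu4": 40, "otu3": 2}

shared_full = shared_otus(sample_a, sample_b)

# Rarefy both samples to 100 sequences, the depth mentioned in this thread.
rare_a = rarefy(sample_a, 100, seed=1)
rare_b = rarefy(sample_b, 100, seed=1)
shared_rare = shared_otus(rare_a, rare_b)

print("shared before rarefaction:", sorted(shared_full))
print("shared after rarefaction: ", sorted(shared_rare))
```

Rarefaction can only remove OTUs from a sample, so the shared set after rarefying is always a subset of the shared set before it. Rare OTUs such as `otu3` in this toy data are the ones most likely to drop out at a depth of 100, which is one reason low rarefaction depths show little overlap.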
Hi, If my samples do not share many OTUs, that's certainly a pity for me. Thank you all the same.
Hi, Sorry to bother you again. (1) If the raw reads differ in length (150 bp, 250 bp and so on) because they come from different sequencing platforms, will this have an impact on the results? (2) After clustering (0.8999) and rarefying (number to choose: 100), I found that there are very few common OTUs between samples. Is there a way to increase the number of common OTUs between samples? Thank you! Pi Weiling