torognes / vsearch

Versatile open-source tool for microbiome analysis

After combining two datasets and reclustering, the abundance of some high-abundance OTUs dropped dramatically. #514

Open peiyaohu opened 1 year ago

peiyaohu commented 1 year ago

I have two datasets, A and B, and dataset A contains a high-abundance OTU (id: OTU_54). To compare the abundance of OTU_54 between the two datasets, I pooled the raw sequencing data of A and B (=> A+B), followed the example steps provided on the website to cluster (with the same parameters as when A and B were analyzed separately), and found that the OTU_54 in the original A dataset had very low abundance in the otutab(A+B) produced by the new clustering.

So I blasted all.nonchimeras.fasta (the file before clustering at 97% similarity) of A, B and A+B against OTU_54, filtered the blast results by identity > 97% and alignment length > 300, and checked the number of matches, and found that A+B lost a lot of OTU_54.

wc -l filt_nonchim*          # filtered blast results
   76966 filt_nonchim18.txt  # generated from dataset B
  157240 filt_nonchim19.txt  # generated from dataset A
   12369 filt_nonchim.txt    # generated from A+B
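
The filtering step described above can be sketched in a few lines of Python. This is only an illustration: it assumes the blast results are in the standard 12-column tabular format (`-outfmt 6`), where column 3 is the percent identity and column 4 is the alignment length. The function name and the example rows are hypothetical, not taken from the real files.

```python
def count_filtered_hits(lines, min_identity=97.0, min_length=300):
    """Count tabular BLAST hits with identity > min_identity
    and alignment length > min_length (both strict, as in the post)."""
    kept = 0
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])  # column 3: percent identity
        length = int(fields[3])    # column 4: alignment length
        if pident > min_identity and length > min_length:
            kept += 1
    return kept


# Hypothetical rows, for illustration only.
example = [
    "read1\tOTU_54\t99.5\t410\t2\t0\t1\t410\t1\t410\t0.0\t740",
    "read2\tOTU_54\t96.8\t405\t13\t0\t1\t405\t1\t405\t0.0\t650",
    "read3\tOTU_54\t98.2\t250\t4\t0\t1\t250\t1\t250\t1e-120\t430",
]
n_hits = count_filtered_hits(example)
print(n_hits)
```

With a real file, an open file handle can be passed instead of the list. Note that this counts matching lines, as `wc -l` does, and does not weight each hit by any abundance annotation the sequence headers may carry.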

How can I address or optimize the analysis process? Thanks!

frederic-mahe commented 1 year ago

> and found that the OTU_54 in the original A dataset had very low abundance in the otutab(A+B) produced by the new clustering.

This is a known downside of centroid-based, fixed-threshold clustering: some clusters shrink or disappear when more data are added.

A given centroid1 can be abundant in sample A, but close to a more abundant centroid2 present in sample B. If you cluster A+B, then centroid2 captures some or all of the reads initially captured by centroid1.
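
The effect can be reproduced with a toy model. The sketch below is a simplified abundance-sorted greedy clustering, not vsearch's actual implementation: identity is a plain position-by-position comparison of equal-length sequences, each sequence joins the first existing centroid that meets the threshold, and all sequences and counts are made up for illustration.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(abundances, threshold=0.97):
    """Toy greedy clustering: process sequences by decreasing abundance;
    each one joins the first centroid within the threshold, or else
    becomes a new centroid. Returns {centroid: total reads}."""
    clusters = {}
    ordered = sorted(abundances.items(), key=lambda kv: (-kv[1], kv[0]))
    for seq, count in ordered:
        for centroid in clusters:
            if identity(seq, centroid) >= threshold:
                clusters[centroid] += count
                break
        else:
            clusters[seq] = count
    return clusters


# Hypothetical 100-nt sequences.
centroid1 = "A" * 100            # abundant in dataset A
variant = "CC" + "A" * 98        # 98% identical to both centroids
centroid2 = "CCCC" + "A" * 96    # 96% identical to centroid1, abundant in B

dataset_a = {centroid1: 1000, variant: 800}
dataset_b = {centroid2: 5000}
pooled = Counter(dataset_a) + Counter(dataset_b)

clusters_a = greedy_cluster(dataset_a)
clusters_pooled = greedy_cluster(pooled)

print(clusters_a[centroid1])       # reads in centroid1's cluster, A alone
print(clusters_pooled[centroid1])  # reads in centroid1's cluster, A+B
print(clusters_pooled[centroid2])  # reads in centroid2's cluster, A+B
```

In this toy case the cluster around centroid1 holds 1800 reads when A is clustered alone, but only 1000 after pooling, because the variant is now captured by the more abundant centroid2. The total number of reads (6800) is unchanged.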

> and checked the number of matches, and found that A+B lost a lot of OTU_54.

If I understand correctly, reads from OTU_54 are not lost but have been redistributed into other OTUs. There is not much that can be done to mitigate that downside.
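
One way to check that reads were redistributed rather than lost is to compare per-sample read totals between the separate and pooled OTU tables. The sketch below assumes tab-separated OTU tables with an OTU identifier in the first column and one column per sample. The sample names, OTU names and counts are hypothetical.

```python
import csv
import io


def reads_per_sample(otutab_text):
    """Sum read counts per sample over all OTUs in a tab-separated OTU table."""
    reader = csv.reader(io.StringIO(otutab_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the OTU identifiers
    totals = {sample: 0 for sample in samples}
    for row in reader:
        if not row:
            continue
        for sample, value in zip(samples, row[1:]):
            totals[sample] += int(value)
    return totals


# Hypothetical tables: dataset A alone, then A+B pooled.
otutab_a = "#OTU ID\tA1\tA2\nOTU_54\t900\t900\nOTU_7\t100\t50\n"
otutab_pooled = (
    "#OTU ID\tA1\tA2\tB1\n"
    "OTU_1\t820\t780\t5000\n"
    "OTU_2\t180\t170\t40\n"
)

totals_a = reads_per_sample(otutab_a)
totals_pooled = reads_per_sample(otutab_pooled)

for sample in totals_a:
    print(sample, totals_a[sample], totals_pooled[sample])
```

If the totals for the samples of dataset A are similar in both tables, the reads are still there and have simply moved to other OTUs. With real data the totals may not match exactly, since upstream filtering steps can behave differently on pooled data.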

peiyaohu commented 1 year ago

Thanks so much!