EESI / themetagenomics

How to identify OTU within-topic clusters #13

Open YinchengChen23 opened 4 years ago

YinchengChen23 commented 4 years ago

Thanks for developing this convenient package. I'm glad to be using it, but I'm having trouble extracting the OTUs that belong to each topic. I followed the method from https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0219235#pone.0219235.ref021: I used the beta proportions table topic_proportions <- exp(TOPICS$fit$beta$logbeta[[1]]) to calculate Bray-Curtis distances, ran hclust, and then used cutree to determine which topic each OTU belongs to. However, I am not sure whether the OTU clustering method in that paper is exactly the same as the one in your package. How can I tell which topic each OTU belongs to, as generated by this package? Thanks!

sw1 commented 4 years ago

I'm not completely sure what you're asking. Running "topic_proportions <- exp(TOPICS$fit$beta$logbeta[[1]])" gives you a topics-by-OTUs matrix holding the proportion of every OTU in each topic, so each topic (row vector) sums to 1. You can calculate distances on this matrix (or use PCA, ICA, etc.) for dimensionality reduction or to build a tree, but that will tell you which topics are similar to each other. The high-frequency OTUs in a given topic are themselves the result of the clustering done by the topic model.
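To make that concrete, here is a minimal sketch in base R. It uses a simulated matrix as a stand-in for TOPICS$fit$beta$logbeta[[1]], and the OTU and topic labels are hypothetical; with a real fit you would presumably take the OTU ids from the model's vocabulary (TOPICS$fit$vocab in an stm fit, as far as I know). Assigning each OTU to the topic where it has the highest probability is just one possible heuristic, not something the package does for you, since the topic model gives soft memberships.

```r
# Toy stand-in for TOPICS$fit$beta$logbeta[[1]]:
# a K x V matrix of log-probabilities (K topics, V OTUs).
set.seed(1)
K <- 3
V <- 8
raw <- matrix(rexp(K * V), nrow = K)
logbeta <- log(raw / rowSums(raw))

topic_proportions <- exp(logbeta)
rownames(topic_proportions) <- paste0("T", seq_len(K))    # hypothetical topic labels
colnames(topic_proportions) <- paste0("OTU", seq_len(V))  # hypothetical OTU ids

# Each topic (row) is a probability distribution over OTUs.
row_sums <- rowSums(topic_proportions)

# The highest-probability OTUs of each topic: these are the "clusters"
# the topic model has already found. Result is a top_n x K character matrix.
top_n <- 3
top_otus <- apply(topic_proportions, 1, function(p) {
  names(sort(p, decreasing = TRUE))[seq_len(top_n)]
})

# Heuristic hard assignment: the topic in which each OTU is most probable.
best_topic <- rownames(topic_proportions)[apply(topic_proportions, 2, which.max)]
names(best_topic) <- colnames(topic_proportions)

# Distances between rows compare TOPICS, not OTUs. Because every row sums
# to 1, Bray-Curtis reduces to half the Manhattan distance.
topic_dist <- dist(topic_proportions, method = "manhattan") / 2
topic_tree <- hclust(topic_dist, method = "average")
```

Here top_otus lists the dominant OTUs per topic and best_topic gives one topic label per OTU, while topic_tree only groups similar topics together. Cutting that tree with cutree would return clusters of topics, not of OTUs.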

I doubt I answered your question. Could you possibly rephrase it?