joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Getting loadings from PCoA #655

Closed nmshahir closed 7 years ago

nmshahir commented 8 years ago

Hello,

I am a doctoral student working on a microbiome project and ran a principal coordinates analysis (PCoA) in phyloseq. While I recognize that PCA and PCoA are not exactly the same, I was wondering whether there is a way to get the loadings of the PCoA (i.e. determine how much taxon A, taxon B, etc. contribute to PC1, and so forth)?

Code example:

    ord.wuni <- ordinate(data, "PCoA", "wunifrac")
    PCoA.wuni <- plot_ordination(data, ord.wuni, type = "samples", color = "Phenotype")
    PCoA.wuni

Thanks in advance for any comments, Nur

spholmes commented 8 years ago

We use DPCoA instead of weighted UniFrac (wuf) because it provides biplots with species loadings, which PCoA on weighted UniFrac does not.

Susan Holmes Professor, Statistics and BioX John Henry Samter Fellow in Undergraduate Education Sequoia Hall, 390 Serra Mall Stanford, CA 94305 http://www-stat.stanford.edu/~susan/
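
To make the PCA/PCoA distinction in the question concrete, here is a minimal base-R sketch. The abundance table is made up, and Euclidean distance stands in for weighted UniFrac (which would need a tree); both are assumptions for illustration only. It shows that PCA returns a taxon-level loading matrix, while PCoA, which only ever sees the sample-by-sample distances, returns sample coordinates and nothing per taxon.

```r
set.seed(1)

# Hypothetical abundance table: 6 samples (rows) by 4 taxa (columns).
abund <- matrix(rpois(24, lambda = 20), nrow = 6,
                dimnames = list(paste0("s", 1:6),
                                paste0("taxon", LETTERS[1:4])))

# PCA operates on the sample-by-taxon table, so it yields taxon loadings.
pca <- prcomp(abund, center = TRUE, scale. = FALSE)
pca$rotation[, 1:2]   # one row per taxon

# PCoA (classical MDS) operates on a sample-by-sample distance matrix only.
d <- dist(abund)      # Euclidean stand-in for a UniFrac distance
pcoa <- cmdscale(d, k = 2, eig = TRUE)
pcoa$points           # one row per sample
names(pcoa)           # no taxon-level component is returned
```

With Euclidean distances the PCoA sample coordinates match the PCA scores up to sign; with a distance such as UniFrac there is no underlying linear map from taxa to axes, which is why no loadings come out of the PCoA itself.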

nmshahir commented 8 years ago

I'm going through the documentation (https://github.com/joey711/phyloseq/wiki/ordinate), but it's still a little unclear to me how DPCoA differs from PCoA.

spholmes commented 8 years ago

You need to read this paper that lays it out very nicely:

http://www.ncbi.nlm.nih.gov/pubmed/22174277

nmshahir commented 8 years ago

Thank you @spholmes. I've gone through the paper and I have a few questions:

  1. It seems that instead of using UniFrac/weighted UniFrac as the distance metric, DPCoA uses a patristic distance?
  2. The species loadings are stored in name_of_dpcoa$dw, correct?
  3. It was stated here (https://github.com/joey711/phyloseq/issues/305) that you can also use CCA and RDA to obtain OTU loadings. Is there a particular benefit to using DPCoA over CCA and RDA, or is it more case by case, depending on the question you wish to ask?

Thank you! -N
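
For question 2, one way to check which component holds what is to fit a DPCoA on a small dataset and inspect the result. The sketch below assumes phyloseq is installed and uses its bundled `esophagus` example data in place of the poster's `data` object. The component names are taken from the ade4 documentation for `dpcoa`, where `dw` is described as the vector of element (taxon) weights, `dls` as the taxon coordinates and `li` as the sample coordinates. Older ade4 versions used different names, so calling `names()` on your own object is safer than relying on this sketch.

```r
library(phyloseq)

# Small example dataset shipped with phyloseq (has an OTU table and a tree).
data(esophagus)

# DPCoA through the same ordinate() interface used earlier in the thread.
ord.dpcoa <- ordinate(esophagus, method = "DPCoA")

names(ord.dpcoa)      # list the components actually present
head(ord.dpcoa$dls)   # taxon coordinates (per ade4 docs)
head(ord.dpcoa$li)    # sample coordinates (per ade4 docs)
head(ord.dpcoa$dw)    # taxon weights, not coordinates (per ade4 docs)

# Samples and taxa drawn on the same axes.
plot_ordination(esophagus, ord.dpcoa, type = "biplot")
```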

spholmes commented 8 years ago

  1. That is correct. DPCoA first runs an MDS/PCoA on the patristic distances between taxa, then uses the abundances as weights to compute the centre of gravity of those taxon points for each sample.
  2. The loadings correspond to taxa, as required.
  3. If you treat the OTUs as independent categories, correspondence analysis (CCA without a formula) is fine.

    Canonical correspondence analysis and RDA are quite different, because they bring in extra explanatory variables (environmental factors, etc.) through a formula.
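
The two steps in point 1 can be sketched in base R. This is a simplified illustration under stated assumptions, not the ade4 or phyloseq implementation, which differ in details such as weighting and how the final axes are chosen. The four-taxon tree, its patristic distances and the abundance table are all hypothetical.

```r
# Hypothetical patristic (tip-to-tip path length) distances for four taxa
# on the toy tree ((A:1,B:1):2,(C:1,D:1):2).
taxa <- c("A", "B", "C", "D")
patristic <- matrix(c(0, 2, 6, 6,
                      2, 0, 6, 6,
                      6, 6, 0, 2,
                      6, 6, 2, 0),
                    nrow = 4, dimnames = list(taxa, taxa))

# Step 1: MDS/PCoA of the taxa, giving each taxon a point in ordination space.
taxa_coords <- cmdscale(as.dist(patristic), k = 2)

# Hypothetical abundance table: samples in rows, taxa in columns.
abund <- matrix(c(10, 5, 0, 0,
                   0, 0, 8, 2,
                   4, 4, 4, 4),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("s1", "s2", "s3"), taxa))

# Step 2: each sample is the abundance-weighted centre of gravity
# of the taxon points.
rel_abund <- abund / rowSums(abund)
sample_coords <- rel_abund %*% taxa_coords

taxa_coords
sample_coords
```

Because samples are built from the taxon points, taxa and samples live in the same space, which is what makes a biplot with taxon positions possible. In this toy case `s3`, with equal abundance of every taxon, lands at the centre of the taxon cloud.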

Best, Susan
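
The contrast in point 3 can be illustrated with vegan's `cca()`. Calling vegan directly on its bundled `dune` data, rather than going through phyloseq on the poster's data, is an assumption made to keep the example small. Without a formula `cca()` performs an unconstrained correspondence analysis; with a formula the axes are constrained by the explanatory variables. Species scores are available in both cases.

```r
library(vegan)

data(dune)       # site-by-species abundance table shipped with vegan
data(dune.env)   # matching environmental variables

# Correspondence analysis: cca() with no formula and no constraints.
ca <- cca(dune)
ca_species <- scores(ca, display = "species", choices = 1:2)
head(ca_species)

# Canonical correspondence analysis: the formula brings in an explanatory
# variable, so the leading axes are restricted to what it can explain.
cca_fit <- cca(dune ~ Management, data = dune.env)
cca_species <- scores(cca_fit, display = "species", choices = 1:2)
head(cca_species)
```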
