joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Adonis ranking issue #966

Open raw937 opened 6 years ago

raw937 commented 6 years ago

Hello,

I am seeing what looks like a bug: whether a metadata variable comes out significant depends on its position in the formula.

I can't explain it. The result changes depending on whether Organic Matter or Moisture Content comes first.

Call: adonis(formula = d2 ~ Organic_Matter + Moisture_Content + Molybdenum + Iron + Yield + Height + Plot, data = df)

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

                 Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
Organic_Matter    1    3.3471  3.3471 26.5049 0.38699  0.001 ***
Moisture_Content  1    0.2121  0.2121  1.6794 0.02452  0.131
Molybdenum        1    0.1463  0.1463  1.1585 0.01691  0.278
Iron              1    0.1490  0.1490  1.1801 0.01723  0.291
Yield             1    0.1840  0.1840  1.4570 0.02127  0.136
Height            1    0.2638  0.2638  2.0887 0.03050  0.054 .
Plot              1    0.1796  0.1796  1.4219 0.02076  0.160
Residuals        33    4.1673  0.1263         0.48182
Total            40    8.6491                 1.00000
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Call:

adonis(formula = d2 ~ Moisture_Content + Organic_Matter + Molybdenum + Iron + Yield + Height + Plot, data = df)

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

                 Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
Moisture_Content  1    3.0826 3.08263 24.4107 0.35641  0.001 ***
Organic_Matter    1    0.4765 0.47654  3.7736 0.05510  0.008 **
Molybdenum        1    0.1463 0.14629  1.1585 0.01691  0.263
Iron              1    0.1490 0.14902  1.1801 0.01723  0.262
Yield             1    0.1840 0.18399  1.4570 0.02127  0.159
Height            1    0.2638 0.26377  2.0887 0.03050  0.061 .
Plot              1    0.1796 0.17956  1.4219 0.02076  0.168
Residuals        33    4.1673 0.12628         0.48182
Total            40    8.6491                 1.00000

spholmes commented 6 years ago

I think this is covered by the following answer on SE:

https://stats.stackexchange.com/questions/188519/adonis-in-vegan-order-of-variables-or-use-of-strata

If the factors are not balanced, the ordering will matter.

Best, Susan
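One way to get tests that do not depend on term order is to ask for marginal rather than sequential tests. The sketch below is only an illustration, not the poster's analysis: it assumes the vegan package is installed, uses vegan's bundled `dune` and `dune.env` example data instead of the `d2` and `df` objects from the report, and picks `A1` and `Management` arbitrarily as two predictors. It relies on `adonis2()` and its `by` argument.

```r
library(vegan)

data(dune)      # example community matrix shipped with vegan
data(dune.env)  # matching environmental variables

set.seed(42)    # permutation p-values vary from run to run without a seed

# Sequential tests: each term is adjusted only for the terms listed before it,
# so swapping the order changes the sums of squares.
seq_ab <- adonis2(dune ~ A1 + Management, data = dune.env, by = "terms")
seq_ba <- adonis2(dune ~ Management + A1, data = dune.env, by = "terms")

# Marginal tests: each term is adjusted for all other terms in the model,
# so the sums of squares are the same whichever order is used.
mar_ab <- adonis2(dune ~ A1 + Management, data = dune.env, by = "margin")
mar_ba <- adonis2(dune ~ Management + A1, data = dune.env, by = "margin")

print(seq_ab)
print(seq_ba)
print(mar_ab)
print(mar_ba)
```

With `by = "margin"` the p-values can still differ slightly between the two calls because they come from random permutations, but the sums of squares and R2 for a given term should match.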


-- Susan Holmes John Henry Samter Fellow in Undergraduate Education Professor, Statistics 2017-2018 CASBS Fellow, Sequoia Hall, 390 Serra Mall Stanford, CA 94305 http://www-stat.stanford.edu/~susan/

MSMortensen commented 6 years ago

adonis is order dependent: terms are tested sequentially. It first tests how much of the total variation is explained by the first variable (and whether that is significant), then how much of the remaining variation is explained by the second variable, and so on. The order of your variables can therefore have a large impact on the significance and on the amount of variation attributed to each variable, especially when the variables are correlated with each other.

Best, Martin
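The sequential logic can be reproduced with an ordinary linear model in base R, since `anova()` on an `lm` fit also adds terms first to last. The sketch below uses simulated data; the variable names `organic` and `moisture` and all coefficients are made up for illustration and are not the poster's data. It shows the same pattern as the two outputs in the report, where the first two terms sum to about 3.559 in either order (3.3471 + 0.2121 versus 3.0826 + 0.4765) while the split between them changes.

```r
# Simulated data: two correlated predictors (names are illustrative only).
set.seed(1)
n <- 41
organic  <- rnorm(n)
moisture <- 0.8 * organic + rnorm(n, sd = 0.6)  # correlated with organic
y <- 1.0 * organic + 0.5 * moisture + rnorm(n)

# anova() on an lm fit reports sequential (Type I) sums of squares:
# each term gets the variation it explains after the terms before it.
fit_a <- anova(lm(y ~ organic + moisture))
fit_b <- anova(lm(y ~ moisture + organic))

print(fit_a)
print(fit_b)

# Sums of squares by term name for each ordering.
ss_a <- setNames(fit_a[["Sum Sq"]], rownames(fit_a))
ss_b <- setNames(fit_b[["Sum Sq"]], rownames(fit_b))

# The split between the two predictors depends on the order ...
print(c(organic_first = ss_a[["organic"]], organic_second = ss_b[["organic"]]))

# ... but the variation explained by both together does not.
explained_a <- ss_a[["organic"]] + ss_a[["moisture"]]
explained_b <- ss_b[["organic"]] + ss_b[["moisture"]]
print(c(explained_a = explained_a, explained_b = explained_b))
```

Whichever correlated predictor is listed first is credited with the variation the two share, which is why it looks much stronger in that position.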