vegandevs / vegan

R package for community ecologists: popular ordination methods, ecological null models & diversity analysis
https://vegandevs.github.io/vegan/
GNU General Public License v2.0

Adonis-function #124

Closed ANNALISAS88 closed 9 years ago

ANNALISAS88 commented 9 years ago

Hello, I'm a PhD student from Germany and I have a question about the adonis function in the vegan package. I tried to run a multivariate analysis of covariance on my nonparametric dataset, but a problem arose. R reports this every time I try to run the analysis:

> ergebnis <- adonis(SICIEEG ~ Gruppe + AlterTMS, data=daten)
Error in rowSums(x, na.rm = TRUE) : 
  'x' must be an array of at least two dimensions

SICIEEG is the response (Y) and is a continuous variable, Gruppe (group) is a nominal variable, and AlterTMS (age) is also continuous. Does anyone have an idea what the problem with the analysis could be? I would really appreciate any help, because this analysis isn't very common and other statistics packages such as SPSS can't run an analysis of covariance on a nonparametric dataset. Thank you for your help.

Kind regards

Anna Lisa

gavinsimpson commented 9 years ago

If SICIEEG is a continuous variable that smells like a vector, singular, to me. The error message is exactly the same as that generated when I pass adonis() a single variable:

> data(dune)
> data(dune.env)
> adonis(dune[,1] ~ Management*A1, data=dune.env, permutations=99)
Error in rowSums(x, na.rm = TRUE) : 
  'x' must be an array of at least two dimensions

You say you want to do a "Multivariate Analysis of Covariance" so why do you pass in a univariate response?

FYI; data are not "nonparametric", tests may be.
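The dimension error can be reproduced without vegan. Below is a minimal base R sketch, using a made-up matrix `m` (hypothetical, not the poster's data), of why a single extracted column trips `rowSums()`. Note that keeping the column as a one-column matrix only satisfies the dimension check; it does not by itself make the analysis meaningful.

```r
# Hypothetical 4 x 3 data matrix standing in for a multivariate response
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("a", "b", "c")))

# Extracting one column drops the dimensions: the result is a plain vector
v <- m[, 1]
is.null(dim(v))

# rowSums() needs at least two dimensions, so a vector raises an error;
# the error message is captured here instead of stopping the script
res <- tryCatch(rowSums(v, na.rm = TRUE),
                error = function(e) conditionMessage(e))
res

# drop = FALSE keeps a one-column matrix, which rowSums() accepts
one_col <- m[, 1, drop = FALSE]
dim(one_col)
rowSums(one_col, na.rm = TRUE)
```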

ANNALISAS88 commented 9 years ago

Hello,

thank you for your answer. I'm not very experienced with statistics, and I wanted to try the univariate analysis first to test whether I have the correct analysis. In total I have 8 dependent variables. Did you find a solution for the problem? Thank you for your help. Anna Lisa

gavinsimpson commented 9 years ago

If your hypothesis is that the levels of Gruppe have different "composition" of variables but the same AlterTMS "effect" then you should just fit the multivariate model. If you want there to be an interaction (different effects of AlterTMS for each level of Gruppe) then you could also look at that via the following RHS to the formula: Gruppe * AlterTMS.

If you want to fit univariate models then adonis() is not the tool for you; even if you do:

adonis(dune[,1 ,drop = FALSE] ~ Management*A1, data=dune.env, permutations=99)

to bypass the dimensions issue, we still run into problems because the code assumes (requires, from what I understand!) multiple response variables.

> adonis(dune[,1 ,drop = FALSE] ~ Management*A1, data=dune.env, permutations=99)

Call:
adonis(formula = dune[, 1, drop = FALSE] ~ Management * A1, data = dune.env,      permutations = 99) 

Permutation: free
Number of permutations: 99

Terms added sequentially (first to last)

              Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
Management     3                                    
A1             1                                    
Management:A1  3                                    
Residuals     12                                    
Total         19                                    
Warning messages:
1: In vegdist(lhs, method = method, ...) :
  you have empty rows: their dissimilarities may be meaningless in method “bray”
2: In vegdist(lhs, method = method, ...) : missing values in results

If you just want a non-parametric ANCOVA, you could fit the model using lm() and then do a permutation test of that model.
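A minimal base R sketch of that idea follows. The data are simulated and all names (`dat`, `group`, `age`, `group_F`) are hypothetical. The permutation scheme is an assumption, since the thread does not name one: residuals of the reduced model (covariate only) are permuted, which is often called the Freedman-Lane scheme.

```r
set.seed(42)  # reproducible simulated data

# Simulated stand-in data: two groups, a continuous covariate,
# and a response with skewed (non-normal) errors
n <- 40
dat <- data.frame(
  group = factor(rep(c("A", "B"), each = n / 2)),
  age   = runif(n, 20, 60)
)
dat$y <- 0.05 * dat$age + rexp(n)

# F statistic for the group term, fitted after the covariate
group_F <- function(y, d) {
  d$y <- y
  anova(lm(y ~ age + group, data = d))["group", "F value"]
}
obs_F <- group_F(dat$y, dat)

# Reduced model without the term under test
reduced <- lm(y ~ age, data = dat)
fit0 <- fitted(reduced)
res0 <- residuals(reduced)

# Permute the reduced-model residuals and refit the full model each time
n_perm <- 999
perm_F <- replicate(n_perm, group_F(fit0 + sample(res0), dat))

# Permutation p-value; the observed statistic counts as one permutation
p_perm <- (sum(perm_F >= obs_F) + 1) / (n_perm + 1)
p_perm
```

The model itself is the ordinary parametric ANCOVA; only the reference distribution for the F statistic comes from permutations rather than from the F distribution.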

jarioksa commented 9 years ago

This could be the same issue as discussed (and solved) in http://stackoverflow.com/questions/28570380/adonis-function-from-vegan-doesnt-work

In most cases, adonis is not the easiest and best choice for univariate responses. In particular, you must be careful to use Euclidean distances with univariate variables.

Function rda can handle univariate responses and give the same results as standard lm, but uses permutation tests to evaluate the significance in its anova function.

ANNALISAS88 commented 9 years ago

Thank you for your good suggestions. It worked. But now I have some further questions. Using cbind, I combined all 7 of my dependent variables into one object that I called X. My result looks like this:

> adonis(X ~ Gruppe + AlterTMS, data=EEGundEMGDaten, permutations=99)

Call:
adonis(formula = X ~ Gruppe + AlterTMS, data = EEGundEMGDaten, permutations = 99)

Permutation: free
Number of permutations: 99

Terms added sequentially (first to last)

          Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)  
Gruppe     1    0.0279 0.02792  0.4018 0.00872   0.71  
AlterTMS   1    0.3263 0.32626  4.6944 0.10184   0.02 *
Residuals 41    2.8495 0.06950         0.88944         
Total     43    3.2037                 1.00000         

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

  1. Is there any syntax that shows me each dependent variable separately? For my analysis it is important whether age has different effects on the 7 dependent variables. In this output I only see the result for the variables combined. I tried print, but it showed the same output as before.
  2. I excluded one dependent variable because of missing values. Is there any option to mark the missing values, so that I can run the analysis with that variable but without the two missing values?
  3. A question concerning the permutations. Are there any rules, in which case I have to choose permutations=99 or permutations=999? What does it mean exactly?

Thank you so much for your help.

Kind regards

Anna Lisa Schneider

eduardszoecs commented 9 years ago

Is there any syntax that shows me each dependent variable separately? For my analysis it is important whether age has different effects on the 7 dependent variables. In this output I only see the result for the variables combined. I tried print, but it showed the same output as before.

It might be worth considering some model-based approaches, e.g. ?manova or the mvabund package. Both give you multivariate statistics and univariate statistics. Is the Bray-Curtis distance (the default in adonis) suitable for your question?
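As a rough illustration of the `manova()` route, here is a base R sketch on simulated data (the variables `y1`, `y2`, `y3`, `group` and `age` are made up). It shows one multivariate test and then one univariate table per response via `summary.aov()`. These are ordinary parametric tests, so whether they suit a given dataset is a separate question.

```r
set.seed(1)  # reproducible simulated data

# Simulated stand-in data with three response variables
n <- 30
dat <- data.frame(
  group = factor(rep(c("A", "B"), each = n / 2)),
  age   = runif(n, 20, 60)
)
dat$y1 <- 0.1 * dat$age + rnorm(n)
dat$y2 <- rnorm(n)
dat$y3 <- rnorm(n)

# Bind the responses into a matrix, as with cbind() for adonis()
Y <- cbind(y1 = dat$y1, y2 = dat$y2, y3 = dat$y3)

fit <- manova(Y ~ group + age, data = dat)

# Multivariate test for each term (Pillai's trace)
summary(fit, test = "Pillai")

# One univariate ANOVA table per response variable
uni <- summary.aov(fit)
uni
```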

I excluded one dependent variable because of missing values. Is there any option to mark the missing values, so that I can run the analysis with that variable but without the two missing values?

See ?na.omit or ?complete.cases to exclude observations with NAs.
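A small base R sketch of both functions, on a made-up data frame (`dat` and its columns are hypothetical):

```r
# Hypothetical data with two incomplete observations
dat <- data.frame(
  y1    = c(1.2, NA, 3.1, 4.0, 2.2),
  y2    = c(0.5, 0.7, NA, 0.9, 1.1),
  group = factor(c("A", "A", "B", "B", "B"))
)

# Logical vector: TRUE for rows without any NA
keep <- complete.cases(dat)
keep

# Keep only the complete rows, so responses and predictors stay aligned
dat_complete <- dat[keep, ]
dat_complete

# na.omit() drops the same rows in one step
nrow(na.omit(dat))
```

Filtering the whole data frame once, before building the response matrix, keeps the rows of the responses and the predictors matched.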

A question concerning the permutations. Are there any rules, in which case I have to choose permutations=99 or permutations=999? What does it mean exactly?

With 99 permutations the lowest p-value you can achieve is 1/100 = 0.01, with 999 it is 1/1000 = 0.001. Generally you can say: the more the better (more exact), but at a higher computational cost. If your dataset is small or you have time you could run all possible permutations.
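The arithmetic above can be written as a short base R sketch (`min_p` is a hypothetical helper name). It also shows how quickly the number of possible orderings grows under free permutation, which is why complete enumeration is only realistic for small datasets.

```r
# Smallest attainable permutation p-value: the observed statistic
# counts as one member of the reference set, hence n_perm + 1
min_p <- function(n_perm) 1 / (n_perm + 1)
min_p(c(99, 999, 9999))

# Number of distinct orderings of n observations under free permutation
factorial(8)
factorial(20)
```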

ANNALISAS88 commented 9 years ago

Hello, thank you for your answer. The problem is that I have a nonparametric dataset. Is manova also suitable for nonparametric data? I thought not; at least in SPSS it is only for parametric data. I think the adonis function is the right one, but my problem is that I need output where you can see the effect of age and group for each dependent variable. At the moment I only see the effect for the variable X, which combines all my dependent variables:

X <- cbind(ICFEEG, SICIEEG, EPEEG_N100, LICIEEG, EPEMG, ICFEMG, SICIEMG)

I tried to separate all my dependent variables with commas, but this syntax didn't work. It looked like this:

adonis(ICFEEG,SICIEEG,EPEEG_N100,LICIEEG,EPEMG,ICFEMG,SICIEMG ~ Gruppe + AlterTMS, data=EEGundEMGDaten, permutations=99)

Kind regards

Anna Lisa

gavinsimpson commented 9 years ago

The problem is that I have a nonparametric dataset.

No, you categorically do _not_ have a nonparametric data set. Those words do not make any statistical sense. You might have data that has certain properties or features that would violate the assumptions of a parametric test, and for which a non-parametric test/method might be more appropriate.

And just because a test might be parametric, that often only refers to the process of making statistical inference from the results of the test; interpreting the significance or otherwise of the obtained test statistic. You can often perform a parametric test but use computer resampling (permutations, bootstrapping, other resampling) to perform inference on the test statistic.

A question concerning the permutations. Are there any rules, in which case I have to choose permutations=99 or permutations=999? What does it mean exactly?

With 99 permutations the lowest p-value you can achieve is 1/100 = 0.01, with 999 it is 1/1000 = 0.001. Generally you can say: the more the better (more exact), but at a higher computational cost. If your dataset is small or you have time you could run all possible permutations.

An additional, and often paramount, consideration is the exchangeability of samples under the null hypothesis; can you really randomize the data under the correct Null hypothesis?
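To make the exchangeability point concrete, here is a base R sketch of one kind of restricted permutation: shuffling observations only within blocks. The `block` factor and the helper `perm_within` are hypothetical. Which restriction is appropriate depends entirely on the study design, which the thread does not describe.

```r
set.seed(7)  # reproducible shuffles

# Hypothetical design: 12 observations in 3 blocks (e.g. repeated
# measurements on 3 subjects); observations are exchangeable only
# within a block, not across blocks
block <- factor(rep(c("s1", "s2", "s3"), each = 4))
idx <- seq_along(block)

# Shuffle indices within each block, leaving block membership intact
perm_within <- function(idx, block) {
  out <- idx
  for (b in levels(block)) {
    pos <- which(block == b)
    out[pos] <- pos[sample.int(length(pos))]
  }
  out
}

p <- perm_within(idx, block)
p

# Every observation stays inside its own block after permutation
all(block[p] == block)
```

In vegan, restricted designs of this kind are handled through the permute package rather than hand-written helpers like the one above.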

ANNALISAS88 commented 9 years ago

Hello,

sorry, I'm very inexperienced with statistics. This is the initial situation: for my thesis we ran a MANCOVA in SPSS, but the reviewer complained that our dataset isn't normally distributed. Now my supervisor wants me to analyse the dataset with a comparable nonparametric test. I looked it up and couldn't find any program that can run this type of analysis. Then I found the vegan package in R. I know that it would be possible to run a parametric test and explain the violations, but my supervisor will not accept that. At the moment I'm at a loss.

ANNALISAS88 commented 9 years ago

Actually, I just wanted output from adonis that shows each dependent variable separately. That would be the solution. I thought this was the right one: print.adonis shows the aov.tab component of the output. But every time I try it, an error occurs:

> print.adonis
Error: object 'print.adonis' not found

Thank you!

gavinsimpson commented 9 years ago

You don't call the methods directly as these tend not to be exported from package namespaces. Instead just use:

print(obj)

where obj is the name of the R object you assigned the output of adonis() to, e.g.

obj <- adonis(....)
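The same dispatch mechanism can be sketched in base R with a toy class. Everything here (`make_fit`, the class `myfit`, its `tab` component) is hypothetical and only stands in for a fitted object with a table component such as the aov.tab mentioned above.

```r
# Hypothetical constructor returning an object of class "myfit"
make_fit <- function() {
  structure(list(tab = data.frame(Df = c(1, 10))), class = "myfit")
}

# S3 print method; inside a package this function would usually not be
# exported, so typing its name at the prompt would fail to find it
print.myfit <- function(x, ...) {
  cat("Table for myfit object\n")
  print(x$tab)
  invisible(x)
}

obj <- make_fit()

# Calling the generic dispatches on class(obj) and runs print.myfit()
print(obj)

# Typing the object name at the prompt does the same via auto-printing
obj

# Components can also be extracted directly from the list
obj$tab

# Captured printed output, to confirm the method was used
out <- capture.output(print(obj))
```

Printing only changes how a stored result is displayed; it does not compute per-variable results that the fitted object does not already contain.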