vegandevs / vegan

R package for community ecologists: popular ordination methods, ecological null models & diversity analysis
https://vegandevs.github.io/vegan/
GNU General Public License v2.0

Multiple issues of adonis2 for PERMANOVA and possibility for upgrading the function #699

Open menarest opened 3 hours ago

menarest commented 3 hours ago

Hi, I was reading an article comparing adonis2 for PERMANOVA in R with the PERMANOVA function developed by the author of PRIMER (a paid, closed-source application). The article highlights several limitations of adonis2, which could provide useful insights for improving its functionality. Thought you might find this helpful!

https://learninghub.primer-e.com/books/should-i-use-primer-or-r/chapter/3-permanova-vs-adonis2-in-r

Cheers, Esteban

gavinsimpson commented 3 hours ago

And then you open an issue that requires more work for us to deal with. This should have been a discussion topic, and one of us will convert it to one shortly.

Some of what Marti writes in those posts is just wrong, due to either a misunderstanding of how to use restricted permutations in vegan or intentionally using it wrongly to make differences seem greater than they are.

We never claimed to handle nested designs, so that's Marti's first error. As a result, we do get the wrong denominator when forming our pseudo-F statistic when users try to do a nested analysis. We're still trying to figure out how much of an issue this is: the one example that claimed to show the problem in a reproducible way turned out to be a floating-point arithmetic issue, caused by small differences in certain operations between what vegan does internally and the by-hand code used for comparison.

To handle random factors would require additional work; R's formula interface doesn't allow for identification of random versus fixed strata. We'd have to implement something like aov()'s Error() formula special to allow random factors of the sort Marti uses in the example. Or implement (1 | random_factor) a la lme4 / glmmTMB.
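For anyone unfamiliar with the `Error()` special, this is roughly what it looks like in base R's `aov()`, using the built-in `npk` data. It only illustrates the existing `aov()` syntax; nothing like it is available in adonis2 today.

```r
# Base R only: the classic npk example from ?aov.
# Error(block) declares `block` as an error stratum (a random blocking factor),
# so the fit is split into strata and terms are tested within the right one.
fit <- aov(yield ~ N * P * K + Error(block), data = npk)
summary(fit)

# The lme4 / glmmTMB spelling of a random intercept for the same factor
# would be a formula term such as (1 | block).
```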

As for permutation tests, we do handle these correctly, but we require the user to specify how they want to permute the data. I don't have Primer, but it seems as if this software can deduce from the formula that restricting permutations to be within levels of the random factors is needed. If someone wants to pay Jari or me to write this code, we could come up with something similar. In lieu of that, we ask the user to define how they want to permute the data and hence we require more of the user than Primer does. That doesn't seem too bad a trade-off given someone is using R anyway, which usually requires more from a user than more user-friendly stats software. This is the bit Marti gets wrong (intentionally or unintentionally, shrug).
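As a minimal sketch of what "specifying how to permute" looks like in practice, here is a restricted-permutation design passed to adonis2 on the `dune` example data. The blocking factor `site_block` is made up for illustration and is not part of that dataset.

```r
library(vegan)    # attaches permute as well, which provides how() and shuffle()

data(dune)
data(dune.env)

# Hypothetical design: pretend the 20 dune sites fall into 4 blocks of 5.
site_block <- gl(4, 5)

# Restrict permutations so samples are only shuffled within their own block.
perm <- how(blocks = site_block, nperm = 999)

# Sanity check: a permuted ordering never moves a sample to another block.
set.seed(42)
idx <- shuffle(nrow(dune), control = perm)
stopifnot(all(site_block[idx] == site_block))

# Hand the permutation design to adonis2 via `permutations`.
fit <- adonis2(dune ~ Management, data = dune.env,
               permutations = perm, method = "bray")
fit
```

For nested or split-plot layouts, `how()` also accepts `plots = Plots(...)` and `within = Within(...)`; which combination is appropriate depends on the design, and that choice is left to the user.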