vegandevs / vegan

R package for community ecologists: popular ordination methods, ecological null models & diversity analysis
https://vegandevs.github.io/vegan/
GNU General Public License v2.0

Adding Yue-Clayton index #525

Closed antagomir closed 2 years ago

antagomir commented 2 years ago

I suggest adding the Yue-Clayton (dis)similarity index (Yue & Clayton 2005) as a new vegan::vegdist option. It is used at least in microbial ecology.

I already drafted the first version and did preliminary testing. This would complement the Aitchison and robust Aitchison distances that we contributed to vegan earlier this year.

If a Yue-Clayton contribution is welcome in the vegan package, I could open a PR.

jarioksa commented 2 years ago

The "new" Yue & Clayton index (their equation 2.1, first form) can be written in vegan as designdist(x, "J/(A+B-J)", terms = "quadratic"). This is a similarity index, and it is identical to the similarity ratio in David Wishart's CLUSTAN software, where it has been available at least since the Clustan 1A manual of 1969. It was very popular in vegetation ecology in the 1980s, mainly because it was advocated as the similarity ratio by Eddy van der Maarel (chief editor of Vegetatio and later of Journal of Vegetation Science). It is easily transformed to a dissimilarity as designdist(x, "(A+B-2*J)/(A+B-J)", terms = "quadratic"). For binary data, it indeed reduces to Jaccard.

Vegan now has a "quantitative Jaccard" in vegdist that uses minimum terms instead of quadratic terms, equivalent to designdist(x, "(A+B-2*J)/(A+B-J)", terms = "minimum"). I knew van der Maarel's similarity ratio quite well, but did not want to implement it because quadratic terms emphasize single large differences and are therefore suboptimal for most community data sets. I opted instead for the current quantitative "jaccard" with minimum terms. However, I used the similarity ratio very often in the 1980s, and learnt about it in a Nordic ecology course in Lund in 1979 (Eddy van der Maarel and the ingenious Dick Clymo).

It would be very easy to implement this method in vegdist, but I do think it is worse than the current method = "jaccard" (with minimum terms), and I'm also a bit reluctant to implement methods that can be easily expressed with designdist.
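A minimal sketch of the equivalences described above, written in base R so that it runs without vegan; the cross-check against designdist and vegdist runs only if vegan is installed. The toy matrix and the helper sim_ratio are made up for illustration and are not part of vegan. One assumption to flag: as far as I understand, Yue & Clayton define their index on relative abundances, so the rows are scaled to proportions first. With raw counts the same formula gives the similarity ratio on the original scale.

```r
# Toy community matrix: rows are samples, columns are taxa (made-up counts).
x <- matrix(c(10, 0, 5, 1,
               2, 8, 0, 4,
               0, 3, 7, 7),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("s", 1:3), paste0("sp", 1:4)))

# Assumption: the index is computed on relative abundances (rows sum to 1).
p <- x / rowSums(x)

# Hypothetical helper: similarity ratio J/(A+B-J) with quadratic terms,
# where J = sum(p_i * q_i), A = sum(p_i^2), B = sum(q_i^2).
sim_ratio <- function(m) {
  J <- tcrossprod(m)                 # cross-products for all pairs of rows
  A <- diag(J)                       # sum of squares for each row
  as.dist(J / (outer(A, A, "+") - J))
}

theta  <- sim_ratio(p)               # similarity
d_quad <- 1 - theta                  # dissimilarity, (A+B-2J)/(A+B-J)

# For presence/absence data the quadratic form reduces to binary Jaccard,
# which base R provides as dist(method = "binary").
pa <- (x > 0) * 1
stopifnot(isTRUE(all.equal(as.vector(1 - sim_ratio(pa)),
                           as.vector(dist(pa, method = "binary")))))

# Optional cross-check against vegan, skipped if the package is missing.
if (requireNamespace("vegan", quietly = TRUE)) {
  d_vegan <- vegan::designdist(p, "(A+B-2*J)/(A+B-J)", terms = "quadratic")
  stopifnot(isTRUE(all.equal(as.vector(d_quad), as.vector(d_vegan))))

  # Quantitative Jaccard in vegdist corresponds to minimum terms.
  d_min <- vegan::designdist(x, "(A+B-2*J)/(A+B-J)", terms = "minimum")
  d_jac <- vegan::vegdist(x, method = "jaccard")
  stopifnot(isTRUE(all.equal(as.vector(d_min), as.vector(d_jac))))
}
```

The hand-rolled version is only there to make the terms explicit; in practice the designdist calls quoted in the comment above do the same job in one line.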

I find it elating that methods from 1969 (53 years ago) are suggested as new. The Yue & Clayton paper is from 2005, so they were repeating Wishart only 36 years later. In general, I think that if you suggest a simple (dis)similarity index that uses the terms Σp², Σq² and Σpq, do not assume that it is something new that was never invented before. I think the reason Wishart called this a similarity ratio was that he thought it was not novel in 1969 but had been invented many times before.

Not a very enthusiastic response, but I'm open to diverging opinions, and can change my mind. The next vegan release will be out "really soon now", so decisions must be reached soon. Vegan would have been released already, but I have been busy with another package (Hmsc) where they pay for my work (unlike for vegan).

See also microbiome/mia#305.

antagomir commented 2 years ago

I think this is interesting and nails it down well. I wonder if they had new interpretations in the 2005 paper, although it was pretty short anyway.

Perhaps this issue can be closed until it can be argued that something other than designdist is essential. Thanks!

jarioksa commented 2 years ago

The paper has a good and thorough discussion of the statistical properties of the proposed index. I think it could serve as a model for similar analyses of several other indices with quadratic terms.