vegandevs / vegan

R package for community ecologists: popular ordination methods, ecological null models & diversity analysis
https://vegandevs.github.io/vegan/
GNU General Public License v2.0

Gini coefficient #419

Open lucygarner opened 3 years ago

lucygarner commented 3 years ago

Hi,

I was wondering whether there is a function inside vegan for calculating the Gini coefficient, as this is another diversity metric that is used a lot for TCR/BCR repertoire analysis?

Many thanks, Lucy

jarioksa commented 3 years ago

It is called the Simpson index in ecology, and it is available as diversity(..., index="simpson").

lucygarner commented 3 years ago

Thank you. I found the Gini function within the DescTools package. However, this gives me very different scores from those I get with the Simpson index in vegan, so I assume they must be calculating something different.

Could I clarify that I have got the file format correct for the diversity function please? I have the different TCR clonotypes as columns of my matrix and then my donors as rows of the matrix. The matrix is then filled with counts of the number of cells from each donor that have a particular clonotype.

psolymos commented 3 years ago

With index="simpson", the diversity() function in vegan calculates the empirical probability that two individuals drawn at random from the sample belong to different species: 1-sum(p^2) where p=x/sum(x).

> library(vegan)
> x1=c(0,1,2,5)
> x2=c(2,2,2,2)
> 
> diversity(rbind(x1,x2), index="simpson")
     x1      x2 
0.53125 0.75000 
> 
> 1-sum((x1/sum(x1))^2)
[1] 0.53125
> 1-sum((x2/sum(x2))^2)
[1] 0.75
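
To relate this to the layout described in the question (donors as rows, clonotypes as columns), here is a minimal sketch with made-up counts. The donor and clonotype names are hypothetical, and the index is computed row-wise in base R so the block runs without vegan. The final check only runs if vegan is installed, and relies on diversity() working on rows by default (MARGIN = 1).

```r
# Hypothetical counts: rows are donors, columns are TCR clonotypes
counts <- matrix(c(0, 1, 2, 5,
                   2, 2, 2, 2),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("donor1", "donor2"),
                                 c("clonoA", "clonoB", "clonoC", "clonoD")))

# Row-wise proportions: each count divided by its donor's total
p <- counts / rowSums(counts)

# Gini-Simpson index per donor: 1 - sum(p^2)
gini_simpson <- 1 - rowSums(p^2)
print(gini_simpson)

# Optional cross-check against vegan, which treats rows as samples by default
if (requireNamespace("vegan", quietly = TRUE)) {
  print(all.equal(gini_simpson, vegan::diversity(counts, index = "simpson")))
}
```

With this layout each donor gets one value, so a matrix with donors as rows and clonotypes as columns is the orientation diversity() expects.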

The Gini function in DescTools calculates the Gini index, which is an inequality measure (not the same as the Gini-Simpson index above, although the two are conceptually related). Gini gives 0 for perfect equality, as you can see below:

> library(DescTools)
> Gini(x1)
[1] 0.6666667
> Gini(x2)
[1] 0

?Gini explains what that function calculates, which is not the same as the diversity metric vegan calculates.
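
To make the difference concrete, here is a rough base R sketch of the textbook Gini coefficient, defined from the mean absolute difference between all pairs of values. The helper name gini_sketch is hypothetical and this is not the DescTools source. The n/(n-1) small-sample correction is an assumption about what the DescTools default (unbiased = TRUE, as far as I know) does, but it does reproduce the values printed above.

```r
# Hypothetical helper, not the DescTools implementation:
# Gini coefficient from the mean absolute difference
gini_sketch <- function(x, unbiased = TRUE) {
  n <- length(x)
  # Sum of |x_i - x_j| over all ordered pairs, scaled by 2 * n^2 * mean
  g <- sum(abs(outer(x, x, "-"))) / (2 * n^2 * mean(x))
  # Optional small-sample correction
  if (unbiased) g <- g * n / (n - 1)
  g
}

x1 <- c(0, 1, 2, 5)
x2 <- c(2, 2, 2, 2)

gini_sketch(x1)                    # 0.6666667, unequal counts
gini_sketch(x2)                    # 0, perfectly equal counts
gini_sketch(x1, unbiased = FALSE)  # 0.5 without the correction

# The Gini-Simpson index for the same data runs the other way:
# the perfectly even sample x2 scores highest
1 - sum((x1 / sum(x1))^2)          # 0.53125
1 - sum((x2 / sum(x2))^2)          # 0.75
```

So the Gini coefficient rises as abundances become more unequal, while the Gini-Simpson index rises as they become more even, which is why the two functions give such different scores on the same counts.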

It is up to you to decide which index you need for your scientific question.