grunwaldlab / poppr

🌶 An R package for genetic analysis of populations with mixed (clonal/sexual) reproduction
https://grunwaldlab.github.io/poppr

Simpson index #194

Open grunwald opened 6 years ago

grunwald commented 6 years ago

The Simpson index lambda as calculated is actually 1 - lambda (not lambda). This should be clarified in the vignette and user manuals. I suggest keeping 1 - lambda because when lambda = 0 this represents infinite diversity and lambda = 1 is no diversity. With 1 – lambda, 0 represents no diversity and 1 represents maximal diversity, which is more intuitive.
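To make the two scales concrete, here is a minimal base-R sketch. It assumes lambda is computed as the sum of squared genotype frequencies; the helper `simpson_lambda()` and the sample counts are hypothetical and are not part of poppr.

```r
# Hypothetical helper (not a poppr function): Simpson's index lambda,
# the sum of squared genotype frequencies.
simpson_lambda <- function(counts) {
  p <- counts / sum(counts)
  sum(p^2)
}

clonal <- c(20)       # made-up sample: 20 individuals, a single genotype
even   <- rep(1, 20)  # made-up sample: 20 individuals, 20 distinct genotypes

lambda_clonal <- simpson_lambda(clonal)  # 1: no diversity
lambda_even   <- simpson_lambda(even)    # 1/20: shrinks toward 0 as genotypes are added

# On the 1 - lambda scale the direction flips: 0 means no diversity,
# and values approach 1 as diversity increases.
diversity_clonal <- 1 - lambda_clonal
diversity_even   <- 1 - lambda_even
```

With these inputs, `diversity_clonal` is 0 and `diversity_even` is about 0.95, which matches the "0 is no diversity, 1 is maximal diversity" reading.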

grunwald commented 6 years ago

Michael suggested a revision in wording as follows:

“I suggest keeping 1 - lambda because when lambda = 0 represents infinite diversity and lambda = 1 is no diversity. With 1 – lambda, 0 represents no diversity and 1 represents maximal diversity, which is more intuitive.”

zkamvar commented 6 years ago

So, to document this a bit further:

The problem is that poppr() and diversity_stats() return a value called "lambda" as part of their output.

What poppr returns is actually 1 - lambda, which is Simpson's Index of Diversity (as opposed to lambda, which is Simpson's Index). As a bit of background, poppr uses vegan::diversity() for Simpson's diversity index (0). I got confused and assumed that Simpson's Index == Simpson's Diversity Index (which, incidentally, is a pretty common mistake (1)), overlooking that the inverse Simpson index (1/lambda) makes no sense if lambda is taken to be Simpson's diversity index.
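A small sketch of why the inverse only makes sense for lambda itself. The counts are made up, and the variable names are illustrative rather than poppr's output names. For reference, the vegan documentation describes `index = "simpson"` as returning 1 - D and `index = "invsimpson"` as returning 1/D, where D is the sum of squared proportions.

```r
# Made-up sample: 4 genotypes, 5 individuals each
counts <- rep(5, 4)
p <- counts / sum(counts)

lambda           <- sum(p^2)    # Simpson's index
one_minus_lambda <- 1 - lambda  # Simpson's index of diversity
inv_simpson      <- 1 / lambda  # inverse Simpson

# Inverting the diversity index instead gives a number with no such reading
inv_of_diversity <- 1 / one_minus_lambda
```

Here `lambda` is 0.25 and `inv_simpson` is 4, the number of equally common genotypes. `inv_of_diversity` is about 1.33, which does not correspond to anything in the sample.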

The question becomes how to address it. I can do one of two things: change the calculation to match the variable name, or rename the output and revise the documentation.

As both Michael and Nik pointed out, changing the calculation to match the variable name doesn't make sense because lambda itself doesn't make sense in terms of diversity, so we will update the name of the output and revise the documentation.

Funny enough, the function locus_table() correctly returns Simpson's diversity index labeled as 1-D.

Now, I just need to think about how to name it. Anything I name it will break backwards compatibility, but naming it something like 1-lambda is awkward since it makes things difficult to subset.
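To illustrate the subsetting problem, here is a sketch with a hypothetical data frame standing in for the output table. The column values are made up and this is not actual poppr output.

```r
# Hypothetical stand-in for a diversity table
res <- data.frame(Pop = c("A", "B"), value = c(0.75, 0.90))
names(res)[2] <- "1-lambda"

by_brackets  <- res[["1-lambda"]]  # works
by_backticks <- res$`1-lambda`     # works, but only with backticks
# res$1-lambda                     # does not parse

# Base R also rewrites non-syntactic names when check.names = TRUE (the default)
mangled <- data.frame(Pop = c("A", "B"), `1-lambda` = c(0.75, 0.90))
mangled_names <- names(mangled)    # "Pop" "X1.lambda"
```

A syntactic name avoids both the backticks and the silent renaming.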

zkamvar commented 6 years ago

For reference, here are all the instances of "lambda" in the code:

https://github.com/grunwaldlab/poppr/search?l=R&q=lambda