hill_func() is returning D_q higher than species richness

cassioalencarnunes commented 2 years ago

I noticed that hill_func() is returning values of D_q (the functional Hill number, i.e. the effective number of equally abundant and functionally equally distinct species) that are higher than taxonomic species richness. This should not be possible: there cannot be more equally abundant and functionally equally distinct species than there are species. I was afraid something was wrong with my data, but I checked the example code with the "dummy" dataset from the FD package and it happens there too.

library(FD)
library(hillR)

dummy = FD::dummy

hill_func(comm = dummy$abun, traits = dummy$trait, q = 0)
hill_taxa(comm = dummy$abun, q = 0)

And the outputs are:

          com1      com2      com3      com4      com5      com6      com7      com8       com9
Q    0.4016663 0.1922618 0.2780442 0.1146261 0.3816159  0.404177 0.2934143 0.3343662  0.4156546
FDis 0.3481687 0.1670560 0.2375808 0.1146261 0.3211366  0.330233 0.2532751 0.2877931  0.3421687
D_q  4.0974923 3.6518111 3.2454591 3.0237158 3.0655375  5.233241 3.1470056 4.3998540  5.2114653
MD_q 1.6458245 0.7021037 0.9023810 0.3465969 1.1698580  2.115156 0.9233765 1.4711625  2.1661695
FD_q 6.7437533 2.5639502 2.9286406 1.0480104 3.5862436 11.069121 2.9058708 6.4729004 11.2889174
         com10
Q    0.3844765
FDis 0.3503927
D_q  4.1694097
MD_q 1.6030400
FD_q 6.6837303

 com1  com2  com3  com4  com5  com6  com7  com8  com9 com10 
    4     3     3     2     3     5     3     4     5     4

Every community here has D_q higher than species richness for q = 0. Something seems wrong, but I could not find the problem in the code. Can you help with this?

daijiang commented 2 years ago

I think there is no mistake in the code. If all species pairs are equally distinct, then D_q should equal the ordinary Hill number. But in the real world this won't happen, and D_q can differ from the ordinary Hill number.

cassioalencarnunes commented 2 years ago

Thanks for your answer!

However, neither in my data nor in the example data are the species pairs equally distinct. In fact, they are functionally close. Therefore, the D_q value should not be close to the ordinary Hill number, i.e., there should be fewer equally abundant, functionally equally distinct species than actual species.

In addition, even if the distance between all pairs were 1 (the maximum value when distance is standardised), D_q should equal ordinary Hill number following the framework proposed by Chao et al 2014 (Unifying Species Diversity, Phylogenetic Diversity, Functional Diversity, and Related Similarity and Differentiation Measures Through Hill Numbers), right?

daijiang commented 2 years ago

I think I see the problem now. When q = 0, the current code still calculates the quadratic entropy Q using the $p_ip_j$ from the community data frame. This is now fixed, so that when all species are equally abundant, D_q = S no matter what the $d_{ij}$ are (see page 5, case c of the original paper).
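As a rough sketch of why equal abundances give D_q = S, assume equation 3 of Chiu and Chao (2014) has the form below, with $Q = \sum_{i,j} d_{ij} p_i p_j$. The exact notation is an assumption written from memory of the paper, not taken from the hillR source. Setting $p_i = 1/S$ for every species makes the distances cancel, provided at least one $d_{ij}$ is non-zero:

```latex
{}^{q}D(Q) = \left[ \sum_{i=1}^{S} \sum_{j=1}^{S} \frac{d_{ij}}{Q} \, (p_i p_j)^{q} \right]^{\frac{1}{2(1-q)}}, \qquad q \neq 1

p_i = \frac{1}{S} \;\Rightarrow\; Q = \frac{1}{S^{2}} \sum_{i,j} d_{ij}, \qquad
\sum_{i,j} \frac{d_{ij}}{Q} \, (p_i p_j)^{q}
  = \frac{S^{-2q} \sum_{i,j} d_{ij}}{S^{-2} \sum_{i,j} d_{ij}}
  = S^{2(1-q)}

{}^{q}D(Q) = \left[ S^{2(1-q)} \right]^{\frac{1}{2(1-q)}} = S
```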

daijiang commented 2 years ago

commit a3ea071132595387e300b261cff18348f052b703

cassioalencarnunes commented 2 years ago

Alright, thank you again for your response.

I think you did find one problem. However, I do not understand why D_q = S when all species are equally abundant. This would only be true if all species were equally functionally distinct, wouldn't it? Table 1 of the theoretical paper makes it clear that $d_{ij}$ should be considered even when q = 0; otherwise, functional diversity at q = 0 would be the same as taxonomic diversity, which would not make sense.

In addition, D_q is still higher than taxonomic diversity when I set q > 0 in hill_func(), which should not happen either, for the same reason as for q = 0: because of functional redundancy, functional Hill numbers should not be greater than taxonomic Hill numbers.

daijiang commented 2 years ago

Please read the Chiu and Chao 2014 PLoS ONE paper for the equations. That is simply how the functional Hill numbers were developed; I just translated their equations here. And based on equation 3, D_q can be larger than S.
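To make this concrete, here is a minimal base-R sketch of that equation as I understand it. It is not the hillR implementation: the function name func_hill_sketch, the toy abundances, and the toy distance matrix are all made up for illustration, and the formula in the comments is an assumption about the form of equation 3.

```r
# Sketch of the functional Hill number, assuming equation 3 has the form
#   D_q = [ sum_ij (d_ij / Q) * (p_i * p_j)^q ]^(1 / (2 * (1 - q))),
# with Q = sum_ij d_ij * p_i * p_j (Rao's quadratic entropy).
# Hypothetical helper for illustration only; not part of hillR.
func_hill_sketch <- function(p, d, q = 0) {
  keep <- p > 0                        # drop species that are absent
  p <- p[keep] / sum(p[keep])          # relative abundances
  d <- as.matrix(d)[keep, keep, drop = FALSE]
  pp <- outer(p, p)                    # matrix of p_i * p_j
  Q <- sum(d * pp)                     # must be > 0 for the formula to work
  if (isTRUE(all.equal(q, 1))) {
    # limit of the general expression as q -> 1
    return(exp(-0.5 * sum(d / Q * pp * log(pp))))
  }
  sum(d / Q * pp^q)^(1 / (2 * (1 - q)))
}

# Toy example: 3 species, every pair at distance 1
d_toy <- 1 - diag(3)

# Equal abundances: D_q equals S = 3 for any q
even_q0 <- func_hill_sketch(rep(1 / 3, 3), d_toy, q = 0)
even_q2 <- func_hill_sketch(rep(1 / 3, 3), d_toy, q = 2)

# Unequal abundances: Q shrinks, so D_0 = sqrt(sum(d) / Q) exceeds S
uneven_q0 <- func_hill_sketch(c(0.8, 0.1, 0.1), d_toy, q = 0)
```

Under this reading, D_0 is sqrt(sum(d) / Q), so anything that lowers Q below its equal-abundance value (here, one dominant species) pushes D_0 above the species count.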

daijiang / hillR

hill_func() is returning D_q higher than species richness #19