Open isaacovercast opened 1 year ago
Indeed, for the genetic diversity hill #s if there are any '0' in the vector of raw pi values within a given timestep then this fails because of log(0)
:
hill[q == 1] <- exp(sum(-n * log(n)))
'q' values other than 1 are perfectly happy with zeros in the pi vector.
In MESS I just ignore pi==0 values and I think this is justifiable. This is also how scipy.stats.entropy
treats zeros.
Update: I pushed a fix for the genetic diversity Hill's, simply removing 0s from the vector.
For trait hills, you get 'NaN' in the case where there is only 1 species present in the local community. If there's only 1 species present then in .hillDivTrait()
the distance matrix between traits (dij <- as.matrix(dist(traits))
) is a single value == 0, which eventually leads to a divide-by-zero 2 lines later.
After wrestling (philosophically) with whether it makes sense to try to hatch a trait diversity for when there is a single species, or whether to simply build in a catch to send single-species communities to evaluate to 0, I'm leaning towards just building in the catch sending single-species to evaluate to zero.
Considerations:
Amending: I think it should evaluate to 1, yes? Weighting by trait diversity is moot in this instance, so it should converge back to abundance (which evaluates to 1 anyway).
See b0ddd3f3339a12726d11bc59c04cdf9bb041e7b8. I feel like there's room for spirited discussion re: the interpretation of trait hill numbers in a single-species community, though! :P
agreed. In the case of no trait variation it should be 1.
Getting sneakier, b1ba083f7e1e9326420eaccd581322718ce6d6a3. This one catches both edge cases: if there is only one species, or if there are multiple species all with the same trait. (Seems extreme but possible in special circumstances, like a super narrow optimum). In this instance, it effectively reverts to abundance hill numbers.
I could be convinced in this instance (no trait, but species, variation) it should be something else, though. 1 for that, too?
(hillR::hill_func just fails in this instance, btw)
IDK, I think conceptuall if there is literally zero trait variation then I would say it should be 1. If you think about it from the perspective of # of 'effective species traits' then it would be 1.
This is a super hard one to figure out!
After perhaps too much effort, I agree that for "effective number of species" (i.e. weighted Hill) the answer should be 1, both from the perspective of 1 species, or multiple species but with no trait variation.
I went deep down a rabbit hole on this (nothing against rabbits!)
The short answer is that unweighted hill goes to 0 both for 1 species and multiple species with no trait variance. (Script below showing that).
But weighted hill is $(D{hill} / Q)^{1/2}$. And Q in the equation goes to 0 just as fast as $D{hill}$ (where $D_{hill}$ is unweighted hill). So again weighted Hill in the limit of 0 trait variation (either from 1 species or multiple spp with same trait) goes to 1. I'm not showing the math for Q going to 0, but it's fairly clear from Chao's equations
Here's the script
# a function to explore limits of trait hill number
# sets q = 2 w/o loss of generality
# n = vector of spp abundances
# trtDiag = intra-specific trait difference (in reality always 0, but we allow
# it to be != 0 to explore limit)
# trtOff = inter-specific trait differences (i.e. "off diagnal" entries in a
# distance matrix)
hillLim <- function(n, trtDiag, trtOff) {
p <- n / sum(n)
dij <- matrix(trtOff, nrow = length(p), ncol = length(p))
diag(dij) <- trtDiag
Q <- as.vector(p %*% dij %*% p)
a <- outer(p, p, '*') / Q
return(sum(dij * a^2)^(1 / (1 - 2)))
}
# explore limit as inter-specific differences go to 0
hillLim(rep(1, 5), trtDiag = 0, trtOff = 1)
## 20
hillLim(rep(1, 5), trtDiag = 0, trtOff = 0.1)
## 2
hillLim(rep(1, 5), trtDiag = 0, trtOff = 0.01)
## 0.2
hillLim(rep(1, 5), trtDiag = 0, trtOff = 0.001)
## 0.02
So this says that as trait variation between species goes to 0, so too does Hill diversity go to 0. But again, this is for unweighted hill.
Now for a single species:
# explore limit as diagonal element approaches 0 when there is only 1 species
# (i.e. figure out what hill div should be for only 1 species)
hillLim(1, trtDiag = 0.1, trtOff = 1)
## 0.1
hillLim(1, trtDiag = 0.01, trtOff = 1)
## 0.01
hillLim(1, trtDiag = 0.001, trtOff = 1)
## 0.001
Here again, for a single species Hill diversity goes to 0.
But again these calculations are for unweighted Hill. For weighted Hill, I convinced myself the answer is indeed equal to 1 because D_hill and Q go to 0 at the same rate.
OH! Something important though that's coming out of this is: right now we're returning unweighted Hill numbers right? So is that what we want to do, or do we want to switch to returning weighted Hills?
Damn! I been working on this more and I was wrong about weighted trait Hill for the case no variation: weighted trait Hill for the case with no variation might equal weighted abundance-based Hill with the same SAD...still working on it, stay tuned, or if you all have it figured out, even better!
One thing for sure though we have to pick a column (attribute diversity or generalized hill):
Generalized Hills, that's my vote. The interpretation is very natural.
On Thu, Jun 1, 2023 at 2:31 PM Andy Rominger @.***> wrote:
One thing for sure though we have to pick a column (attribute diversity or generalized hill): [image: image] https://user-images.githubusercontent.com/2481785/242692425-0791c9a8-9045-43a5-a48e-ee408441f2cf.png
— Reply to this email directly, view it on GitHub https://github.com/role-model/roleR/issues/121#issuecomment-1572581459, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABNSXP3UGFHNXNT2V5RXHBLXJDNYDANCNFSM6AAAAAAYK2NKVA . You are receiving this because you authored the thread.Message ID: @.***>
I agree with Isaac about generalized hills.
also agree with both of you about using generalized hills!
so this means we need to revise the trait and phylo hill numbers to return the generalized value (currently they return the "attribute diversity" from Chao's table). the abundance hill function (and thus the gen div hill) returns the generalized value because the generalized and attribute div for abundance are equivalent
and my final thought on hill numbers when trait variation is absolutely 0 is that the generalized number should indeed be 1. if you all are good to go with that, let's go for it
Unless I'm mistaken, the trait hill number function is already returning generalized hills! e4b7a4701fa52e80697fa8a8a750de8fd9cbfa99 implements a check if the trait diversity is 0 (which would occur if species richness is 0, or if trait diversity is 0) and returns 1 for all q.
I'll work on phylo next!
right you are about generalized hills! don't know how i convinced myself otherwise!
Run this:
The this
getSumStats(model)
and some genetic diversity and trait Hill's for some timesteps will be 'NaN'. It is non-deterministic, and we assume it's because perhaps there is no variation in these values at a given timestep?