mcglinnlab / shark-ray-div

Global patterns of shark and ray diversity

beta statistic instability #19

Closed dmcglinn closed 2 years ago

dmcglinn commented 6 years ago

Hey @EmmalineSheahan, I played around with the beta statistic estimator, and its estimates are definitely unstable for smaller trees. Here are some quick simulation results:

library(apTreeshape)
# simulate yule tree which should have beta of 0
b100 = sapply(rtreeshape(n=50, tip.number=100, model="yule"),  maxlik.betasplit)
b10 = sapply(rtreeshape(n=50, tip.number=10, model="yule"),  maxlik.betasplit)
b5 = sapply(rtreeshape(n=50, tip.number=5, model="yule"),  maxlik.betasplit)

# one box per tree size; labels match the tip numbers simulated above
boxplot(unlist(b100), unlist(b10), unlist(b5), names = c('100sp', '10sp', '5sp'),
        ylab='beta')

Here is the resulting boxplot:

[Figure: boxplot of beta estimates for the 100-, 10-, and 5-tip Yule trees]

EmmalineSheahan commented 6 years ago

Yeah, that looks like a lot of variation. Should I run the beta code again, but this time set a minimum number of taxa a tree must have in order to get a value? Or should we just mention in the paper that beta is heavily influenced by how many species are in the tree? I think the greatest number of species in any cell at the finest resolution is 69.

dmcglinn commented 6 years ago

I think some kind of minimum richness is likely going to be needed. It also suggests that our grid resolution may not make sense for the beta statistic; maybe we should only calculate it once, at the biogeographic province level.
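
A rough sketch of what a richness cutoff could look like, assuming the per-cell trees are `treeshape` objects. The cutoff `min_tips`, the helper `beta_or_na`, and the simulated `cell_trees` are all hypothetical stand-ins, not values or objects from the project:

```r
library(apTreeshape)

# Hypothetical cutoff; no value has been agreed on in this thread.
min_tips <- 10

# Return the maximum-likelihood beta, or NA when the tree is below the cutoff.
beta_or_na <- function(tree, min_tips) {
  n_tips <- nrow(tree$merge) + 1  # a treeshape with n tips has n - 1 merge rows
  if (n_tips < min_tips) return(NA_real_)
  maxlik.betasplit(tree)$max_lik
}

# Stand-in for per-cell trees: simulated Yule trees of mixed size.
set.seed(1)
cell_trees <- c(rtreeshape(n = 3, tip.number = 5, model = "yule"),
                rtreeshape(n = 3, tip.number = 30, model = "yule"))

# Cells with fewer than min_tips species get NA instead of an unstable estimate.
cell_beta <- sapply(cell_trees, beta_or_na, min_tips = min_tips)
```

Returning `NA` rather than dropping small cells keeps the result aligned with the grid, so the masked cells can still be shown as missing on a map.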

If we want to look more closely at the spatial patterns, then I propose that we also calculate beta's confidence interval for each cell and use it as an inverse weighting (wider interval, lower weight) if beta is included in any regressions. On a map, the weighting could be illustrated by how grey the cell is.
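
One way the weighting might work, sketched in base R. The `cells` values below are made-up placeholders, not results. Converting the interval width to a standard error assumes a roughly symmetric 95% interval, which is only a crude approximation for beta. The choice to draw well-estimated cells darker is also an assumption.

```r
# Hypothetical per-cell inputs: beta estimates with lower/upper 95% CI bounds.
cells <- data.frame(
  beta  = c(-0.5, 0.2, 1.0),
  lower = c(-1.5, -0.2, -1.0),
  upper = c(1.5, 0.6, 6.0)
)

# Approximate standard error from the CI width (assumes rough normality).
cells$se <- (cells$upper - cells$lower) / (2 * qnorm(0.975))

# Inverse-variance weights, rescaled so the best-estimated cell has weight 1.
cells$weight <- 1 / cells$se^2
cells$weight <- cells$weight / max(cells$weight)

# Grey level for mapping: weight 1 is black, weights near 0 fade toward white.
cells$shade <- gray(1 - cells$weight)
```

The `weight` column could then be passed to the `weights` argument of `lm()` in any regression that includes beta, and `shade` used as the cell fill colour.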