diazrenata / isds

Analysis of individual size distributions (of mammals)
MIT License

Low abundance species #12

Open diazrenata opened 4 years ago

diazrenata commented 4 years ago

A KDE (or any density estimate) built from observations of only a few individuals will be much pointier than one built from more observations. This can lead to counterintuitive overlap outcomes: even if you draw all individuals from the same normal distribution, the low-abundance species can end up with quite low overlap compared to the more abundant ones.
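A minimal sketch of that effect, using only the Python standard library. Everything here is an assumption for illustration: the hand-rolled Gaussian KDE with Silverman's rule-of-thumb bandwidth, the overlap measure (area under the pointwise minimum of the two densities), the N(100, 10) size distribution, and the function names. None of it is taken from the isds code.

```python
import math
import random
import statistics

def gaussian_kde(sample, grid):
    """Evaluate a Gaussian KDE on `grid`, with Silverman's rule-of-thumb bandwidth."""
    n = len(sample)
    h = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = n * h * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm
        for x in grid
    ]

def overlap(dens_a, dens_b, step):
    """Area under the pointwise minimum of two densities on a shared grid."""
    return sum(min(a, b) for a, b in zip(dens_a, dens_b)) * step

def mean_overlap(n_a, n_b, reps=30, seed=1):
    """Average overlap of two 'species' drawn from the SAME normal distribution."""
    rng = random.Random(seed)
    step = 0.5
    grid = [50 + step * i for i in range(201)]  # body sizes from 50 to 150
    total = 0.0
    for _ in range(reps):
        a = [rng.gauss(100, 10) for _ in range(n_a)]
        b = [rng.gauss(100, 10) for _ in range(n_b)]
        total += overlap(gaussian_kde(a, grid), gaussian_kde(b, grid), step)
    return total / reps

# Rare species (3 individuals) vs. abundant species (100 individuals),
# compared with two abundant species; all drawn from the same distribution.
print("rare vs abundant:    ", round(mean_overlap(3, 100), 3))
print("abundant vs abundant:", round(mean_overlap(100, 100), 3))
```

Under these assumptions the rare-vs-abundant overlap should come out lower on average, even though both species share one underlying distribution.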

You could try upsampling/bootstrapping all the species? Drop a normal distribution centered(?) on each species' mean, assign sds either from observation or by a rule, and then sample an equally large number of individuals for every species. The trick is that there's also less certainty about the true mean for low-abundance species, so you want some way of assigning a "prior" or weighting based on the number of observations. And you'd want to do it a bunch of times.
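One possible way to sketch this upsampling idea in Python. The choices here are assumptions, not a settled method: the sd comes from a rule (`cv` times the observed mean), and the uncertainty in the mean is represented by drawing it from a normal with sd equal to the standard error (sd / sqrt(n)), which stands in for the "prior" based on the number of observations. `upsample_species`, `n_draw`, and `cv` are hypothetical names.

```python
import random
import statistics

def upsample_species(sizes, n_draw=500, cv=0.1, rng=None):
    """Draw `n_draw` simulated individuals for one species.

    Assumptions (illustrative only):
    - sd is assigned by rule as cv * observed mean;
    - the true mean is uncertain, so it is first drawn from a normal
      centered on the observed mean with sd = sd / sqrt(n observations).
    """
    rng = rng or random.Random()
    n = len(sizes)
    obs_mean = statistics.fmean(sizes)
    sd = cv * obs_mean
    mean_draw = rng.gauss(obs_mean, sd / n ** 0.5)
    return [rng.gauss(mean_draw, sd) for _ in range(n_draw)]

# Made-up community: one rare species and one more abundant species.
rng = random.Random(42)
community = {
    "rare_sp": [98.0, 104.0],
    "common_sp": [rng.gauss(100, 10) for _ in range(50)],
}

# "Doing it a bunch of times": each replicate redraws the mean, so the rare
# species' simulated mean wanders more from replicate to replicate.
for name, sizes in community.items():
    rep_means = [
        statistics.fmean(upsample_species(sizes, rng=rng)) for _ in range(100)
    ]
    print(name, "sd of replicate means:", round(statistics.stdev(rep_means), 2))
```

Every species ends up with the same number of simulated individuals, so the density estimates are equally smooth, while the extra uncertainty for rare species shows up as variation across replicates.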

There are some philosophical arguments around this. On the one hand, it's meaningful if there are species in the gaps but only at very low N. Also, maybe low-abundance species also have less intraspecific variation in size?

I think you want to weight it on the back end too. So even if you bootstrap to get the overlap values for every pair, re-weight according to the number of individuals represented when you pull back to look at the distribution of overlap scores for the whole community.
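A small sketch of what the back-end weighting could look like. The weight for a pair is assumed here to be the sum of the two species' abundances; the product (number of individual-level pairs) would be another reasonable choice. The function name and the example numbers are made up for illustration.

```python
def weighted_mean_overlap(overlaps, abundances):
    """Abundance-weighted mean of pairwise overlap scores.

    overlaps:   {(species_a, species_b): overlap score}
    abundances: {species: number of individuals observed}
    Assumed weight for a pair: abundance of a + abundance of b.
    """
    num = 0.0
    den = 0.0
    for (sp_a, sp_b), score in overlaps.items():
        weight = abundances[sp_a] + abundances[sp_b]
        num += weight * score
        den += weight
    return num / den

# Made-up example: species "c" is rare and has low overlap with the others.
abundances = {"a": 100, "b": 80, "c": 2}
overlaps = {("a", "b"): 0.8, ("a", "c"): 0.2, ("b", "c"): 0.3}

unweighted = sum(overlaps.values()) / len(overlaps)
weighted = weighted_mean_overlap(overlaps, abundances)
print("unweighted:", round(unweighted, 3))
print("weighted:  ", round(weighted, 3))
```

In this example the pairs involving the rare species pull the unweighted mean down more than the weighted one. The same weights could be applied to a histogram or density of overlap scores instead of a single mean.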

diazrenata commented 4 years ago

I think there is a way to find the probability distribution for the mean given a single observation or a few observations, assuming an sd that scales with the mean by some coefficient. But I'm not sure it's going to be a lot better than saying the mean is probably not farther than +/- [coefficient] * [observation value] from the observation, or drawing it from another normal distribution centered on the observation with sd = coefficient * obs.
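One way to get at that distribution numerically is a grid approximation. This is a sketch under explicit assumptions: observations are normal with sd = `cv` * mean, and the prior on the mean is flat over the grid. `posterior_mean_grid` and `summarize` are hypothetical names.

```python
import math

def posterior_mean_grid(obs, cv, grid):
    """Posterior weights over candidate means in `grid` (all must be > 0).

    Assumes each observation ~ Normal(mu, cv * mu) and a flat prior on mu
    over the grid. Returns weights that sum to 1.
    """
    log_post = []
    for mu in grid:
        sd = cv * mu
        log_post.append(
            sum(-math.log(sd) - 0.5 * ((x - mu) / sd) ** 2 for x in obs)
        )
    top = max(log_post)  # subtract the max before exponentiating, for stability
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def summarize(grid, weights):
    """Posterior mean and sd of mu from the grid weights."""
    m = sum(g * w for g, w in zip(grid, weights))
    var = sum(w * (g - m) ** 2 for g, w in zip(grid, weights))
    return m, math.sqrt(var)

grid = [50 + 0.5 * i for i in range(201)]  # candidate means from 50 to 150

# One observation vs. three observations, with coefficient 0.1.
for obs in ([100.0], [100.0, 98.0, 103.0]):
    post_mean, post_sd = summarize(grid, posterior_mean_grid(obs, 0.1, grid))
    print(len(obs), "obs -> mean", round(post_mean, 1), "sd", round(post_sd, 1))
```

The printed sd can be compared directly with the shortcut value coefficient * obs to judge whether the extra machinery buys anything.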