Low abundance species

A KDE (or anything) constructed from observations of only a few individuals will be much pointier than one constructed from many observations. This can lead to counterintuitive overlap outcomes: even if you draw all individuals from the same normal distribution, the low-abundance species can end up with quite low overlap compared to the more abundant ones.
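A quick way to see the effect is to simulate it. The sketch below is only an illustration, not the package's method: the function names (`kde`, `overlap`, `mean_overlap`), the normal-reference bandwidth rule, the sample sizes, and the definition of overlap as the integral of the pointwise minimum of two densities are all assumptions made here. Both "species" are drawn from the same N(0, 1), so any difference in overlap comes purely from sample size.

```python
import math
import random
import statistics

def kde(sample, grid):
    """Gaussian KDE on `grid` with a normal-reference bandwidth (an assumed choice)."""
    n = len(sample)
    h = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)
            for x in grid]

def overlap(a, b, grid):
    """Approximate the integral of min(f_a, f_b) with a Riemann sum over the grid."""
    fa, fb = kde(a, grid), kde(b, grid)
    step = grid[1] - grid[0]
    return sum(min(p, q) for p, q in zip(fa, fb)) * step

def mean_overlap(n_first, n_second, reps, rng):
    """Average overlap of two samples that both come from the same N(0, 1)."""
    grid = [-5 + 10 * i / 99 for i in range(100)]
    values = []
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n_first)]
        b = [rng.gauss(0, 1) for _ in range(n_second)]
        values.append(overlap(a, b, grid))
    return statistics.mean(values)

rng = random.Random(1)
rare_vs_common = mean_overlap(5, 100, 40, rng)      # 5 individuals vs 100
common_vs_common = mean_overlap(100, 100, 40, rng)  # 100 vs 100
print(f"rare vs common:   {rare_vs_common:.2f}")
print(f"common vs common: {common_vs_common:.2f}")
```

With this bandwidth rule the drop in overlap for the rare species comes mostly from its noisy mean and SD estimates; a fixed, narrow bandwidth would make the low-N density visibly spikier as well.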

You could try upsampling/bootstrapping all the species? Drop a normal distribution centered(?) on each species' mean, assign SDs either from the observations or by a rule, and then sample an equally large number of individuals for every species. The trick is that there's also less certainty about the true mean for low-abundance species, so you want some way of assigning a "prior" or weighting based on the number of observations. And do it a bunch of times.
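One possible reading of this, sketched under assumptions: for each replicate, first draw a plausible "true" mean from a normal centered on the observed mean with spread sd / sqrt(n), so rare species get wobblier centers, then draw the same large number of individuals around that center. Using the standard error as the stand-in for a "prior" is a choice made here, not something settled above, and `upsample`, `replicate_means`, and `sd_rule` are hypothetical names.

```python
import math
import random
import statistics

def upsample(obs, n_draws, rng, sd_rule=None):
    """Draw n_draws values from a normal fitted to obs.

    The center is itself sampled with spread sd / sqrt(n), so species with
    few observations carry more uncertainty about their mean (assumption).
    sd_rule, if given, overrides the observed SD with a rule-based value.
    """
    n = len(obs)
    sd = sd_rule if sd_rule is not None else statistics.stdev(obs)
    center = rng.gauss(statistics.mean(obs), sd / math.sqrt(n))
    return [rng.gauss(center, sd) for _ in range(n_draws)]

def replicate_means(obs, reps, n_draws, rng):
    """Repeat the upsampling 'a bunch of times' and record each replicate's mean."""
    return [statistics.mean(upsample(obs, n_draws, rng)) for _ in range(reps)]

rng = random.Random(42)
rare_obs = [9.0, 10.0, 11.0, 10.5]                    # 4 individuals (made-up sizes)
common_obs = [rng.gauss(10, 1) for _ in range(400)]   # 400 individuals (simulated)

rare_spread = statistics.stdev(replicate_means(rare_obs, 200, 500, rng))
common_spread = statistics.stdev(replicate_means(common_obs, 200, 500, rng))
print(f"spread of replicate means, rare:   {rare_spread:.3f}")
print(f"spread of replicate means, common: {common_spread:.3f}")
```

Every replicate has the same number of individuals per species, so the KDEs are equally smooth, while the extra wobble in the rare species' center carries its uncertainty through to the overlap values.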

There are some philosophical arguments around this. On the one hand, it's meaningful if there are species in the gaps but only at very low N. Also, maybe low-abundance species have less intraspecific variation in size?

I think you want to weight it on the back end too. So even if you bootstrap to get the overlap values for every pair, re-weight according to the number of individuals represented when you pull back to look at the distribution of overlap scores for the whole community.
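A minimal sketch of that back-end weighting, assuming each pair is weighted by the total number of individuals in the two species. That weight is one guess at "number of individuals represented" (a product of abundances would be another), and the function name, species names, and numbers are made up for illustration.

```python
from itertools import combinations

def weighted_mean_overlap(abundances, overlaps):
    """Community-level mean overlap, weighting each pair by n_i + n_j (assumed weight).

    abundances: {species: number of individuals}
    overlaps:   {(species_i, species_j): overlap score}, keys in sorted order
    """
    num = 0.0
    den = 0.0
    for a, b in combinations(sorted(abundances), 2):
        weight = abundances[a] + abundances[b]
        num += weight * overlaps[(a, b)]
        den += weight
    return num / den

# Hypothetical community: two abundant species and one rare one.
abundances = {"sp_a": 200, "sp_b": 150, "sp_c": 3}
overlaps = {
    ("sp_a", "sp_b"): 0.10,
    ("sp_a", "sp_c"): 0.60,
    ("sp_b", "sp_c"): 0.70,
}

unweighted = sum(overlaps.values()) / len(overlaps)
weighted = weighted_mean_overlap(abundances, overlaps)
print(f"unweighted mean overlap: {unweighted:.3f}")
print(f"weighted mean overlap:   {weighted:.3f}")
```

In this toy case the pairs involving the rare species count for less, so the weighted summary sits closer to the overlap between the two abundant species.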

diazrenata / isds

Low abundance species #12