shawnlaffan / biodiverse

A tool for the spatial analysis of diversity
http://shawnlaffan.github.io/biodiverse/
GNU General Public License v3.0

Randomisations - add subsampling option #222

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
A useful randomisation, essentially a cross-validation approach, is to randomly delete a subset of labels from the cloned basedata and then assess how stable the analysis results are.
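
A rough sketch of the idea, in Python rather than Biodiverse's own code. It assumes a basedata can be treated as a mapping of group to label to sample count; the names `drop_random_labels` and `richness` are hypothetical and are not part of Biodiverse.

```python
import random


def drop_random_labels(basedata, fraction, rng):
    """Return (copy of basedata without the dropped labels, set of dropped labels).

    basedata is assumed to look like {group: {label: count}}.
    The input is not modified, which mimics working on a clone.
    """
    labels = sorted({label for contents in basedata.values() for label in contents})
    n_drop = int(round(fraction * len(labels)))
    dropped = set(rng.sample(labels, n_drop))
    subsampled = {
        group: {label: count for label, count in contents.items() if label not in dropped}
        for group, contents in basedata.items()
    }
    return subsampled, dropped


def richness(basedata):
    """A stand-in analysis: number of labels per group."""
    return {group: len(contents) for group, contents in basedata.items()}


basedata = {
    "g1": {"sp_a": 3, "sp_b": 1},
    "g2": {"sp_b": 2, "sp_c": 5},
    "g3": {"sp_a": 1, "sp_c": 1, "sp_d": 4},
}
subsampled, dropped = drop_random_labels(basedata, 0.5, random.Random(1))

# Compare the analysis on the full and the subsampled data, group by group.
full = richness(basedata)
partial = richness(subsampled)
change = {group: full[group] - partial[group] for group in basedata}
```

Repeating this over many random subsets and summarising `change` per group would give one possible measure of stability.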

Original issue reported on code.google.com by shawnlaffan on 1 Mar 2011 at 1:38

GoogleCodeExporter commented 9 years ago
Thanks, Shawn!

Original comment by amb...@gmail.com on 1 Mar 2011 at 3:05

shawnlaffan commented 5 years ago

This could be done using a multinomial sampler, probably called in _get_randomised_basedata, to generate a new basedata that is then passed on to the randomisation function. That way we can subsample first and then apply shuffling if needed.
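
One possible reading of the multinomial approach, sketched in Python for illustration only. It assumes the basedata is a mapping of group to label to sample count, and draws a fixed number of samples with probability proportional to the original counts. `multinomial_subsample` is a hypothetical name, not an existing Biodiverse function, and this is not how _get_randomised_basedata is implemented.

```python
import random
from collections import Counter


def multinomial_subsample(basedata, n_samples, rng):
    """Draw n_samples (group, label) cells, weighted by their original counts.

    Sampling is with replacement, so this is a multinomial draw over the cells.
    Cells that receive no draws are absent from the result.
    """
    cells = [(group, label) for group, contents in basedata.items() for label in contents]
    weights = [basedata[group][label] for group, label in cells]
    draws = Counter(rng.choices(cells, weights=weights, k=n_samples))

    new_basedata = {}
    for (group, label), count in draws.items():
        new_basedata.setdefault(group, {})[label] = count
    return new_basedata


basedata = {
    "g1": {"sp_a": 3, "sp_b": 1},
    "g2": {"sp_b": 2, "sp_c": 5},
}
subsampled = multinomial_subsample(basedata, 6, random.Random(42))
total = sum(count for contents in subsampled.values() for count in contents.values())
```

The subsampled structure could then be handed to whichever randomisation function is in use, so shuffling is applied on top of the subsample. Because the draw is with replacement, a cell can end up with more samples than it started with; a sampler without replacement would be needed if that is not wanted.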

An optimisation for rand_nochange is to check whether we are already using the subsampled copy and, if so, return it instead of cloning another copy.