cgevans / scikits-bootstrap

Python/numpy bootstrap confidence interval estimation.

Balanced bootstrap, more convenience functions #22

Open HDembinski opened 5 years ago

HDembinski commented 5 years ago

Hi,

I am the author of PyIK, a toolbox library for data analysis in high-energy physics. PyIK is nice and works well, but I have no time to raise awareness of the library, so I am trying to move parts of it into other libraries that have the potential to get more attention.

Your bootstrap library is the first hit on PyPI, so I came here. I have some bootstrap tools that I would like to contribute. As far as I can see, your main bootstrap tool only does the basic bootstrap, not the balanced bootstrap, which improves the stability of the estimates. I have an implementation of the balanced bootstrap here:

bootstrap https://github.com/HDembinski/pyik/blob/217ae25bbc316c7a209a1a4a1ce084f6ca34276b/pyik/numpyext.py#L830

Note how I never pass the indices to the user and do all the resampling of the input data internally. The user just provides a function that processes the input and generates the output. bootstrap then returns the outputs that this function produced on the resampled inputs. This works very nicely with statistical functions in numpy, e.g.

import numpy as np
from pyik.numpyext import bootstrap  # defined in pyik/numpyext.py (linked above)

# bootstrapping the error of the mean
error_on_mean = np.std(bootstrap(np.mean, [1, 2, 3, 4]))

# bootstrapping the error of the standard deviation
error_on_stddev = np.std(bootstrap(np.std, [1, 2, 3, 4]))
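
To make the idea concrete, here is a minimal sketch of a balanced bootstrap with the same calling convention. It is not the PyIK code: the name balanced_bootstrap and the parameters n_rep and rng are made up for illustration. The balancing comes from shuffling a pool in which every index appears exactly n_rep times, so each observation is used equally often over all replicas.

```python
import numpy as np


def balanced_bootstrap(function, x, n_rep=1000, rng=None):
    """Hypothetical sketch of a balanced bootstrap (not the PyIK implementation).

    Returns an array with the output of `function` for each of the
    `n_rep` resampled copies of `x`.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    # Pool containing every index exactly n_rep times, shuffled and cut
    # into n_rep replicas of length n. Indices may repeat within a replica,
    # but each observation is used exactly n_rep times in total.
    pool = np.tile(np.arange(n), n_rep)
    indices = rng.permutation(pool).reshape(n_rep, n)
    return np.array([function(x[idx]) for idx in indices])


# Same usage pattern as above: the user never sees the indices.
replicas = balanced_bootstrap(np.mean, [1, 2, 3, 4], n_rep=500, rng=1)
error_on_mean = np.std(replicas)
```

A side effect of the balancing is that the average of the replica means equals the sample mean, up to floating-point rounding, which is not guaranteed for the basic bootstrap.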

Here is a convenience tool that could go into your library:

bootstrap_covariance https://github.com/HDembinski/pyik/blob/217ae25bbc316c7a209a1a4a1ce084f6ca34276b/pyik/numpyext.py#L910
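
Roughly, a helper like this could look as follows. This is only a sketch under my own assumptions, not a copy of the linked code: the name bootstrap_covariance_sketch and the parameters n_rep and rng are hypothetical, and it uses plain resampling with replacement to stay short. It estimates the covariance matrix of a vector-valued function from the spread of its outputs over the replicas.

```python
import numpy as np


def bootstrap_covariance_sketch(function, x, n_rep=1000, rng=None):
    """Hypothetical sketch: covariance of the outputs of `function`.

    Uses the basic bootstrap (sampling with replacement); the linked
    implementation may differ in its details.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    # One row of resampling indices per replica.
    indices = rng.integers(0, n, size=(n_rep, n))
    # Evaluate the function on every replica; rows are replicas,
    # columns are the components of the function output.
    outputs = np.array([np.atleast_1d(function(x[idx])) for idx in indices])
    # rowvar=False: columns are variables, rows are observations.
    return np.atleast_2d(np.cov(outputs, rowvar=False))


def mean_and_std(sample):
    # Example statistic with two outputs.
    return np.array([np.mean(sample), np.std(sample)])


cov = bootstrap_covariance_sketch(mean_and_std, [1, 2, 3, 4], n_rep=500, rng=1)
```

The diagonal of cov holds the bootstrap variances of the mean and the standard deviation, and the off-diagonal element is their covariance.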