Hi,
I am the author of PyIK, a toolbox library for data analysis in high energy physics. PyIK works well, but I have no time to raise awareness of it, so I am trying to move parts of it into other libraries that have the potential to get more attention.
Your bootstrap library is the first hit on PyPI, so I came here. I have some bootstrap tools that I would like to contribute. I have seen that your basic bootstrap tool implements only the ordinary bootstrap, not the balanced bootstrap, which improves the stability of the estimates. I have an implementation of the balanced bootstrap here:
bootstrap: https://github.com/HDembinski/pyik/blob/217ae25bbc316c7a209a1a4a1ce084f6ca34276b/pyik/numpyext.py#L830
Note how I never pass the indices to the user and do all the resampling of the input data internally. The user just provides a function that processes the input and generates the output. `bootstrap` then returns the outputs that this function produced on the resampled data. This works very nicely with statistical functions in numpy, e.g.
```python
import numpy as np
from pyik.numpyext import bootstrap  # the function linked above

# bootstrapping the error of the mean
error_on_mean = np.std(bootstrap(np.mean, [1, 2, 3, 4]))
# bootstrapping the error of the standard deviation
error_on_stddev = np.std(bootstrap(np.std, [1, 2, 3, 4]))
```
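To make the idea concrete, here is a minimal sketch of a balanced bootstrap with the same calling convention. It is not the PyIK code linked above: the name `balanced_bootstrap` and the `rng` parameter are made up for illustration, and it assumes the textbook definition in which every data point appears exactly `r` times across the `r` replicas.

```python
import numpy as np


def balanced_bootstrap(function, x, r=1000, rng=None):
    """Apply `function` to r balanced bootstrap replicas of x.

    Hypothetical helper for illustration, not the PyIK implementation.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    # Repeat every index exactly r times, then shuffle the whole pool.
    pool = rng.permutation(np.tile(np.arange(n), r))
    # Cut the shuffled pool into r replicas of n indices each.
    indices = pool.reshape(r, n)
    # The user-supplied function only ever sees resampled data, never indices.
    return np.array([function(x[row]) for row in indices])


data = [1, 2, 3, 4]
error_on_mean = np.std(balanced_bootstrap(np.mean, data, r=1000, rng=1))
```

Because each point is used equally often, the average of the replica means equals the sample mean exactly, which is one way to see where the extra stability comes from.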
Here is a convenience tool that could go into your library:
bootstrap_covariance: https://github.com/HDembinski/pyik/blob/217ae25bbc316c7a209a1a4a1ce084f6ca34276b/pyik/numpyext.py#L910
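For readers who do not want to follow the link, here is a rough sketch of what such a helper could look like. It rests on an assumption taken from the name only: that the tool returns the covariance matrix of the function's outputs across bootstrap replicas. The function `bootstrap_covariance_sketch`, its `rng` parameter, and the example function `mean_and_std` are hypothetical, and the sketch uses ordinary resampling with replacement to stay short.

```python
import numpy as np


def bootstrap_covariance_sketch(function, x, r=1000, rng=None):
    """Covariance of a vector-valued `function` over r bootstrap replicas of x.

    Hypothetical helper for illustration, not the PyIK implementation.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    # Ordinary (unbalanced) resampling with replacement, done internally.
    replicas = np.array(
        [function(x[rng.integers(0, n, size=n)]) for _ in range(r)]
    )
    # Rows are replicas, columns are the components of the function output.
    return np.cov(replicas, rowvar=False)


def mean_and_std(sample):
    # Example of a function with more than one output value.
    return np.array([np.mean(sample), np.std(sample)])


cov = bootstrap_covariance_sketch(mean_and_std, [1.0, 2.0, 3.0, 4.0, 5.0], r=500, rng=1)
errors = np.sqrt(np.diag(cov))  # bootstrap errors of the mean and of the std
```

The off-diagonal element of `cov` then estimates how strongly the two outputs are correlated, which a separate call per statistic would not give you.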