Add bootstrapping functionality

h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Apache License 2.0


Add the ability to bootstrap an H2O frame, so that a user can repeatedly train a model on a sample drawn with replacement and validate it on another sample drawn with replacement. This would let users estimate the standard error of performance metrics. The process is described here: https://machinelearningmastery.com/calculate-bootstrap-confidence-intervals-machine-learning-results-python/
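
To make the request concrete, here is a minimal sketch of the loop in plain Python. It is not H2O-3 code: lists of `(x, y)` tuples stand in for H2O frames, `fit_mean` is a stand-in for model training, and every function name is hypothetical.

```python
import random
import statistics


def resample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap sample)."""
    return [rng.choice(rows) for _ in rows]


def fit_mean(train_rows):
    """Stand-in 'model' (assumption): always predicts the mean training target."""
    return statistics.fmean(y for _, y in train_rows)


def mae(prediction, valid_rows):
    """Mean absolute error of a constant prediction on the validation rows."""
    return statistics.fmean(abs(y - prediction) for _, y in valid_rows)


def bootstrap_metric(train_rows, valid_rows, n_rounds=200, seed=0):
    """Train and validate on fresh bootstrap samples n_rounds times.

    Returns the mean of the metric and its bootstrap standard error
    (the sample standard deviation of the per-round scores).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        model = fit_mean(resample(train_rows, rng))
        scores.append(mae(model, resample(valid_rows, rng)))
    return statistics.fmean(scores), statistics.stdev(scores)


# Toy data: y = 2x + 1, split into a training and a validation part.
train = [(x, 2.0 * x + 1.0) for x in range(50)]
valid = [(x, 2.0 * x + 1.0) for x in range(50, 70)]
mean_score, std_error = bootstrap_metric(train, valid)
print(f"MAE = {mean_score:.2f} +/- {std_error:.2f}")
```

In an H2O-3 version, the two `resample` calls would presumably become row sampling with replacement on the training and validation frames, and `fit_mean` would become a real model fit.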

It would be nice to support bootstrapping not only for the metrics that H2O-3 calculates automatically, but also for custom metrics. This might be possible by giving the user access to the sampled validation frame, so that they can apply their own validation function to it.
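
One possible shape for the custom-metric part, again as a plain-Python sketch and not an existing H2O-3 API: the user passes a callable that receives each sampled validation set. Here `bootstrap_custom_metric`, `percentile_interval`, and `accuracy` are hypothetical names, and a list of `(actual, predicted)` pairs stands in for a scored validation frame.

```python
import random
import statistics


def bootstrap_custom_metric(valid_rows, metric_fn, n_rounds=500, seed=42):
    """Call metric_fn on n_rounds bootstrap samples of valid_rows.

    metric_fn is any user-supplied callable that takes the sampled rows
    and returns a number, so the metric does not need to be built in.
    """
    rng = random.Random(seed)
    n = len(valid_rows)
    # rng.choices samples with replacement.
    return [metric_fn(rng.choices(valid_rows, k=n)) for _ in range(n_rounds)]


def percentile_interval(scores, alpha=0.05):
    """Percentile bootstrap interval covering roughly 1 - alpha of the scores."""
    ordered = sorted(scores)
    lo_idx = int((alpha / 2) * len(ordered))
    hi_idx = max(lo_idx, int((1 - alpha / 2) * len(ordered)) - 1)
    return ordered[lo_idx], ordered[hi_idx]


def accuracy(rows):
    """Example custom metric: share of rows where actual == predicted."""
    return sum(1 for actual, predicted in rows if actual == predicted) / len(rows)


# Toy scored validation set: 80 correct and 20 incorrect predictions.
scored = [(1, 1)] * 80 + [(1, 0)] * 20
scores = bootstrap_custom_metric(scored, accuracy)
low, high = percentile_interval(scores)
print(f"accuracy = {statistics.fmean(scores):.3f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

Passing the sampled rows to a callback keeps the resampling logic in one place and leaves the choice of metric entirely to the user.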


Add bootstrapping functionality #9349