h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Grid search for Stacked Ensemble #12153

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Here is the initial design idea for the Stacked Ensemble grid search (which is mostly a search of the metalearner hyperparameters, but could also include other params like metalearner_nfolds).

Python example:

{code}
metalearner_grid_params_gbm = {'max_depth': [2, 3, 4], 'col_sample_rate': [0.2, 0.5, 0.7]}
metalearner_grid_params_rf = {'ntrees': [200, 300, 400], 'col_sample_rate': [0.2, 0.5, 0.7]}

# set up SE grid, use hyper_params to pass a new value called metalearner_params
grid = H2OGridSearch(model=H2OStackedEnsembleEstimator,
                     hyper_params={'metalearner_grid_params': [
                         {'algorithm': "GBM", 'params': metalearner_grid_params_gbm},
                         {'algorithm': "DRF", 'params': metalearner_grid_params_rf}]},
                     seed=1,
                     search_criteria={'strategy': 'RandomDiscrete', 'max_models': 36})

grid.train(x=x, y=y, training_frame=train,
           seed=1,  # this is SE seed (not grid seed)
           base_models=[my_gbm, my_rf])  # pass along fixed SE params like base_models

# Single model (for comparison)
metalearner_gbm_params = {'max_depth': 2, 'col_sample_rate': 0.3}
ensemble = H2OStackedEnsembleEstimator(base_models=[my_gbm, my_rf],
                                       metalearner_algorithm="GBM",
                                       metalearner_params=metalearner_gbm_params)
ensemble.train(x=x, y=y, training_frame=train)
{code}
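To make the proposal concrete, here is a rough sketch of one way the nested `metalearner_grid_params` value could be interpreted: each `{'algorithm', 'params'}` entry is expanded into its own Cartesian grid, and the per-algorithm grids are concatenated into a single candidate list. This is plain Python and does not call H2O. The helper names `expand_metalearner_grid` and `sample_random_discrete` are hypothetical, and the expansion and sampling semantics are an assumption about the design, not existing H2O behavior.

```python
import itertools
import random


def expand_metalearner_grid(grid_entries):
    """Expand [{'algorithm': ..., 'params': {...}}, ...] into a flat list of
    (metalearner_algorithm, metalearner_params) candidates.

    Hypothetical helper: assumes each entry is its own Cartesian grid.
    """
    candidates = []
    for entry in grid_entries:
        names = sorted(entry["params"])
        value_lists = [entry["params"][name] for name in names]
        for values in itertools.product(*value_lists):
            candidates.append({
                "metalearner_algorithm": entry["algorithm"],
                "metalearner_params": dict(zip(names, values)),
            })
    return candidates


def sample_random_discrete(candidates, max_models, seed):
    """Pick up to max_models candidates without replacement.

    Hypothetical stand-in for a RandomDiscrete-style strategy.
    """
    rng = random.Random(seed)
    return rng.sample(candidates, min(max_models, len(candidates)))


metalearner_grid_params_gbm = {"max_depth": [2, 3, 4], "col_sample_rate": [0.2, 0.5, 0.7]}
metalearner_grid_params_rf = {"ntrees": [200, 300, 400], "col_sample_rate": [0.2, 0.5, 0.7]}

grid_entries = [
    {"algorithm": "GBM", "params": metalearner_grid_params_gbm},
    {"algorithm": "DRF", "params": metalearner_grid_params_rf},
]

candidates = expand_metalearner_grid(grid_entries)
selected = sample_random_discrete(candidates, max_models=36, seed=1)
print(len(candidates), len(selected))
```

Under this reading the two grids contribute 9 combinations each, so the search space holds 18 candidates and `max_models=36` would never be the binding limit. Each candidate maps directly onto the `metalearner_algorithm` and `metalearner_params` arguments used in the single-model comparison.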

exalate-issue-sync[bot] commented 1 year ago

Michal Kurka commented: Paused work for now (will get back to it after the fix release)

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: Should we put this back in the pipe?

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5281
Assignee: Michal Kurka
Reporter: Erin LeDell
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A