Add support for model checkpointing of the Stacked Ensemble metalearner

h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

http://h2o.ai

Apache License 2.0


Add support for model checkpointing of the Stacked Ensemble metalearner #8086

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

We should add a checkpointing parameter to the Stacked Ensemble function. If you train a Stacked Ensemble with metalearner_algorithm = "GBM", or any other algorithm that supports checkpointing, we should be able to stop training and resume it later (a feature request from people working on streaming/online models with H2O).

Grab the metalearner model object from the SE object and use its checkpointing functionality to restart training. Then copy the updated metalearner model and metrics into all the right places (including the top-level metrics for the SE model).
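The flow described above can be illustrated with a small, library-free sketch. Everything in it is hypothetical: `ToyMetalearner`, `ToyEnsemble`, and `resume_metalearner` are illustrative stand-ins, not H2O classes or functions, and the "trees" are constant-valued boosting rounds chosen only to keep the example short. It shows the two steps in the request: continue training the existing metalearner instead of refitting it, then copy the refreshed metrics to the ensemble's top level.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyMetalearner:
    """Hypothetical stand-in for a checkpointable metalearner such as a GBM."""
    learning_rate: float = 0.5
    steps: List[float] = field(default_factory=list)  # one constant "tree" per round

    def predict(self) -> float:
        # The toy model predicts a single constant: the sum of all rounds.
        return sum(self.steps)

    def train(self, targets: List[float], ntrees: int) -> None:
        """Train until `ntrees` rounds exist in total, keeping earlier rounds."""
        if ntrees < len(self.steps):
            raise ValueError("ntrees must not be below the trees already built")
        while len(self.steps) < ntrees:
            current = self.predict()
            residual_mean = sum(t - current for t in targets) / len(targets)
            self.steps.append(self.learning_rate * residual_mean)

    def mse(self, targets: List[float]) -> float:
        current = self.predict()
        return sum((t - current) ** 2 for t in targets) / len(targets)


@dataclass
class ToyEnsemble:
    """Hypothetical stand-in for the SE model: a metalearner plus top-level metrics."""
    metalearner: ToyMetalearner
    metrics: Dict[str, float] = field(default_factory=dict)


def resume_metalearner(ensemble: ToyEnsemble, targets: List[float], ntrees: int) -> ToyEnsemble:
    """Continue metalearner training, then refresh the ensemble's top-level metrics."""
    ensemble.metalearner.train(targets, ntrees)
    ensemble.metrics = {
        "ntrees": float(len(ensemble.metalearner.steps)),
        "mse": ensemble.metalearner.mse(targets),
    }
    return ensemble


targets = [1.0, 2.0, 3.0]
ensemble = resume_metalearner(ToyEnsemble(ToyMetalearner()), targets, ntrees=5)
mse_after_5 = ensemble.metrics["mse"]
ensemble = resume_metalearner(ensemble, targets, ntrees=10)  # resume from the checkpoint
```

In this toy setting, resuming from 5 to 10 rounds yields the same model as training 10 rounds from scratch, which is the property a checkpointed metalearner would be expected to preserve when the data is unchanged.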

exalate-issue-sync[bot] commented 1 year ago

Tomas Fryda commented: Am I correct to assume this should work only for blending, i.e., not for CV?

Or, for CV, I think I could also check that the user is training on the same data the checkpointed model was trained on and then just reuse the stored CV predictions. The user would not be able to provide more data, but they could still improve the metalearner, e.g., by adding more trees (trained on the same data as before). If the user specifies a different frame, I could throw an exception.

What do you think? Or is there some better approach?
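The CV proposal in this comment could look roughly like the following sketch. It assumes the checkpoint records some fingerprint of its training data alongside the stored CV predictions. The names `frame_fingerprint`, `MetalearnerCheckpoint`, and `levelone_data_for_resume` are hypothetical and do not correspond to H2O APIs, and the SHA-256 hash over row contents merely stands in for whatever frame checksum the real implementation would use.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Sequence


def frame_fingerprint(rows: Sequence[Sequence[float]]) -> str:
    """Hypothetical content hash standing in for a training-frame checksum."""
    canonical = repr([list(row) for row in rows])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class MetalearnerCheckpoint:
    """Hypothetical record kept with a CV-trained ensemble."""
    training_fingerprint: str
    cv_predictions: List[List[float]]  # stored holdout predictions (level-one data)


def levelone_data_for_resume(
    checkpoint: MetalearnerCheckpoint, rows: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Return the stored CV predictions only if the training data is unchanged."""
    if frame_fingerprint(rows) != checkpoint.training_fingerprint:
        raise ValueError(
            "Training frame differs from the one used by the checkpointed model; "
            "resuming a CV-based metalearner requires the same data."
        )
    return checkpoint.cv_predictions


rows = [[1.0, 2.0], [3.0, 4.0]]
checkpoint = MetalearnerCheckpoint(frame_fingerprint(rows), [[0.1], [0.9]])
levelone = levelone_data_for_resume(checkpoint, rows)  # same data: reuse predictions
```

Passing a frame with added or changed rows raises `ValueError`, matching the "throw an exception" behavior suggested above.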

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7552
Assignee: Tomas Fryda
Reporter: Erin LeDell
State: Open
Fix Version: Backlog
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

https://github.com/h2oai/h2o-3/pull/4645