h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

AutoML: GLM ignores early stopping #7243

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

The GLM model in AutoML uses `lambda_search` and therefore ignores the default `early_stopping` parameters.

However, when lambda search is enabled, we can still control the convergence rate using parameters like `objective_epsilon` (https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/objective_epsilon.html), so we should be able to find a correspondence between `stopping_tolerance` and `objective_epsilon` that would allow us to set it automatically in AutoML.

Given the semantics of these two parameters and how they seem to be used internally, I would suggest using

`objective_epsilon = stopping_tolerance`

ignoring the default value of `objective_epsilon`.
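To make the proposal concrete, here is a minimal sketch of the mapping in plain Python. The helper name `glm_params_for_automl` and its `base_params` argument are hypothetical and only illustrate the idea; this is not how AutoML is implemented internally. The only assumption taken from the issue is that, when lambda search is on, `objective_epsilon` should take the value of `stopping_tolerance`.

```python
from typing import Any, Dict, Optional


def glm_params_for_automl(
    stopping_tolerance: float,
    lambda_search: bool = True,
    base_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: derive GLM parameters from AutoML's stopping settings."""
    # Copy so the caller's dictionary is not mutated.
    params = dict(base_params or {})
    params["lambda_search"] = lambda_search
    if lambda_search:
        # Mapping proposed in this issue: reuse AutoML's stopping_tolerance,
        # overriding any objective_epsilon that was set before.
        params["objective_epsilon"] = stopping_tolerance
    return params


if __name__ == "__main__":
    print(glm_params_for_automl(stopping_tolerance=1e-3))
```

The resulting dictionary could then be passed as keyword arguments to the GLM estimator in the Python client (for example `H2OGeneralizedLinearEstimator(**params)`), assuming no other convergence parameter needs to be adjusted alongside it.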

The same logic probably needs to be applied to the Stacked Ensemble `AUTO` metalearner (and maybe GLM?).

h2o-ops-ro commented 1 year ago

JIRA Issue Details

- Jira Issue: PUBDEV-8417
- Assignee: Sebastien Poirier
- Reporter: Sebastien Poirier
- State: Open
- Fix Version: Backlog
- Attachments: N/A
- Development PRs: Available

h2o-ops-ro commented 1 year ago

Linked PRs from JIRA

https://github.com/h2oai/h2o-3/pull/5951