AutoML: Investigate changing the default stopping metric from logloss to the sort_metric in classification

h2oai / h2o-3



We should consider running a benchmark to see whether the added complexity of using an early stopping metric that differs from the sort metric is worthwhile as a default. My main concern is that the default stopping tolerance of 0.001 might be "tuned" to work well with logloss, but might behave unpredictably with other metrics (AUC, etc.).
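To make the tolerance concern concrete, here is a minimal sketch of a relative-tolerance stopping rule applied to two made-up metric curves. It is a simplified moving-average rule written for illustration, not H2O's actual implementation; the function names (`moving_average`, `stopping_event`) and both synthetic curves are assumptions.

```python
from typing import List, Optional


def moving_average(values: List[float], k: int) -> List[float]:
    """Simple moving average over windows of length k."""
    return [sum(values[i - k + 1 : i + 1]) / k for i in range(k - 1, len(values))]


def stopping_event(
    history: List[float],
    rounds: int,
    tolerance: float,
    higher_is_better: bool,
) -> Optional[int]:
    """Index of the scoring event at which training would stop, or None.

    Simplified rule (an assumption, not H2O's exact logic): stop as soon as
    the current moving average fails to beat the moving average from
    `rounds` scoring events earlier by a relative margin of `tolerance`.
    """
    avgs = moving_average(history, rounds)
    for i in range(rounds, len(avgs)):
        ref, cur = avgs[i - rounds], avgs[i]
        if higher_is_better:
            improved = cur > ref * (1 + tolerance)
        else:
            improved = cur < ref * (1 - tolerance)
        if not improved:
            # avgs[i] ends at history index i + rounds - 1
            return i + rounds - 1
    return None


# Two invented curves with the same geometric shape but different scales.
events = range(60)
logloss_curve = [0.3 + 0.4 * 0.8 ** t for t in events]   # lower is better
auc_curve = [0.95 - 0.25 * 0.8 ** t for t in events]     # higher is better

logloss_stop = stopping_event(logloss_curve, rounds=3, tolerance=0.001, higher_is_better=False)
auc_stop = stopping_event(auc_curve, rounds=3, tolerance=0.001, higher_is_better=True)
print("logloss stops at scoring event:", logloss_stop)
print("AUC stops at scoring event:", auc_stop)
```

Because the tolerance is relative, the same 0.001 threshold means a different absolute improvement for a metric sitting near 0.3 than for one sitting near 0.95, which is the kind of mismatch a benchmark would need to quantify on real data.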

We should also check whether training stops later if we change the stopping metric. The tree-based models in AutoML generally seem to stop quite early, so that is a related question to examine at the same time.
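A benchmark along these lines could be set up roughly as follows. This is only a sketch: `benchmark_configs` and `run_benchmark` are hypothetical helpers, the dataset path and target column are placeholders to be supplied by the caller, and the `max_models` and `seed` values are arbitrary. The `H2OAutoML` arguments used here (`sort_metric`, `stopping_metric`, `stopping_tolerance`, `max_models`, `seed`) are existing parameters.

```python
from itertools import product

SORT_METRIC = "AUC"
STOPPING_METRICS = ["logloss", "AUC"]  # current default vs. matching the sort metric
STOPPING_TOLERANCES = [0.001]          # the default tolerance mentioned above


def benchmark_configs(seed=1, max_models=10):
    """Build one H2OAutoML keyword-argument dict per configuration to compare."""
    return [
        dict(
            sort_metric=SORT_METRIC,
            stopping_metric=metric,
            stopping_tolerance=tol,
            max_models=max_models,
            seed=seed,
        )
        for metric, tol in product(STOPPING_METRICS, STOPPING_TOLERANCES)
    ]


def run_benchmark(train_path, target):
    """Run AutoML once per configuration and collect the leaderboards.

    Requires the h2o package and a running (or startable) H2O cluster,
    so the imports are kept inside the function.
    """
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    train = h2o.import_file(train_path)
    train[target] = train[target].asfactor()  # treat the target as categorical

    leaderboards = {}
    for cfg in benchmark_configs():
        aml = H2OAutoML(**cfg)
        aml.train(y=target, training_frame=train)
        key = (cfg["stopping_metric"], cfg["stopping_tolerance"])
        leaderboards[key] = aml.leaderboard.as_data_frame()
    return leaderboards
```

Comparing how many trees each tree-based model ends up with under the two stopping metrics would address the "does it stop later" question; that inspection step is left out of the sketch.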


AutoML: Investigate changing the default stopping metric from logloss to the sort_metric in classification #7131