h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

AutoML: expose sorting and stopping metrics as an enum for Java/Scala API #8931

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

In AutoML, sort_metric and stopping_metric are currently exposed as strings whose values unfortunately differ slightly from the metrics defined in ScoreKeeper.StoppingMetric. Sparkling Water, for good reasons, would like to consume these metrics through an enum-based API.

We need to find a way to expose an enum while keeping backwards compatibility: it will probably be necessary to add a new enum at the API level only, but we should still be able to use the ScoreKeeper.StoppingMetric enum for all internal logic, especially in Leaderboard.
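To make the idea concrete, here is a minimal sketch of an API-level enum that still accepts the legacy strings and maps onto an internal enum. All names are hypothetical: `InternalMetric` is only a stand-in for ScoreKeeper.StoppingMetric, `ApiMetric` is the new API-level enum, and the metric names and legacy strings are illustrative, not the actual H2O values.

```java
import java.util.Locale;

// Hypothetical stand-in for the internal enum (ScoreKeeper.StoppingMetric in this issue).
enum InternalMetric { AUTO, DEVIANCE, LOGLOSS, AUC, RMSE }

// Hypothetical API-level enum: each constant remembers the legacy string accepted
// by the string-based API and the internal metric used by the leaderboard logic.
enum ApiMetric {
    AUTO("auto", InternalMetric.AUTO),
    MEAN_RESIDUAL_DEVIANCE("mean_residual_deviance", InternalMetric.DEVIANCE),
    LOGLOSS("logloss", InternalMetric.LOGLOSS),
    AUC("auc", InternalMetric.AUC),
    RMSE("rmse", InternalMetric.RMSE);

    private final String legacyName;
    private final InternalMetric internal;

    ApiMetric(String legacyName, InternalMetric internal) {
        this.legacyName = legacyName;
        this.internal = internal;
    }

    // String form kept for clients that still send or expect strings.
    String legacyName() { return legacyName; }

    // Metric that all internal logic should work with.
    InternalMetric toInternal() { return internal; }

    // Backwards-compatible entry point: null or blank falls back to AUTO,
    // matching is case-insensitive, unknown values are rejected.
    static ApiMetric fromLegacyString(String value) {
        if (value == null || value.trim().isEmpty()) return AUTO;
        String key = value.trim().toLowerCase(Locale.ROOT);
        for (ApiMetric m : values()) {
            if (m.legacyName.equals(key)) return m;
        }
        throw new IllegalArgumentException("Unknown metric: " + value);
    }
}
```

With something like this, Java/Scala clients could pass `ApiMetric.AUC` directly, existing string-based clients would go through `fromLegacyString`, and only `toInternal()` would be visible to the leaderboard code.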

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: This is a request from Sparkling Water, and I believe it would benefit API consistency and maintainability.

I was about to restrict the current `sort_metric` to the metrics defined in `ScoreKeeper.StoppingMetric` until I realized that, technically, for binomial classification problems the leaderboard could also support metrics such as `accuracy`, `f1`, `f2`, `precision`, `recall`, and so on, which are defined in AUC2. This is not documented, though, and the sorting direction is currently wrong for those AUC2 metrics (although this could easily be fixed). Is this something we want to support and mention in the documentation?

If so, I would have to create another abstraction on top of the two existing enums, `ScoreKeeper.StoppingMetric` and `AUC2.ThresholdCriterion`, to be able to expose them as a single `enum` or `interface`.
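As a rough sketch of the `interface` option, assuming simplified stand-ins for the two enums: `SortMetric`, `StoppingMetricSketch`, `ThresholdMetricSketch` and `SortMetrics` are all hypothetical names, and the constants listed are only examples rather than the full contents of the real enums. Carrying the sort direction on the interface is one possible way to address the direction problem mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical common abstraction over both families of metrics.
interface SortMetric {
    String key();              // name as it would appear in the API
    boolean higherIsBetter();  // sort direction for the leaderboard
}

// Simplified stand-in for ScoreKeeper.StoppingMetric (illustrative subset).
enum StoppingMetricSketch implements SortMetric {
    LOGLOSS(false), RMSE(false), AUC(true);

    private final boolean higherIsBetter;

    StoppingMetricSketch(boolean higherIsBetter) { this.higherIsBetter = higherIsBetter; }

    @Override public String key() { return name().toLowerCase(Locale.ROOT); }
    @Override public boolean higherIsBetter() { return higherIsBetter; }
}

// Simplified stand-in for AUC2.ThresholdCriterion (illustrative subset).
// Every metric listed here is better when larger.
enum ThresholdMetricSketch implements SortMetric {
    ACCURACY, F1, F2, PRECISION, RECALL;

    @Override public String key() { return name().toLowerCase(Locale.ROOT); }
    @Override public boolean higherIsBetter() { return true; }
}

// Single lookup point that exposes both enums through the shared interface.
final class SortMetrics {
    private SortMetrics() {}

    static List<SortMetric> all() {
        List<SortMetric> metrics = new ArrayList<>();
        for (SortMetric m : StoppingMetricSketch.values()) metrics.add(m);
        for (SortMetric m : ThresholdMetricSketch.values()) metrics.add(m);
        return metrics;
    }

    static SortMetric parse(String key) {
        String normalized = key.trim().toLowerCase(Locale.ROOT);
        for (SortMetric m : all()) {
            if (m.key().equals(normalized)) return m;
        }
        throw new IllegalArgumentException("Unknown sort metric: " + key);
    }
}
```

The trade-off compared with a single merged enum is that callers lose exhaustive `switch` support, but neither existing enum would have to be duplicated.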

Otherwise, I will simply expose `sort_metric` as a `StoppingMetric`.

Any thoughts/concerns?

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-6701
Assignee: Sebastien Poirier
Reporter: Sebastien Poirier
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

https://github.com/h2oai/h2o-3/pull/3735