h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

MCC metric for deep learning #10373

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Other learning algorithms in H2O, such as GLM, already offer the Matthews correlation coefficient (MCC) as a metric, but it is not currently available for the deep learning algorithm. For highly unbalanced classification outcomes with rare positives, as often seen in disease detection, machine failure prediction, and other anomaly detection applications, a suitable metric such as MCC is critical for controlling the learning algorithm. In some applications it is not uncommon for the trivial base-rate prediction of the negative class to reach 99.5% accuracy. Rebalancing the data and using metrics suited to roughly balanced outcomes is often not enough to train models effectively.
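To make the accuracy point concrete, here is a minimal sketch in plain Python. It does not use the H2O API: the helper names `accuracy` and `mcc` and the confusion-matrix counts are made up for illustration. It assumes the common convention of returning 0 for MCC when the denominator is zero.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here), which happens when a model predicts only one class.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical data set: 1000 rows, 5 positives (0.5% base rate).

# Trivial model that always predicts the negative class.
trivial = dict(tp=0, tn=995, fp=0, fn=5)

# Hypothetical model that finds 4 of the 5 positives
# at the cost of 10 false alarms.
useful = dict(tp=4, tn=985, fp=10, fn=1)

for name, counts in (("trivial", trivial), ("useful", useful)):
    print(name, "accuracy:", accuracy(**counts), "MCC:", mcc(**counts))
```

Under these made-up counts, the trivial model has the higher accuracy (0.995 versus 0.989) but an MCC of 0, while the model that actually detects positives has a clearly positive MCC. That gap is the reason to prefer MCC over accuracy on data like this.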

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-3462
Assignee: New H2O Bugs
Reporter: Geoffrey Anderson
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A