H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Although other learning algorithms implemented in H2O, such as GLM, already offer the Matthews correlation coefficient (MCC) metric, it is not currently available for H2O's deep learning algorithm. For highly unbalanced classification outcomes (e.g., rare positives), as often seen in disease detection, machine failure, and other anomaly detection applications, a suitable metric such as MCC is critical for controlling the learning algorithm. In some applications it is not uncommon to see 99.5% accuracy from the trivial base-rate prediction that every case is negative. Rebalancing the data and using metrics suited to roughly balanced outcomes is often not enough to train models effectively.
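To make the accuracy problem concrete, here is a minimal sketch in plain Python that compares accuracy and MCC on a made-up data set with a 0.5% positive rate. It is not H2O's implementation: the helper names (`confusion_counts`, `accuracy`, `mcc`), the synthetic labels, and the convention of returning 0 when the MCC denominator is zero are all assumptions chosen for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: an undefined MCC (a zero row or column in
        # the confusion matrix) is reported as 0, i.e. no correlation.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical data: 5 positives among 1000 cases (0.5% positive rate).
y_true = [1] * 5 + [0] * 995

# Trivial base-rate model: always predict the negative class.
y_trivial = [0] * 1000

# Hypothetical useful model: finds 4 of the 5 positives, at the cost
# of 10 false alarms among the negatives.
y_useful = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 985

trivial_counts = confusion_counts(y_true, y_trivial)
useful_counts = confusion_counts(y_true, y_useful)

acc_trivial = accuracy(*trivial_counts)
mcc_trivial = mcc(*trivial_counts)
acc_useful = accuracy(*useful_counts)
mcc_useful = mcc(*useful_counts)

print(f"trivial: accuracy={acc_trivial:.3f}  mcc={mcc_trivial:.3f}")
print(f"useful:  accuracy={acc_useful:.3f}  mcc={mcc_useful:.3f}")
```

In this sketch the always-negative model reaches 99.5% accuracy with an MCC of 0, while the model that actually detects most positives has slightly lower accuracy (98.9%) but an MCC of about 0.47. A training procedure guided by accuracy would prefer the trivial model; one guided by MCC would not.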