h2oai / h2o-3


Issue with MCC metric computation #16329

Open adallak opened 1 month ago

adallak commented 1 month ago

I am reading the documentation for the MCC metric at https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/metrics.html, which describes it as "The absolute MCC (a value between 0 and 1, 0 being totally dissimilar, 1 being identical)." However, MCC takes values between -1 and 1: MCC = 0 means the classifier does no better than random guessing, and MCC = -1 means every prediction is wrong. In my tests, H2O always seems to report the absolute value of MCC, which in my opinion is incorrect. Please verify that the reported MCC is correct. Thank you.
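To make the range concrete, here is a minimal sketch that computes MCC by hand from binary confusion-matrix counts. It does not call H2O at all. The helper name `mcc` and the example counts are made up for illustration, and returning 0.0 when the denominator is zero is a common convention rather than part of the definition.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: treat the undefined case (an empty row or column) as 0.0.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts chosen to hit the three reference points.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)      # every label correct
inverted = mcc(tp=0, tn=0, fp=50, fn=50)     # every label wrong
chance = mcc(tp=25, tn=25, fp=25, fn=25)     # no better than guessing

print(perfect, inverted, chance)   # 1.0 -1.0 0.0
print(abs(inverted))               # 1.0
```

Taking the absolute value makes the fully inverted classifier indistinguishable from the perfect one, which is why the sign matters when interpreting the reported number.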

wendycwong commented 1 month ago

@adallak : You are correct. Here is a link to the definition: https://www.voxco.com/blog/matthewss-correlation-coefficient-definition-formula-and-advantages/
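For reference, the standard binary-classification definition can be written in terms of the confusion-matrix counts (true/false positives and negatives). The symbols $TP$, $TN$, $FP$, $FN$ are the usual textbook notation, not names taken from the H2O API:

```latex
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
                    {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},
\qquad -1 \le \mathrm{MCC} \le 1
```

The numerator is negative whenever $FP \cdot FN > TP \cdot TN$, so a signed MCC can fall below 0, whereas an absolute MCC is confined to $[0, 1]$.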

maurever commented 4 weeks ago

Hi @adallak. Our documentation shows a mismatch between MCC and Absolute MCC metrics. We need to clarify the API and the doc as well. Thanks for catching this!