Thanks for reporting. +1 for adding a PR for this
The fix for this was simple enough: get the covariance matrix, check the relevant elements, and calculate the coefficient from that only if necessary. However, I'm new to GitHub and may have the pull request procedure all wrong.
Thanks for the PR!
Can you add a test?
I piggybacked off the test that makes sure that 'NaN' gets converted to zero. It's still going to produce a warning if the arrays are length one, but I didn't want to introduce a length check. I made it so the warning would produce an error in the nosetest. These are all guesses and I welcome feedback.
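For context, a minimal sketch of the "warning becomes an error" idea described above, using only the standard-library `warnings` module. The helper name `call_with_warnings_as_errors` is hypothetical and is not the code from the PR; it only illustrates the pattern.

```python
import warnings


def call_with_warnings_as_errors(func, *args, **kwargs):
    """Hypothetical helper: call func and raise on any warning it emits."""
    with warnings.catch_warnings():
        # Inside this block every warning is raised as an exception,
        # so a test that calls the function fails if it warns.
        warnings.simplefilter("error")
        return func(*args, **kwargs)
```

In a test, something like `call_with_warnings_as_errors(matthews_corrcoef, y_true, y_pred)` would then be expected to fail on any version that still emits the RuntimeWarning for those inputs.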
Seems to be fixed in 6dfaa7ae254ae6228f1fc4d9182e70d8442476c8
This issue has reappeared with some later rewrites. Is multilabel classification support the cause of this? Reproducible snippet below (running v0.22):
>>> import sklearn.metrics
>>> trues = [1,0,1,1,0]
>>> preds = [0,0,0,0,0]
>>> sklearn.metrics.matthews_corrcoef(trues, preds)
C:\anaconda\lib\site-packages\sklearn\metrics\_classification.py:896: RuntimeWarning: invalid value encountered in double_scalars
mcc = cov_ytyp / np.sqrt(cov_ytyt * cov_ypyp)
0.0
If this is unintended, I will be happy to issue a PR to reintroduce the above behavior (testing for zero denominator instead of the NaN result).
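A rough sketch of what testing for a zero denominator could look like, written in pure Python so it stands alone. The function name `mcc_guarded` is hypothetical, and this is an assumption about the shape of the fix rather than the scikit-learn implementation; the variable names mirror the ones in the warning message above.

```python
import math
from collections import Counter


def mcc_guarded(y_true, y_pred):
    """Hypothetical sketch: MCC with an explicit zero-denominator check."""
    n = len(y_true)
    t_count = Counter(y_true)  # how often each label occurs in y_true
    p_count = Counter(y_pred)  # how often each label occurs in y_pred
    n_correct = sum(t == p for t, p in zip(y_true, y_pred))
    labels = set(t_count) | set(p_count)

    # Covariance-like terms, named as in the warning message above.
    cov_ytyp = n_correct * n - sum(t_count[k] * p_count[k] for k in labels)
    cov_ytyt = n * n - sum(c * c for c in t_count.values())
    cov_ypyp = n * n - sum(c * c for c in p_count.values())

    denom_sq = cov_ytyt * cov_ypyp
    if denom_sq == 0:
        # Check the denominator up front instead of dividing and
        # converting the resulting NaN afterwards.
        return 0.0
    return cov_ytyp / math.sqrt(denom_sq)
```

With the inputs from the snippet, `mcc_guarded([1, 0, 1, 1, 0], [0, 0, 0, 0, 0])` returns 0.0 without any division taking place, because `cov_ypyp` is zero when every prediction is the same class.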
I've also seen this happen recently – it most probably shouldn't.
I'm getting similar warnings being thrown. Should this issue be reopened?
Please open a new issue referring to this one, with a runnable code snippet demonstrating the issue. Thanks
Still getting this warning in 2022: RuntimeWarning: invalid value encountered in double_scalars, raised at mcc = cov_ytyp / np.sqrt(cov_ytyt * cov_ypyp)
The formula for the Matthews correlation coefficient metric involves a division. In certain cases, the denominator of this division can be 0. In this situation, one of numpy's functions called by metrics.matthews_corrcoef throws a warning:
However, as Wikipedia states on the page for the metric, "If any of the four sums in the denominator is zero, the denominator can be arbitrarily set to one; this results in a Matthews correlation coefficient of zero, which can be shown to be the correct limiting value."
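For reference, the binary form of the coefficient in terms of the confusion-matrix counts (TP, TN, FP, FN). The "four sums" in the quote are the four factors under the square root:

```latex
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
                    {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```

If any one of those factors is zero, the numerator is zero as well (for example, TP + FP = 0 forces TP = FP = 0), so the expression becomes 0/0 unless the denominator is replaced.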
I think metrics.matthews_corrcoef should detect if the denominator will be 0 (this is a trivial property to check), and if so, set it to 1, instead of triggering a runtime warning and returning the right value (0) anyway.
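A minimal sketch of that check for the binary case, assuming the confusion-matrix counts are already available. The function name `binary_mcc` is hypothetical and this is not scikit-learn's code; it only illustrates setting the denominator to 1 when one of the four sums is zero.

```python
import math


def binary_mcc(tp, tn, fp, fn):
    """Hypothetical sketch: binary MCC from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    sums = (tp + fp, tp + fn, tn + fp, tn + fn)
    if 0 in sums:
        # One of the four sums is zero: set the denominator to 1,
        # as the Wikipedia passage suggests. The numerator is zero
        # in this case, so the result is 0.
        denominator = 1.0
    else:
        denominator = math.sqrt(sums[0] * sums[1] * sums[2] * sums[3])
    return numerator / denominator
```

For instance, a classifier that always predicts the negative class on two negatives and three positives gives `binary_mcc(tp=0, tn=2, fp=0, fn=3)`, which returns 0.0 with no warning involved.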