meetthakkar88 / accord

Automatically exported from code.google.com/p/accord
0 stars 0 forks source link

MatthewsCorrelationCoefficient is incorrect #77

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The MatthewsCorrelationCoefficient property of the ConfusionMatrix class seems 
to be incorrect. In Accord, the numerator of the fraction is currently:

(truePositives * trueNegatives) 

According to Wikipedia 
(http://en.wikipedia.org/wiki/Matthews_correlation_coefficient) the numerator 
should be:

(truePositives * trueNegatives) - (falsePositives * falseNegatives)

There's also an an integer overflow bug here. For very large numbers of 
true/false positives/negatives, the property will return NaN.

What steps will reproduce the problem?
1. ConfusionMatrix cf = new ConfusionMatrix(1, 1, 2, 6);

cf.MatthewsCorrelationCoefficient should be 0.218, but is 0.327.

I tested and confirmed that metrics.matthews_corrcoef from sklearn also returns 
0.218 for the same input.

2. ConfusionMatrix cf = new ConfusionMatrix(100, 100, 200, 600);

cf.MatthewsCorrelationCoefficient should be 0.218, but is NaN.

What version of the product are you using? On what operating system?
2.10.0

Replacing the body of the property with the following snippet fixes these 
problems for me:

long tp = truePositives;
long tn = trueNegatives;
long fp = falsePositives;
long fn = falseNegatives;

long numerator = (tp * tn) - (fp * fn);
double denominator = Math.Sqrt((tp + fp) * (tp + fn) *
                               (tn + fp) * (tn + fn));

if (denominator != 0.0)
    return numerator / denominator;
else
    return 0.0;

Original issue reported on code.google.com by Brian.Drupieski on 12 Oct 2013 at 7:00

GoogleCodeExporter commented 9 years ago
Thanks for reporting the issue. Will be fixed on the next release.

Original comment by cesarso...@gmail.com on 18 Oct 2013 at 9:03

GoogleCodeExporter commented 9 years ago
Fixed in 2.11

Original comment by cesarso...@gmail.com on 27 Oct 2013 at 4:52