maciejkula / rustlearn

Machine learning crate for Rust
Apache License 2.0

Correct the ROC AUC computations and add relevant tests. #35

Closed potocpav closed 7 years ago

potocpav commented 7 years ago

The old ROC AUC computations gave wrong results, at least when y_hat contains duplicate values. I added a test demonstrating the issue and fixed the computations.

The table below summarizes the resulting AUCs for each test. Tests 0 and 1, and likewise tests 2 and 3, differ only in the order of the data points, so each pair should return the same AUC. I verified the new values by hand.

EDIT: I also checked them against Python's sklearn.metrics.roc_auc_score.

| test | old AUC | new AUC |
| --- | --- | --- |
| 0 | Ok(1) | Ok(0.75) |
| 1 | Ok(0.25) | Ok(0.75) |
| 2 | Ok(0.625) | Ok(0.875) |
| 3 | Ok(1) | Ok(0.875) |
| 4 | Ok(0.16666666) | Ok(0.5) |
| 5 | Ok(NaN) | Ok(0.25) |

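
For readers who want to see why duplicate y_hat values matter, here is a minimal sketch of a tie-aware AUC. It is not the code from this PR and not rustlearn's API: `roc_auc_pairwise` is a hypothetical helper, and the quadratic pairwise formulation (the Mann-Whitney U view of AUC) is chosen only because it is easy to check by hand. Each (positive, negative) pair scores 1 if the positive is ranked higher, 0.5 if the two scores are equal, and 0 otherwise, so the result cannot depend on the order of the data points.

```rust
/// Tie-aware ROC AUC computed by comparing every (positive, negative) pair.
/// Hypothetical helper for illustration only; not part of rustlearn.
/// Returns `None` if the inputs differ in length or only one class is present.
fn roc_auc_pairwise(y_true: &[bool], y_hat: &[f64]) -> Option<f64> {
    if y_true.len() != y_hat.len() {
        return None;
    }

    // Split the predicted scores by true class.
    let mut positives = Vec::new();
    let mut negatives = Vec::new();
    for (&label, &score) in y_true.iter().zip(y_hat.iter()) {
        if label {
            positives.push(score);
        } else {
            negatives.push(score);
        }
    }

    // AUC is undefined when one of the classes is missing.
    if positives.is_empty() || negatives.is_empty() {
        return None;
    }

    // A correctly ordered pair counts 1, a tied pair counts 0.5.
    let mut wins = 0.0;
    for &p in &positives {
        for &n in &negatives {
            if p > n {
                wins += 1.0;
            } else if p == n {
                wins += 0.5;
            }
        }
    }

    Some(wins / (positives.len() * negatives.len()) as f64)
}
```

With made-up inputs (not the PR's test data), labels `[false, true, true, false]` and scores `[0.1, 0.4, 0.4, 0.4]` give two correctly ordered pairs and two tied pairs, so the AUC is (2 + 2 × 0.5) / 4 = 0.75, and reordering the points leaves that value unchanged. A production implementation would normally sort once and accumulate the curve instead of comparing all pairs.
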
maciejkula commented 7 years ago

Thanks for the fix!

potocpav commented 7 years ago

I saw your different coding style and tried at first to just correct your code, but then I got lazy and copied-and-pasted the code from my own project... So yeah, is this better?

maciejkula commented 7 years ago

Great, thanks a lot!