Closed: m-makarious closed this issue 4 years ago.
Have you ever seen this happen with train?
Hey Mike, there was a similar-but-not-exactly-the-same issue where the ROC curve being output did not correspond to the best algorithm nominated in train.
That issue is closed now, but you can find more information there (or report anything related): https://github.com/GenoML/genoml2/issues/9
Thinking there are a lot of similarities between the two issues, though!
I think the bug is that some algorithms are calculating AUC using `predict` versus `predict_proba` in a few instances.
I was only able to reproduce this issue twice: one major ~5% difference using `SGDClassifier` and one difference of less than 0.5% using `LinearDiscriminantAnalysis`, which in the latter case could have been a rounding error (quick sketch below).
I'll keep digging, but if you could look into this as well @m-makarious that would be great, thanks!!!
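For context, here is a minimal toy sketch of how the two calls diverge; this is illustrative scikit-learn code on synthetic data, not GenoML's actual implementation. `SGDClassifier` is included because, with its default hinge loss, it has no `predict_proba` at all, only `decision_function`, which makes a silent fallback to `predict` easy to introduce.

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# AUC from hard 0/1 labels vs. ranked probabilities: predict() collapses the
# ROC curve to a single operating point, so the two values generally differ.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(metrics.roc_auc_score(y_test, clf.predict(X_test)))              # from predict
print(metrics.roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))  # from predict_proba

# SGDClassifier (default hinge loss) exposes no predict_proba, only
# decision_function, so a scorer that falls back to predict() loses all
# ranking information and the AUC can shift by several percent.
sgd = SGDClassifier(random_state=42).fit(X_train, y_train)
print(metrics.roc_auc_score(y_test, sgd.predict(X_test)))              # from predict
print(metrics.roc_auc_score(y_test, sgd.decision_function(X_test)))    # from decision scores
```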
We are defining the AUC in two different ways. I prefer using the sklearn metrics default.
See the lines with `rocauc = metrics.roc_auc_score(self.y_test, test_predictions)` from sklearn instead of `roc_auc = metrics.auc(fpr, tpr)`.
Want me to redo the plots sticking to `metrics.roc_auc_score`?
Luckily, I think this will solve the problem!
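To illustrate with the same kind of toy setup (again, illustrative code rather than GenoML's): the two definitions agree whenever they are handed the same probability scores, so any discrepancy has to come from what gets passed in rather than from the formula itself.

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

probas = clf.predict_proba(X_test)[:, 1]

# Curve-based AUC and roc_auc_score give the same number on the same scores...
fpr, tpr, _ = metrics.roc_curve(y_test, probas)
print(metrics.auc(fpr, tpr))                  # trapezoidal AUC of the plotted curve
print(metrics.roc_auc_score(y_test, probas))  # direct AUC, identical value

# ...but they diverge as soon as one of them is fed hard labels from predict().
print(metrics.roc_auc_score(y_test, clf.predict(X_test)))
```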
Sounds good to me! Let me know how that works out ☺️
`roc_auc = auc(fpr, tpr)` needs to be changed to `roc_auc = metrics.roc_auc_score(self.y_test, test_predictions)`.
Changed the test and train scripts to use the sklearn metrics default (`roc_auc = metrics.roc_auc_score(self.y_test, test_predictions)` in place of `roc_auc = auc(fpr, tpr)`), but this, at least on my end, did not resolve the inconsistency between the ROC plot and the performance metrics generated by the test script.
Will keep investigating!
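One hedged guess at why the one-line swap is not enough (the variable names below are illustrative, not taken from the GenoML scripts): if `test_predictions` still comes out of `.predict()`, then `roc_auc_score` is scoring hard labels while the plotted curve is built from probabilities, so the two numbers will keep disagreeing until both paths share the same score column.

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

test_predictions = clf.predict(X_test)         # hard 0/1 labels
test_probas = clf.predict_proba(X_test)[:, 1]  # class-1 probabilities

# The swapped-in call still disagrees with a probability-based curve if it
# only ever sees the hard labels...
print(metrics.roc_auc_score(y_test, test_predictions))
# ...and only matches once it is given the same probabilities the curve uses.
print(metrics.roc_auc_score(y_test, test_probas))
```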
Weird. Let me know what you find. I’m doing some batch testing to see.
Perhaps an embarrassingly simple fix, but the following changes have been implemented:
`roc_auc = metrics.roc_auc_score(self.y_test, test_predictions)`, as @mikeDTI pointed out.
The issue should be fixed now, but let me know if you run into additional issues @h-leonard!
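For anyone landing here later, the pattern that keeps the PNG and the CSV consistent is to compute the class probabilities once and derive both the plotted curve and the reported score from them. A minimal sketch of that pattern, assuming a scikit-learn classifier with `predict_proba`, not GenoML's actual plotting code:

```python
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Compute the class-1 probabilities once...
test_probas = clf.predict_proba(X_test)[:, 1]

# ...then derive both the plotted curve and the reported metric from them,
# so the ROC figure and the performance-metrics table cannot drift apart.
fpr, tpr, _ = metrics.roc_curve(y_test, test_probas)
roc_auc = metrics.roc_auc_score(y_test, test_probas)

plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.3f}")
plt.plot([0, 1], [0, 1], linestyle="--")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.savefig("toy_ROC.png")
```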
Screenshot of (finally) consistent reporting:
Yo! Great work!
Please make sure that this is a bug.
System information:
Describe the current behavior: When testing a model on an unseen dataset (after munging, training, and re-training on shared features)... the output results do not match between the `*.testedModel_allSamples_performanceMetrics.csv` and the `*.testedModel_allSamples_ROC.png` (see screenshot below for an example of this).
Describe the expected behavior: ...they should match!
Code to reproduce the issue: Provide a reproducible test case that is the bare minimum necessary to generate the problem.
Going through the training, harmonizing, and testing steps outlined in the README will reproduce the issue. Attached is an image. Thanks for reporting this, @h-leonard!
Other Information / Logs: Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.