rhiever / sklearn-benchmarks

A centralized repository to report scikit-learn model performance across a variety of parameter settings and data sets.
MIT License

Useful metrics to gather for the benchmark #9

Closed: rhiever closed this issue 7 years ago

rhiever commented 8 years ago

Currently, we're only calculating balanced accuracy for the supervised classification tasks. What would be useful metrics to gather for the benchmark?
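For reference, a minimal sketch of computing balanced accuracy with recent versions of scikit-learn; the `y_true` and `y_pred` arrays here are made-up placeholders, not benchmark output:

```python
# Minimal sketch: balanced accuracy with scikit-learn.
# y_true and y_pred are illustrative placeholders, not benchmark data.
from sklearn.metrics import balanced_accuracy_score

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]

# Balanced accuracy is the macro-average of recall across classes,
# so it is not inflated by class imbalance the way raw accuracy is.
print(balanced_accuracy_score(y_true, y_pred))  # 0.5
```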

AkshayVarik commented 8 years ago

Randy, how about sensitivity and specificity for the classification problems? Misclassification error is what we compute currently. We could also plot the ROC curves, as sketched below.
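A quick sketch of what that could look like with scikit-learn; the labels and scores below are placeholders (`y_score` would normally come from something like `predict_proba`):

```python
# Sketch: sensitivity/specificity from the confusion matrix, plus an ROC curve.
# y_true, y_pred, and y_score are illustrative placeholders.
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, roc_curve

y_true  = [0, 0, 1, 1, 1, 0]
y_pred  = [0, 1, 1, 1, 0, 0]
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.3]  # e.g. clf.predict_proba(X)[:, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # true positive rate (recall)
specificity = tn / (tn + fp)  # true negative rate
print(sensitivity, specificity)

# ROC curve: true positive rate vs. false positive rate across thresholds.
fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr)
plt.xlabel('False positive rate')
plt.ylabel('True positive rate (sensitivity)')
plt.show()
```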

rhiever commented 8 years ago

I've been thinking about other metrics, such as AUC (area under the ROC curve), and the metrics you mention may be useful as well.
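For completeness, a sketch of computing AUC directly from predicted scores, with placeholder data again:

```python
# Sketch: AUC computed directly from predicted scores.
# y_true and y_score are illustrative placeholders.
from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 1, 1, 1, 0]
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.3]

print(roc_auc_score(y_true, y_score))
```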

AkshayVarik commented 8 years ago

Yeah, once we plot the ROC curve, the AUC follows naturally from it, I believe. A chi-squared test could also be done, but that would mean fitting models with different sets of predictors to see which ones add little to the residual deviance; basically, it would help with dimensionality reduction, which I don't believe we've done at all. I was also wondering whether the variables included in a model could influence our choice of parameters. Just a thought.
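One way to sketch the chi-squared idea with scikit-learn: note that sklearn's `chi2` scores individual non-negative features against the target, which is not the same as a deviance-based comparison of nested models, but it serves the same dimensionality-reduction goal. The iris data here is just a stand-in:

```python
# Sketch: chi-squared feature selection for dimensionality reduction.
# sklearn's chi2 scores each non-negative feature against the target;
# this differs from a residual-deviance comparison of nested models,
# but targets the same goal of dropping uninformative predictors.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)

selector = SelectKBest(chi2, k=2)        # keep the 2 highest-scoring features
X_reduced = selector.fit_transform(X, y)

print(selector.scores_)   # per-feature chi-squared statistics
print(X_reduced.shape)    # (150, 2)
```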