EducationalTestingService / skll

SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.
http://skll.readthedocs.org
Other
551 stars 67 forks source link

Correlation metrics not working as expected for binary classification #545

Closed desilinguist closed 5 years ago

desilinguist commented 5 years ago

Back when we wrote an early version of SKLL, we wanted a way to use correlation metrics as tuning objectives for binary classification problems in such a way that we could use the probability of the positive class as the input to the correlation function rather than just the label itself. This should yield a more informative value for the objective.

The way we we achieved this was as follows: before starting the grid search, if we see that we had a correlation metric and a classifier, we monkey-patch the estimator being passed to the GridSearchCV instance - replacing its predict() method with the custom _predict_binary() function. When this function is called, it checks whether the classification being effected is binary and if so, it uses the probability of the positive class as the prediction and. If it's not binary, it outputs the most likely label as the prediction.

However, this monkey-patching is no longer working as this notebook shows. Basically, even for the binary case, the most likely label is being used when computing pearson.

To fix this, we should no longer need this monkey-patching. We can simply do the following:

With these modifications, all use cases with pearson as the objective function will work seamlessly for the user:

Case 1: binary classification, probability option enabled.

Case 2: binary classification, probability option not enabled.

Case 3: multi-class classification, probability option enabled.

Case 4: multi-class classification, probability option not enabled.

Case 5: regression

We can do the same for the other two correlation metrics as well (spearman and kendalltau).

@aoifecahill @mulhod @bndgyawali @jbiggsets does all of this make sense?

mulhod commented 5 years ago

I think this makes sense. It's not working as it is and it's hacky. Your fix seems like it will fix both of those issues.

mulhod commented 5 years ago

Could we include basically the test you have written to ensure that future changes don't break this functionality?

desilinguist commented 5 years ago

Yup, that's my plan!

desilinguist commented 5 years ago

Addressed by #551