gitter-lab / ml4bio

A graphical interface for sklearn classification to introduce machine learning to biologists
MIT License

Averaging precision and recall #37

Open agitter opened 3 years ago

agitter commented 3 years ago

When there are two class labels, precision and recall (and possibly other metrics) are reported in a non-intuitive way. There is no inherent positive class, so each class is treated as the positive class in turn and the two resulting metric values are averaged. This is especially confusing for workshop participants when a classifier predicts that all instances belong to the same class.
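A small sketch of the averaging described above, written in plain Python rather than taken from the ml4bio code. The labels `"healthy"` and `"disease"`, the helper `precision_recall`, the unweighted average, and the choice to report an undefined precision as 0.0 are all assumptions made for illustration.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: 8 healthy and 2 disease instances.
y_true = ["healthy"] * 8 + ["disease"] * 2
# A classifier that predicts every instance to be in the same class.
y_pred = ["healthy"] * 10

# Treat each class as positive in turn, then average the two values.
per_class = {
    label: precision_recall(y_true, y_pred, label)
    for label in sorted(set(y_true))
}
avg_precision = sum(p for p, _ in per_class.values()) / len(per_class)
avg_recall = sum(r for _, r in per_class.values()) / len(per_class)

print(per_class)      # {'disease': (0.0, 0.0), 'healthy': (0.8, 1.0)}
print(avg_precision)  # 0.4
print(avg_recall)     # 0.5
```

With `"healthy"` as the positive class the classifier has precision 0.8 and recall 1.0, and with `"disease"` it has 0.0 for both. The averaged values of 0.4 and 0.5 match neither view, which may be why the reported numbers are hard to explain in a workshop.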

One solution may be to let the user indicate the positive class in the GUI or the dataset. Another would be to pick one class to be positive in a deterministic way, such as by sorting the class labels.
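Both options could share one code path, sketched below in plain Python. The function `pick_positive_class` and its `positive` argument are hypothetical names, not part of ml4bio. Taking the last label in sorted order as the default is also an assumption, chosen so that `1` is positive for 0/1 labels.

```python
def pick_positive_class(labels, positive=None):
    """Return the class to treat as positive for a two-class dataset.

    `positive` stands in for a user choice from the GUI or the dataset.
    Without it, fall back to the last label in sorted order (an assumption).
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError(f"expected exactly 2 class labels, got {len(classes)}")
    if positive is not None:
        if positive not in classes:
            raise ValueError(f"{positive!r} is not one of {classes}")
        return positive
    return classes[-1]


labels = ["healthy", "disease", "healthy", "healthy"]

print(pick_positive_class(labels))                      # healthy
print(pick_positive_class(labels, positive="disease"))  # disease
print(pick_positive_class([0, 1, 1, 0]))                # 1
```

The sorted fallback is reproducible but not always what a biologist would call positive: here it selects `"healthy"`, so an explicit choice would still be useful when the labels are strings.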