Describe the bug
According to the scikit-learn classifier glossary entry, every classifier should accept non-numerical classes, such as string labels, and expose a classes_ attribute after fit, with the mapping from class labels to numeric indices. As we want to imitate their API, this should also be done for our classifiers.
Currently we have an internal utility function _classifier_get_classes that uses LabelEncoder to obtain that mapping, but in some cases it is not used, and in others the attribute is not exposed under that name.
The type hints should also be fixed to take this into account. Ideally, we should try to leverage the type system so that forgetting to call _classifier_get_classes in a future classifier produces an error.
We should check each classifier and fix them in separate PRs when possible:
[x] KNeighborsClassifier
[x] RadiusNeighborsClassifier
[x] NearestCentroid
[x] DTMClassifier
[x] MaximumDepthClassifier
[x] DDClassifier
[x] DDGClassifier
[x] LogisticRegression
[x] QuadraticDiscriminantAnalysis
We should check that this behaviour is correct in each classifier. For that purpose, a shared test should be written that can then be run for each estimator. This can be done with either unittest or pytest, whichever is easier.
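A shared test of this kind could look roughly as follows. The example uses two scikit-learn estimators as stand-ins so it is self-contained; the real test would parametrize over our own classifiers instead.

```python
import numpy as np
import pytest
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid


# Hypothetical shared test: the parametrize list would contain our
# classifiers; scikit-learn ones are used here only as stand-ins.
@pytest.mark.parametrize(
    "estimator",
    [KNeighborsClassifier(n_neighbors=1), NearestCentroid()],
)
def test_classes_attribute(estimator):
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array(["red", "red", "blue", "blue"])  # non-numerical labels

    estimator.fit(X, y)

    # After fit, classes_ must hold the sorted distinct labels...
    np.testing.assert_array_equal(estimator.classes_, ["blue", "red"])
    # ...and predictions must come from the original label set.
    assert set(estimator.predict(X)) <= set(estimator.classes_)
```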