Closed drewoldag closed 1 month ago
Chatted with @AmandaWasserman about this, and we'll take the approach of breaking apart the `predict` method into `predict_class` and `predict_probability`, which will return, respectively, the list of classes and the list of per-class probabilities for each input data sample.
This will also have the benefit of making it easier to implement subclasses that are not wrappers over sklearn-based classifiers.
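A minimal sketch of what the split interface could look like, assuming a hypothetical abstract base class (`Classifier`) and a hypothetical toy subclass (`ThresholdClassifier`) that does not wrap sklearn. Only the method names `predict_class` and `predict_probability` come from this thread; everything else, including the use of plain lists instead of arrays, is an assumption made to keep the example dependency-free.

```python
import math
from abc import ABC, abstractmethod


class Classifier(ABC):  # hypothetical base-class name, not from the project
    @abstractmethod
    def predict_probability(self, data):
        """Return one list of per-class probabilities for each input sample."""

    def predict_class(self, data):
        """Return one class index per sample (default: the most probable class)."""
        return [row.index(max(row)) for row in self.predict_probability(data)]


class ThresholdClassifier(Classifier):  # hypothetical, not sklearn-based
    """Two-class toy model: a logistic curve centred on a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict_probability(self, data):
        probabilities = []
        for value in data:
            # Probability of the "above threshold" class.
            p_high = 1.0 / (1.0 + math.exp(-(value - self.threshold)))
            probabilities.append([1.0 - p_high, p_high])
        return probabilities
```

In this sketch a subclass only has to implement `predict_probability`; it can still override `predict_class` if it has a cheaper or different way to pick a class.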
This was addressed in PR #54
From what I can see, `self.predicted_class` is only ever assigned, but its values are never used. If that's really the case, then we can simplify a lot of code in both `database.py` and the API definition for all the classifiers.
Currently the classifiers have a `predict` method that returns 2 arrays: 1) the list of predicted classes and 2) the list of class probabilities for each input. If we don't need the list of predicted classes, there is a lot of code that can be cleaned up, and it would reduce the computation needed (currently each prediction is run twice, even though the predicted classes are just the argmax of the class probabilities).
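To illustrate the argmax point, here is a small sketch showing how the predicted classes can be recovered from a single probability pass, so the model never has to be run a second time. The helper name `classes_from_probabilities`, the probabilities, and the labels are all made up for illustration and are not taken from the project.

```python
def classes_from_probabilities(probabilities, class_labels):
    """Pick the most probable label for each sample from one probability pass."""
    return [
        class_labels[max(range(len(row)), key=row.__getitem__)]
        for row in probabilities
    ]


# Hypothetical output of one prediction run: one row per input sample.
probabilities = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
labels = ["a", "b", "c"]  # hypothetical class labels

print(classes_from_probabilities(probabilities, labels))  # ['b', 'a']
```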