Closed tupini07 closed 5 years ago
After some testing it seems that:

- `sklearn.VotingClassifier` can only be composed of classifiers which have a `predict_proba` method, at least when voting on predicted probabilities (meaning we can't use `LSVM`).
- The performance of `VotingClassifier` is very similar to that of our method `union average` (which does use `LSVM`). It actually has slightly better performance with a similar standard deviation (see table below).

However, there seems to be an issue with our method when removing `LSVM` from the pool of classifiers: performance drops sharply on every metric, as can be seen below. Rather than debugging why this may be happening, we'll just go ahead and use sklearn's implementation. And since we won't be using `LSVM` for this ensemble, we'll add a new classifier to the pool so as to maintain diversity (issue #365).

Method | Precision(Std) | Recall(Std) | F1(Std) |
---|---|---|---|
Union Average | .895(.003) | .967(.001) | .930(.001) |
Union Average (no LSVM) | .303(.010) | .515(.012) | .381(.010) |
VotingClassifier | .907(.007) | .972(.006) | .938(.001) |

NOTE: these metrics were obtained by running the `evaluate` procedure using `discogs/musician` as target.

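For reference, here is a minimal sketch of the `predict_proba` restriction mentioned in this comment. It is not the project's code: the synthetic dataset and the three pool members (`LogisticRegression`, `GaussianNB`, `RandomForestClassifier`) are assumptions chosen only for illustration, and `LinearSVC` stands in for what is called LSVM here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

# Synthetic binary data, used only for illustration (not the discogs/musician target).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# LinearSVC exposes decision_function but no predict_proba,
# so it cannot contribute probabilities to a soft vote.
lsvm_has_proba = hasattr(LinearSVC(), "predict_proba")

# Hypothetical pool: every member implements predict_proba.
soft_voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",  # average the class probabilities of all members
)
soft_voter.fit(X, y)

probabilities = soft_voter.predict_proba(X)  # averaged probabilities, one row per sample
predictions = soft_voter.predict(X)          # class with the highest averaged probability
```

With `voting="hard"` the ensemble takes a majority vote over each member's `predict` output instead, so `predict_proba` is not needed by the members in that mode.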
After reading the documentation for sklearn.VotingClassifier it seems that it does exactly what our implementation of super confident predictions (#305) does. The only difference is that sets of predictions are always joined by union.
This task is to evaluate the performance of `sklearn.VotingClassifier`. If it is better than or the same as our current implementation, then it would be a good idea to replace our implementation with this one. That would reduce the amount of code in the project, and we would no longer need to maintain this functionality.
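A rough sketch of how such a comparison could be run with scikit-learn alone, reporting each metric as mean and standard deviation across folds. This is not the project's `evaluate` procedure: the `summarize` helper, the synthetic data, the fold count, and the two pool members are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB

METRICS = ("precision", "recall", "f1")


def summarize(estimator, X, y, folds=5):
    """Return {metric: (mean, std)} over k-fold cross-validation (hypothetical helper)."""
    scores = cross_validate(estimator, X, y, cv=folds, scoring=METRICS)
    return {
        metric: (scores[f"test_{metric}"].mean(), scores[f"test_{metric}"].std())
        for metric in METRICS
    }


# Synthetic stand-in for the real training set.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

voter = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)), ("nb", GaussianNB())],
    voting="soft",
)

summary = summarize(voter, X, y)
for metric, (mean, std) in summary.items():
    # Same "value(std)" layout as the results table in this thread.
    print(f"{metric}: {mean:.3f}({std:.3f})")
```

Running the same helper on the current implementation, wrapped as a scikit-learn compatible estimator, would give directly comparable numbers.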