@PhilipMay Hi,
Yes, 'soft' voting is usually better than 'hard' voting. So far we have only considered hard voting (majority voting) for simplicity. However, now that the library is becoming more mature, we plan to allow other combination methods based on probability estimates in upcoming releases.
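For a concrete illustration of why soft voting can help (a minimal sketch with made-up probabilities, not DESlib code): a confident minority classifier can outweigh an uncertain majority.

```python
import numpy as np

# Per-classifier class probabilities for one sample (made-up values):
# two classifiers lean weakly towards class 0, one strongly towards class 1.
probas = np.array([[0.6, 0.4],
                   [0.6, 0.4],
                   [0.1, 0.9]])

hard_votes = probas.argmax(axis=1)            # [0, 0, 1]
hard_pred = np.bincount(hard_votes).argmax()  # majority vote -> class 0

soft_pred = probas.mean(axis=0).argmax()      # averaged probs [0.43, 0.57] -> class 1
```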
In fact, we have already implemented, in the `utils.aggregation` module, some utility routines for standard probability-based combination methods (averaging, max, product, etc.). So one of the plans for future releases is to let the user choose the combination method used to aggregate the outputs of the pool of classifiers, instead of always using the "normal" voting scheme.
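As a sketch of what such probability-based combination rules compute (plain NumPy, not the actual `utils.aggregation` code):

```python
import numpy as np

def combine(probas, rule="average"):
    """Fuse an (n_classifiers, n_classes) probability matrix for one
    sample using a standard combination rule, and return the class index."""
    rules = {
        "average": probas.mean(axis=0),
        "max": probas.max(axis=0),
        "min": probas.min(axis=0),
        "median": np.median(probas, axis=0),
        "product": probas.prod(axis=0),
    }
    return rules[rule].argmax()
```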
Hi @Menelau, did you add any soft-voting behavior since your last answer? Thanks, Philip
Hello @PhilipMay ,
I have it on a development branch but have not pushed it to master yet. I haven't decided on the best way to expose this functionality.
One option would be to mimic sklearn's `VotingClassifier`, which allows either 'hard' or 'soft' voting. That has the benefit of being straightforward and easy to use (which I like), but it would not allow choosing among different aggregation rules (product, median, max, etc.).
Another option is to let the user pass a string selecting any of the aggregation functions (with hard voting as the default), which is more flexible but adds complexity (and I'm not sure users would really benefit from this feature).
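To make the trade-off concrete, the two interfaces might look like this (option 1 uses sklearn's real API for reference; option 2 is a purely hypothetical DESlib signature, shown only for comparison):

```python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Option 1: sklearn-style interface -- a single 'voting' switch.
clf = VotingClassifier(
    estimators=[("lr", LogisticRegression()), ("dt", DecisionTreeClassifier())],
    voting="soft",  # or "hard"
)

# Option 2 (hypothetical DESlib interface): a string selecting any fusion rule.
# des = KNORAU(pool_classifiers, aggregation="product")  # "average", "median", "max", ...
```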
Do you have any preference or suggestion? Once we settle on that, I can send a PR with this functionality tomorrow.
What about soft voting? Is it implemented? Why is this closed?
Well, it was closed for lack of activity. My last comment was on May 12 and has not gotten a response since.
Ahh ok.
> I have it on a development branch but have not pushed to master yet.
I thought you were developing it.
Hi, I saw that your DES implementations (KNORAU, KNORAE and DESP) are using `.predict` of the `pool_classifiers`. But `.predict` only returns 0 or 1 for a binary classifier. Why don't you call `.predict_proba` to get probabilities and work with them? I mean, with "normal" voting ensembles, hard voting is not as good as soft voting. Isn't that the same case here? Would it lead to a better result to work with probabilities instead of binary values?
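For example, a quick scikit-learn check (any probabilistic classifier would do) shows the difference between the two outputs:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)
clf = LogisticRegression().fit(X, y)

print(clf.predict(X[:2]))        # hard labels, e.g. [0 1]
print(clf.predict_proba(X[:2]))  # per-class probabilities, e.g. [[0.97 0.03], [0.10 0.90]]
```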
I would be happy about a short explanation.
PS: Thanks a lot for your nice library and your contribution!
Thanks Philip