initial attempt to implementing hard majority voting

TorchEnsemble-Community / Ensemble-Pytorch

A unified ensemble framework for PyTorch to improve the performance and robustness of your deep learning model.

https://ensemble-pytorch.readthedocs.io

BSD 3-Clause "New" or "Revised" License

initial attempt to implementing hard majority voting #126

Closed gardberg closed 2 years ago

gardberg commented 2 years ago

This is a first attempt at implementing hard majority voting, as mentioned in #118.

For a brief description of soft and hard voting, see sklearn's definition.

voting : {'hard', 'soft'}, default='hard'

If ‘hard’, uses predicted class labels for majority rule voting. Else if ‘soft’, predicts the class label based on the argmax of the sums of the predicted probabilities, which is recommended for an ensemble of well-calibrated classifiers.
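To make the difference concrete, here is a small plain-Python sketch of one sample where the two rules disagree. The probabilities are made up for illustration and none of the variable names come from sklearn or torchensemble.

```python
# Hypothetical predicted probabilities from three classifiers for one sample
# (two classes). The numbers are invented purely for illustration.
probas = [
    [0.45, 0.55],
    [0.45, 0.55],
    [0.90, 0.10],
]

n_classes = len(probas[0])

# Hard voting: each classifier casts one vote for its own argmax class,
# and the class with the most votes wins.
votes = [max(range(n_classes), key=p.__getitem__) for p in probas]
counts = [votes.count(c) for c in range(n_classes)]
hard_label = max(range(n_classes), key=counts.__getitem__)

# Soft voting: sum the probabilities per class, then take the argmax.
sums = [sum(p[c] for p in probas) for c in range(n_classes)]
soft_label = max(range(n_classes), key=sums.__getitem__)

print("hard:", hard_label, "soft:", soft_label)
```

Two classifiers lean slightly towards class 1, so hard voting picks class 1, while the one very confident classifier pulls the summed probabilities towards class 0 under soft voting.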

For hard voting, the vector proba returned by the forward method of the VotingClassifier class now becomes a one-hot vector, with a 1 at the class that received the majority of the base estimators' votes.
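Roughly, the idea can be sketched as below. This is not the code from the PR: `hard_majority_vote` is a hypothetical helper name, and it assumes each base estimator returns a `(batch_size, n_classes)` tensor of scores.

```python
import torch
import torch.nn.functional as F


def hard_majority_vote(outputs):
    """Hypothetical helper: one-hot majority vote over base estimator outputs.

    `outputs` is assumed to be a list of (batch_size, n_classes) tensors.
    """
    stacked = torch.stack(outputs)      # (n_estimators, batch_size, n_classes)
    n_classes = stacked.size(2)
    votes = stacked.argmax(dim=2)       # class index voted by each estimator
    # Count the votes per class for every sample.
    counts = F.one_hot(votes, num_classes=n_classes).sum(dim=0)
    # Ties are resolved by whatever argmax returns; no special handling here.
    winners = counts.argmax(dim=1)
    return F.one_hot(winners, num_classes=n_classes).float()


# Three made-up estimators, two samples, three classes.
outputs = [
    torch.tensor([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]),
    torch.tensor([[0.2, 0.5, 0.3], [0.2, 0.3, 0.5]]),
    torch.tensor([[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]),
]
proba = hard_majority_vote(outputs)
print(proba)
```

For the first sample the votes are 1, 1, 0 and for the second 0, 2, 2, so the rows of `proba` are one-hot at class 1 and class 2 respectively.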

Worth considering is that torch.max is called in the evaluate method, which for hard voting essentially just recovers the index of the 1 in proba from the forward method. That works, so maybe it isn't a problem?
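A quick check of that behaviour, assuming the call has the usual `torch.max(proba, 1)` shape (the exact line in evaluate may differ), with a made-up one-hot `proba`:

```python
import torch

# Made-up one-hot output, as forward would return it under hard voting.
proba = torch.tensor([[0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])

# torch.max along dim 1 returns (values, indices); the indices are the
# positions of the 1s, i.e. the predicted class labels.
values, predicted = torch.max(proba, 1)
print(predicted)
```

So the same evaluation code can serve both soft and hard voting, since the indices returned for a one-hot row are simply the winning classes.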

Any suggestions for improvements are very welcome!

xuyxu commented 2 years ago

Thanks @LukasGardberg, don't worry about the CI problem. I will leave a detailed comment as soon as possible.

gardberg commented 2 years ago

Yes, a bit, but never with pytest. I'm thinking I'll just imitate the tests that already exist. Is there anything specific I should test for?

Also, when running the tests locally I seem to be getting an error:

AttributeError: 'NeuralForestClassifier' object has no attribute 'voting_strategy'

I tried adding an init method to the NeuralForestClassifier class, but am a bit unsure how to deal with having two superclasses, and how to properly handle this problem. Do you have any advice for how to fix it :)? Sorry for needing some guidance haha

xuyxu commented 2 years ago

Thanks for your great work @LukasGardberg ! Kind of busy these days, and I will take a look in a few days.

xuyxu commented 2 years ago

Meanwhile, please click the Details button of the CI check torchensemble-CI / build (ubuntu-latest, 3.7) (pull_request) to see the pytest results. Since the neural forest class inherits from voting, its __init__ method should be modified accordingly.
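As a generic illustration of the pattern (not the actual torchensemble classes), a subclass with two parents can call each parent's `__init__` explicitly so that attributes such as `voting_strategy` are always set. All class names and parameters below are hypothetical.

```python
class ToyTreeBase:
    """Hypothetical first parent holding tree-related settings."""

    def __init__(self, n_estimators, depth=5):
        self.n_estimators = n_estimators
        self.depth = depth


class ToyVoting:
    """Hypothetical second parent holding the voting strategy."""

    def __init__(self, voting_strategy="soft"):
        if voting_strategy not in ("soft", "hard"):
            raise ValueError("voting_strategy must be 'soft' or 'hard'")
        self.voting_strategy = voting_strategy


class ToyNeuralForest(ToyTreeBase, ToyVoting):
    def __init__(self, voting_strategy="soft", **kwargs):
        # Initialise both parents explicitly. If the second call is skipped,
        # accessing self.voting_strategy later raises AttributeError.
        ToyTreeBase.__init__(self, **kwargs)
        ToyVoting.__init__(self, voting_strategy=voting_strategy)


model = ToyNeuralForest(n_estimators=3, voting_strategy="hard")
print(model.voting_strategy, model.n_estimators, model.depth)
```

Cooperative `super().__init__(**kwargs)` chains are another option, but they require every parent to accept and forward keyword arguments, so the explicit calls are the simpler choice for a sketch.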

xuyxu commented 2 years ago

@all-contributors please add @LukasGardberg for code

allcontributors[bot] commented 2 years ago

@xuyxu

I've put up a pull request to add @LukasGardberg! :tada:

gardberg commented 2 years ago

Okay, hard majority voting has now been moved to ops. I also added a simple test, and added soft voting to NeuralForestClassifier and SnapshotEnsembleClassifier, but those have not been tested yet.

Are there any other models I should add hard voting for? I'm assuming it's only applicable to classifiers, right?

Also, I'm unsure if I solved the init inheritance problem properly, but the tests are at least passing now @xuyxu :)

xuyxu commented 2 years ago

Merged, thanks @LukasGardberg

gardberg commented 2 years ago

Thanks a lot @xuyxu for all the help! Exciting making my first contribution :)