scikit-learn-contrib / imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
https://imbalanced-learn.org
MIT License

EasyEnsemble naming is misleading #442

Closed gadcam closed 6 years ago

gadcam commented 6 years ago

The documentation at http://contrib.scikit-learn.org/imbalanced-learn/stable/generated/imblearn.ensemble.EasyEnsemble.html#r6767 states:

The method is described in [R6767]. [...] [R6767] | (1, 2) X. Y. Liu, J. Wu and Z. H. Zhou, “Exploratory Undersampling for Class-Imbalance Learning,” in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, no. 2, pp. 539-550, April 2009.

On page 4 of this paper we can find the description of the algorithm:

[image: algorithm description from the paper]

But, if I am not mistaken, AdaBoost is never used in the implementation (nor any classifier at all).

Could you correct me if I said something wrong?

If I am right, I think the paper should be removed from the description of the method. Moreover, a specific note should be added to say explicitly that the implementation does not follow this algorithm and that the name is only a "common sense" one.
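
To make the distinction concrete, here is a minimal sketch of what a sampler-only reading of the name amounts to: it draws several balanced subsets and never fits a classifier. This is a hypothetical standard-library illustration, not the imbalanced-learn code. The function `balanced_subsets` and its parameters are made up for this example.

```python
import random


def balanced_subsets(labels, n_subsets=3, minority_label=1, seed=0):
    """Return `n_subsets` index lists, each holding every minority index
    plus an equally sized random draw (without replacement) from the
    majority class. Hypothetical helper, for illustration only."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    subsets = []
    for _ in range(n_subsets):
        # Undersample the majority class down to the minority class size.
        sampled = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + sampled))
    return subsets


# Toy labels: 20 majority samples (0) and 4 minority samples (1).
labels = [0] * 20 + [1] * 4
subsets = balanced_subsets(labels)
```

Each subset is balanced, but nothing is learned from it. The sampling step alone leaves out the boosting part of the algorithm described in the paper.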

glemaitre commented 6 years ago

Oops, I see that my comment did not make it here. We actually want to deprecate EasyEnsemble and BalanceCascade as they are. They should be meta-estimators. Regarding EasyEnsemble, you can refer to the BalancedBaggingClassifier, which allows for a very similar classifier using bagging instead of boosting. We might want to change EasyEnsemble so that it actually follows the boosting rule shown in the algorithm.
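
As a rough illustration of what a meta-estimator following the boosting idea could look like, the sketch below fits one AdaBoost learner per balanced subset and combines them. It uses only NumPy and scikit-learn. The helpers `fit_boosted_subsets` and `predict_boosted_subsets` are hypothetical names, and averaging the predicted probabilities is a simplification assumed here, not necessarily the exact combination rule from the paper or any imbalanced-learn implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier


def fit_boosted_subsets(X, y, n_subsets=5, n_rounds=10, seed=0):
    """Fit one AdaBoost model per balanced subset (hypothetical helper).

    Assumes binary labels where class 1 is the minority class."""
    rng = np.random.RandomState(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    models = []
    for _ in range(n_subsets):
        # Undersample the majority class to the minority class size.
        drawn = rng.choice(majority, size=minority.size, replace=False)
        idx = np.concatenate([minority, drawn])
        clf = AdaBoostClassifier(
            n_estimators=n_rounds,
            random_state=rng.randint(2**31 - 1),
        )
        models.append(clf.fit(X[idx], y[idx]))
    return models


def predict_boosted_subsets(models, X):
    """Average the minority-class probabilities and threshold at 0.5.

    This averaging is an assumption made for the sketch."""
    proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (proba >= 0.5).astype(int)


# Synthetic imbalanced data: roughly 90% class 0 and 10% class 1.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
models = fit_boosted_subsets(X, y)
pred = predict_boosted_subsets(models, X)
```

Swapping `AdaBoostClassifier` for a plain decision tree in this sketch would give something closer in spirit to the bagging approach mentioned above.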