scikit-learn-contrib / imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
https://imbalanced-learn.org
MIT License

Cannot find reference for the one vs. rest scheme used to extend many algorithms for the multi-class case #1039

Open EssamWisam opened 1 year ago

EssamWisam commented 1 year ago

In the docs, the following statement frequently appears alongside the references:

Supports multi-class resampling. A one-vs.-rest scheme is used when sampling a class as proposed in [1].

So far, every time I have read the referenced paper, there was no one-vs-rest scheme described, nor any extension to the multi-class case whatsoever. Take, for instance, TomekLinks, CondensedNearestNeighbour, and EditedNearestNeighbours.

I clearly understand how one-vs-rest works for classification models, but I am not sure how it is used to extend oversampling or undersampling methods that are defined only for the binary case in imbalanced-learn. Assuming the multi-class extensions are indeed not present in the papers, it would be nice if this were explained in the docs.
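
For concreteness, here is a minimal sketch of the behaviour in question: a cleaning method whose original paper only covers the binary case, such as `TomekLinks`, accepts a multi-class target directly. The three-class dataset below is hypothetical, built with scikit-learn's `make_classification` purely for illustration:

```python
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.under_sampling import TomekLinks

# Hypothetical imbalanced 3-class dataset for illustration.
X, y = make_classification(
    n_samples=1000,
    n_classes=3,
    n_informative=4,
    weights=[0.1, 0.3, 0.6],
    random_state=0,
)

# TomekLinks is described for two classes in its paper, yet it
# resamples a multi-class target without any extra configuration.
tl = TomekLinks()
X_res, y_res = tl.fit_resample(X, y)
print(Counter(y), Counter(y_res))
```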

glemaitre commented 1 year ago

When no reference is given, it is because the original paper does not discuss this matter (which is true in most cases). We could indeed expand the documentation there.

Off the top of my head, one-vs-rest here means treating the current class as one group and all other classes as the "rest" in the nearest-neighbors search used for cleaning, whereas the paper describes only the binary minority-vs-majority case.
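
To make that reading concrete, here is a hand-rolled sketch (not the library's actual implementation) of an ENN-style cleaning rule under the one-vs-rest interpretation: for each class in turn, the neighbours' labels are binarized into "current class" vs. "rest", and a sample is dropped when the majority of its neighbours fall in the rest. The helper name `ovr_edited_nn` and the simple majority vote are assumptions for illustration:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ovr_edited_nn(X, y, n_neighbors=3):
    """Boolean mask of samples kept by a one-vs-rest ENN-style cleaning.

    Hypothetical helper, not part of imbalanced-learn.
    """
    y = np.asarray(y)
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    # Drop the first neighbour of each sample: it is the sample itself.
    neigh_idx = nn.kneighbors(X, return_distance=False)[:, 1:]
    keep = np.ones(len(y), dtype=bool)
    for cls in np.unique(y):
        members = np.flatnonzero(y == cls)
        # One-vs-rest binarization: a neighbour either belongs to the
        # current class or to the "rest" (all other classes pooled).
        same_class = y[neigh_idx[members]] == cls
        # Keep a sample only if a strict majority of its neighbours
        # belong to its own class; otherwise it is cleaned away.
        keep[members] = same_class.sum(axis=1) * 2 > n_neighbors
    return keep
```

The key point of the sketch is the pooling step: the neighbour search itself runs over all classes at once, and only the vote is binarized per class, which matches the "current class vs. all other classes" description above.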