scikit-learn-contrib / imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
https://imbalanced-learn.org
MIT License

[ENH] Add Geometric-SMOTE to imbalanced-learn #881

Open joaopfonseca opened 2 years ago

joaopfonseca commented 2 years ago

I'm opening this issue to discuss the possible inclusion of Geometric-SMOTE, proposed by Douzas and Bacao in this paper, in the imbalanced-learn library. The oversampler was already implemented by Georgios Douzas in this repository. It is compatible with the scikit-learn and imbalanced-learn libraries and contains a test suite similar to the ones that already exist for SMOTE-based oversamplers. In addition, his implementation has an MIT license.

Since this oversampler essentially generalizes SMOTE's data generation mechanism (in fact, with specific hyperparameters it mimics the behavior of SMOTE) and appears to perform consistently, I believe it would be a nice addition to this library.
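To make the "generalization of SMOTE" point concrete, here is a minimal pure-Python sketch of the two generation mechanisms as I understand them. It is an illustration under assumptions, not the code from the referenced repository: the function names and the `truncation` / `deformation` arguments are hypothetical, and the neighbor selection strategy is left out entirely. With `truncation=1.0` and `deformation=1.0` the geometric sample collapses onto the segment between the two points, which is where SMOTE places its samples.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def smote_sample(center, neighbor, rng):
    """SMOTE-style generation: a random point on the segment center -> neighbor."""
    gap = rng.random()
    return [c + gap * (n - c) for c, n in zip(center, neighbor)]


def geometric_sample(center, surface_point, truncation, deformation, rng):
    """Sketch of geometric generation (hypothetical signature).

    A point is drawn inside the hypersphere centered at `center` whose
    surface passes through `surface_point`, then truncated and deformed.
    """
    diff = [s - c for s, c in zip(surface_point, center)]
    radius = math.sqrt(dot(diff, diff))
    if radius == 0.0:
        # Degenerate case: both points coincide.
        return list(center)

    # Uniform point inside the unit ball: random direction, radius u**(1/d).
    dim = len(center)
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(normal, normal))
    scale = rng.random() ** (1.0 / dim)
    point = [scale * x / norm for x in normal]

    # Unit vector pointing from the center to the surface point.
    unit = [d / radius for d in diff]
    proj = dot(point, unit)

    # Truncation: reflect points that fall in the cut-off part of the ball.
    if (truncation > 0 and proj < truncation - 1) or (
        truncation < 0 and proj > truncation + 1
    ):
        point = [p - 2 * proj * u for p, u in zip(point, unit)]
        proj = -proj

    # Deformation: shrink the component perpendicular to `unit`.
    parallel = [proj * u for u in unit]
    point = [par + (1 - deformation) * (p - par) for p, par in zip(point, parallel)]

    # Translate and rescale back to the original hypersphere.
    return [c + radius * p for c, p in zip(center, point)]


if __name__ == "__main__":
    rng = random.Random(42)
    a, b = [0.0, 0.0], [2.0, 1.0]
    print("SMOTE-style:        ", smote_sample(a, b, rng))
    print("geometric (1, 1):   ", geometric_sample(a, b, 1.0, 1.0, rng))
    print("geometric (0, 0):   ", geometric_sample(a, b, 0.0, 0.0, rng))
```

Note that in this sketch the position along the segment is not uniformly distributed as in SMOTE, so "mimics" here only refers to where the samples can land.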

I recently discussed this idea with both authors, who also approved it.

Describe the solution you'd like

Inclusion of the Geometric-SMOTE oversampler in the imbalanced-learn library. I would be happy to do this. I will make a PR referencing this issue soon. Please let me know if there is any additional information I should consider before proceeding.

joaopfonseca commented 2 years ago

I have to make a few changes to the original implementation so that it passes imblearn's tests. I will open a PR once I'm done.