scikit-learn-contrib / imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
https://imbalanced-learn.org
MIT License

Allowing fit_resample of a Pipe to return even if last step has no fit_resample method [ENH] #1037

Closed fingoldo closed 1 year ago

fingoldo commented 1 year ago

Hi, is there any particular reason for the imblearn Pipeline to return None from fit_resample when the last step has no fit_resample method?

This is how fit_resample method currently ends:

if hasattr(last_step, "fit_resample"):
    return last_step.fit_resample(Xt, yt, **fit_params_last_step)

But I am trying to use imblearn.Pipeline in a general setup where resampling is optional.

Describe the solution you'd like

if hasattr(last_step, "fit_resample"):
    return last_step.fit_resample(Xt, yt, **fit_params_last_step)
else:
    last_step.fit(Xt, yt, **fit_params_last_step)
    return Xt, yt
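
The control flow proposed above can be tried out without touching the library. The following is only a minimal sketch: `finish_fit_resample`, `ToySampler`, and `ToyClassifier` are hypothetical names made up for illustration and are not part of imbalanced-learn. It shows that a final step with fit_resample is still resampled, while a final step without it is fitted and the data is returned unchanged.

```python
def finish_fit_resample(last_step, Xt, yt, **fit_params_last_step):
    """Hypothetical helper mirroring the proposed ending of fit_resample."""
    if hasattr(last_step, "fit_resample"):
        return last_step.fit_resample(Xt, yt, **fit_params_last_step)
    # Proposed fallback: fit the final estimator and return the data as-is.
    last_step.fit(Xt, yt, **fit_params_last_step)
    return Xt, yt


class ToySampler:
    """Hypothetical sampler that keeps every second sample."""

    def fit_resample(self, X, y):
        return X[::2], y[::2]


class ToyClassifier:
    """Hypothetical estimator without fit_resample; stores the majority label."""

    def fit(self, X, y):
        self.majority_ = max(set(y), key=y.count)
        return self


X = [[0], [1], [2], [3]]
y = [0, 0, 0, 1]

# Last step is a sampler: the data is resampled.
X_res, y_res = finish_fit_resample(ToySampler(), X, y)

# Last step is a plain estimator: it is fitted and the data passes through.
clf = ToyClassifier()
X_out, y_out = finish_fit_resample(clf, X, y)
```

With this fallback the caller always receives an (X, y) pair, which is the behaviour the request asks for.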

Describe alternatives you've considered

Subclassing imblearn.Pipeline. But I'm sure many users would benefit from having this change in the library itself.

glemaitre commented 1 year ago

In this case, you should call only fit, because you only want to fit the last step (and apply any previous resampling steps).