analyticalmindsltd / smote_variants

A collection of 85 minority oversampling techniques (SMOTE) for imbalanced learning with multi-class oversampling and model selection features
http://smote-variants.readthedocs.io
MIT License

Error: Dimension of X_train and y_train is not the same ! #76

Open ReemMOwn opened 1 year ago

ReemMOwn commented 1 year ago

I am getting this error when trying to use any sampler from smote_variants. My binary dataset has 30 input features and one output. X_train is an ndarray with shape (227845, 30) and y_train is an ndarray with shape (227845, 1).

/usr/local/lib/python3.10/dist-packages/smote_variants/oversampling/_mwmote.py in sampling_algorithm(self, X, y)
    498     return self.return_copies(X, y, "Sampling is not needed")
    499
--> 500 X_min = X[y == self.min_label]
    501
    502 nn_params= {**self.nn_params}

IndexError: boolean index did not match indexed array along dimension 1; dimension is 30 but corresponding boolean dimension is 1

Here's a sample of my code:

    X_train, X_test, y_train, y_test = split_data(df, 0.2)
    import smote_variants as sv
    sampler = sv.MWMOTE()
    X_resampled, y_resampled = sampler.sample(X_train, y_train)

gykovacs commented 11 months ago

Thank you for raising this, I'll look into it.

gykovacs commented 11 months ago

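I think the problem is that your y_train should be an array of shape (227845,). That is, instead of a 2D array with an extent of 1 in the second dimension, it should be a 1D array. With a 2D y, the comparison y == self.min_label produces a 2D boolean mask of shape (227845, 1), which cannot index the 30 columns of X; that is the mismatch the IndexError reports.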
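
If that diagnosis is right, flattening the labels before sampling should be enough. Here is a minimal sketch with NumPy only; the random arrays are small stand-ins for the real data (100 rows instead of 227845), and `min_label = 1` is an assumed minority label chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): 30 features, labels stored as a column vector.
X_train = rng.normal(size=(100, 30))
y_train = rng.integers(0, 2, size=(100, 1))

# Flatten the (n, 1) column vector into a 1D array of shape (n,).
y_flat = np.ravel(y_train)

print(y_train.shape)  # (100, 1)
print(y_flat.shape)   # (100,)

# The same kind of boolean indexing that fails in the traceback
# now selects whole rows, because the mask is 1D.
min_label = 1  # assumed minority label, for illustration only
X_min = X_train[y_flat == min_label]
print(X_min.shape[1])  # 30
```

With the real data, the flattened array would then be passed in place of y_train, i.e. sampler.sample(X_train, np.ravel(y_train)).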