Closed malmashhadani-88 closed 1 year ago
Describe the bug
When I run SMOTE() on a dataset, I get a TypeError raised by a NumPy operator.
Steps/Code to Reproduce
from imblearn.over_sampling import SMOTE
x_train_os, y_train_os = SMOTE().fit_resample(x_train, y_train)
Expected Results
No error is thrown.
Actual Results
TypeError                                 Traceback (most recent call last)
Cell In[139], line 2
      1 from imblearn.over_sampling import SMOTE
----> 2 x_train_os, y_train_os = SMOTE().fit_resample(x_train, y_train)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/imblearn/base.py:208, in BaseSampler.fit_resample(self, X, y)
    187 """Resample the dataset.
    188
    189 Parameters
   (...)
    205     The corresponding label of `X_resampled`.
    206 """
    207 self._validate_params()
--> 208 return super().fit_resample(X, y)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/imblearn/base.py:112, in SamplerMixin.fit_resample(self, X, y)
    106 X, y, binarize_y = self._check_X_y(X, y)
    108 self.sampling_strategy_ = check_sampling_strategy(
    109     self.sampling_strategy, y, self._sampling_type
    110 )
--> 112 output = self._fit_resample(X, y)
    114 y_ = (
    115     label_binarize(output[1], classes=np.unique(y)) if binarize_y else output[1]
    116 )
    118 X_, y_ = arrays_transformer.transform(output[0], y_)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/imblearn/over_sampling/_smote/base.py:365, in SMOTE._fit_resample(self, X, y)
    363 self.nn_k_.fit(X_class)
    364 nns = self.nn_k_.kneighbors(X_class, return_distance=False)[:, 1:]
--> 365 X_new, y_new = self._make_samples(
    366     X_class, y.dtype, class_sample, X_class, nns, n_samples, 1.0
    367 )
    368 X_resampled.append(X_new)
    369 y_resampled.append(y_new)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/imblearn/over_sampling/_smote/base.py:119, in BaseSMOTE._make_samples(self, X, y_dtype, y_type, nn_data, nn_num, n_samples, step_size)
    116 rows = np.floor_divide(samples_indices, nn_num.shape[1])
    117 cols = np.mod(samples_indices, nn_num.shape[1])
--> 119 X_new = self._generate_samples(X, nn_data, nn_num, rows, cols, steps)
    120 y_new = np.full(n_samples, fill_value=y_type, dtype=y_dtype)
    121 return X_new, y_new

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/imblearn/over_sampling/_smote/base.py:163, in BaseSMOTE._generate_samples(self, X, nn_data, nn_num, rows, cols, steps)
    123 def _generate_samples(self, X, nn_data, nn_num, rows, cols, steps):
    124     r"""Generate a synthetic sample.
    125
    126     The rule for the generation is:
   (...)
    161         Synthetically generated samples.
    162     """
--> 163     diffs = nn_data[nn_num[rows, cols]] - X[rows]
    165     if sparse.issparse(X):
    166         sparse_func = type(X).__name__

TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.
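The failing line in the traceback subtracts two slices of X, so the error appears to mean that x_train reached SMOTE with a boolean dtype. The sketch below reproduces the same TypeError with NumPy alone and shows that casting to float avoids it. The arrays X, rows and neighbours are made-up stand-ins for the values inside _generate_samples, not data from this report.

```python
import numpy as np

# Hypothetical stand-ins: a small all-boolean feature matrix plus the row
# indices and neighbour indices that would be paired up for interpolation.
X = np.array([[True, False], [False, True], [True, True]])
rows = np.array([0, 1])
neighbours = np.array([2, 0])

# Same shape of expression as `nn_data[...] - X[rows]` in the traceback.
try:
    X[neighbours] - X[rows]
    raised = False
except TypeError as exc:
    raised = True
    message = str(exc)

# Casting to a float dtype first makes the subtraction well defined.
X_float = X.astype(np.float64)
diffs = X_float[neighbours] - X_float[rows]
```

If this is indeed the cause, converting the boolean features to a numeric dtype before calling fit_resample should sidestep the error.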
Versions
System:
    python: 3.10.11 (main, Apr 20 2023, 19:02:41) [GCC 11.2.0]
    executable: /anaconda/envs/azureml_py310_sdkv2/bin/python
    machine: Linux-5.15.0-1038-azure-x86_64-with-glibc2.31

Python dependencies:
    sklearn: 1.3.0
    pip: 23.2.1
    setuptools: 68.1.2
    numpy: 1.24.4
    scipy: 1.10.1
    Cython: 0.29.35
    pandas: 2.1.0
    matplotlib: 3.7.3
    joblib: 1.2.0
    threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
    user_api: openmp
    internal_api: openmp
    prefix: libgomp
    filepath: /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
    version: None
    num_threads: 4

    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so
    version: 0.3.18
    threading_layer: pthreads
    architecture: SkylakeX
    num_threads: 4
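A possible workaround, assuming x_train is a pandas DataFrame that contains boolean columns (the report does not show its dtypes), is to cast those columns to float before resampling. The helper name cast_bool_columns and the demo frame are hypothetical and only illustrate the idea.

```python
import pandas as pd


def cast_bool_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with every boolean column cast to float64."""
    bool_cols = frame.select_dtypes(include="bool").columns
    return frame.astype({col: "float64" for col in bool_cols})


# Made-up demo data: one numeric column and one boolean column.
x_demo = pd.DataFrame(
    {"age": [23.0, 41.0, 35.0], "is_member": [True, False, True]}
)
x_demo_numeric = cast_bool_columns(x_demo)
```

The converted frame could then be passed to SMOTE().fit_resample in place of the original. Note that SMOTE interpolates between neighbours, so the synthetic rows may hold values between 0 and 1 in the former boolean columns; imblearn also provides SMOTENC and SMOTEN for data with categorical features, which may be a better fit.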