mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

ufunc 'isnan' not supported for the input types #333

Open JaguarPaw2409 opened 3 years ago

JaguarPaw2409 commented 3 years ago

Version: 0.9.1

1) There are no null values in the DataFrame.
2) All column data types are either boolean or float.

Traceback:

```
ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Traceback (most recent call last):
  File "miniconda3/envs/ml/lib/python3.8/site-packages/supervised/base_automl.py", line 1046, in _fit
    trained = self.train_model(params)
  File "miniconda3/envs/ml/lib/python3.8/site-packages/supervised/base_automl.py", line 354, in train_model
    mf.train(results_path, model_subpath)
  File "miniconda3/envs/ml/lib/python3.8/site-packages/supervised/model_framework.py", line 147, in train
    X_train, y_train, sample_weight = self.preprocessings[
  File "miniconda3/envs/ml/lib/python3.8/site-packages/supervised/preprocessing/preprocessing.py", line 216, in fit_and_transform
    X_train = convert.transform(X_train)
  File "miniconda3/envs/ml/lib/python3.8/site-packages/supervised/preprocessing/preprocessing_categorical.py", line 79, in transform
    X.loc[:, column] = lbl.transform(X.loc[:, column])
  File "miniconda3/envs/ml/lib/python3.8/site-packages/supervised/preprocessing/label_encoder.py", line 33, in transform
    return self.lbl.transform(x)  # list(x.values))
```

pplonski commented 3 years ago

@JaguarPaw2409 thank you for reporting the problem. Could you please provide a data sample? Or code to create such a data sample? It will be easier to reproduce the bug.

JaguarPaw2409 commented 3 years ago

In supervised/preprocessing/preprocessing_categorical.py, line 40:

PreprocessingUtils.get_type(X[column]) != PreprocessingUtils.CATEGORICAL

Boolean columns are treated as categorical variables, and LabelEncoder did not work on them. I resolved the issue by converting all boolean columns to integers before fitting.
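
For anyone hitting the same error, here is a minimal sketch of that workaround, assuming the input is a pandas DataFrame. The helper name `bools_to_ints` and the sample columns are made up for illustration and are not part of mljar-supervised. It only targets columns with NumPy `bool` dtype; columns using the nullable `boolean` dtype or object columns holding True/False would need separate handling.

```python
import pandas as pd


def bools_to_ints(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every bool column cast to int64 (True -> 1, False -> 0).

    Hypothetical helper for illustration; not part of mljar-supervised.
    """
    # Select only columns whose dtype is NumPy bool.
    bool_cols = df.select_dtypes(include="bool").columns
    # astype with a per-column mapping leaves all other columns untouched
    # and returns a new DataFrame instead of modifying df in place.
    return df.astype({col: "int64" for col in bool_cols})


# Made-up sample data: one boolean column and one float column.
X = pd.DataFrame({"flag": [True, False, True], "value": [0.5, 1.5, 2.5]})
X_fixed = bools_to_ints(X)
print(X_fixed.dtypes)
```

The converted frame (`X_fixed` here) would then be passed to the AutoML `fit` call in place of the original, so the boolean columns are seen as numeric rather than categorical.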