shankarpandala / lazypredict

Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning
MIT License
3.03k stars 344 forks

Cannot run example as shown in the docs #429

Open AndreyRub opened 1 year ago

AndreyRub commented 1 year ago

Describe the bug: When running the example exactly as shown in the documentation, it raises an IndexError.

To Reproduce: Steps to reproduce the behavior:

  1. Python = 3.7.0
  2. pip install lazypredict
  3. Call the code exactly as in documentation:
from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data
y= data.target

X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=.5,random_state =123)

clf = LazyClassifier(verbose=0,ignore_warnings=True, custom_metric=None)
models,predictions = clf.fit(X_train, X_test, y_train, y_test)

print(models)

The result is:

IndexError                                Traceback (most recent call last)
[***) in 
     10 
     11 clf = LazyClassifier(verbose=0,ignore_warnings=True, custom_metric=None)
---> 12 models,predictions = clf.fit(X_train, X_test, y_train, y_test)
     13 
     14 print(models)

(file:///C:/Users/andre/.conda/envs/***/lib/site-packages/lazypredict/Supervised.py) in fit(self, X_train, X_test, y_train, y_test)
    262 
    263         categorical_low, categorical_high = get_card_split(
--> 264             X_train, categorical_features
    265         )
    266 

(file:///C:/Users/andre/.conda/envs/****/lib/site-packages/lazypredict/Supervised.py) in get_card_split(df, cols, n)
    131     """
    132     cond = df[cols].nunique() > n
--> 133     card_high = cols[cond]
    134     card_low = cols[~cond]
    135     return card_low, card_high

(file:///C:/Users/andre/.conda/envs/****/lib/site-packages/pandas/core/indexes/base.py) in __getitem__(self, key)
   4282 
...
-> 4284         result = getitem(key)
   4285         if not is_scalar(result):
   4286             return promote(result)

IndexError: arrays used as indices must be of integer (or boolean) type
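The traceback stops at `cols[cond]` inside `get_card_split`, where `cols` is a pandas Index and `cond` comes from a `nunique() > n` comparison. One plausible reading, not confirmed anywhere in this thread, is that the breast cancer data has no categorical columns, so `cols` is empty and the mask produced by older pandas versions is not boolean. The following is a minimal sketch of a defensive variant under that assumption. The name `get_card_split_safe` and the default `n=11` are hypothetical and are not taken from the library.

```python
import numpy as np
import pandas as pd


def get_card_split_safe(df, cols, n=11):
    """Split columns into low- and high-cardinality groups.

    Hypothetical helper for illustration only; the default n=11 is an
    assumption, not a value confirmed by the traceback.
    """
    cols = pd.Index(cols)
    # With no candidate columns there is nothing to split, so return
    # two empty indexes instead of building a mask from empty data.
    if len(cols) == 0:
        return cols, cols
    # Force a plain boolean NumPy mask so that indexing the Index does
    # not depend on the dtype pandas infers for the comparison result.
    mask = np.asarray(df[cols].nunique() > n, dtype=bool)
    return cols[~mask], cols[mask]


if __name__ == "__main__":
    frame = pd.DataFrame(
        {
            "low": ["a", "b", "c"] * 10,            # 3 distinct values
            "high": [str(i) for i in range(30)],    # 30 distinct values
        }
    )
    print(get_card_split_safe(frame, ["low", "high"]))
    print(get_card_split_safe(frame, []))
```

Whether this matches the cause of the reported error depends on the installed pandas version, so it is best treated as a way to test the hypothesis rather than as a fix.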

Expected behavior: As shown in the documentation

Additional context: This might be due to an older version of scikit-learn (0.24.2).
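Since the suspicion is a version mismatch, it may help to list the installed versions of the packages involved when reporting or reproducing this. The sketch below uses only the standard library. It assumes Python 3.8 or newer, because `importlib.metadata` is not available on the Python 3.7.0 interpreter mentioned above. The helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each distribution name to its installed
    version string, or None when the distribution is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names as used by pip, not import names.
    names = ["lazypredict", "scikit-learn", "pandas", "numpy"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Pasting this output into the issue makes it easier for others to check whether the error is tied to a particular scikit-learn or pandas release.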

Libardo1 commented 10 months ago

It is not working in Colab. It works fine with the breast cancer dataset.