shankarpandala / lazypredict

Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning
MIT License
2.87k stars 333 forks

An option to supply a list of models to LazyRegressor/LazyClassifier? #376

Closed naeemmrz closed 1 year ago

naeemmrz commented 2 years ago

Is your feature request related to a problem? Please describe.
Based on the nature of the datasets I work with, it's evident to me that certain regressors will always be a bad choice (for example, Lars, Lasso, Ridge, etc.). However, Lazypredict (LazyRegressor, to be precise) will still take the time to test these regressors. When screening multiple large datasets, the cost in time and resources adds up (quite a lot, actually; for context, my datasets often have 50K+ rows).

Describe the solution you'd like
Is it possible to add an argument to LazyRegressor/LazyClassifier that takes a list of algorithms (something like models2test = ['RandomForestRegressor', 'XGBRegressor', 'KNeighborsRegressor', 'LGBMRegressor']) and only tests the models in the list while ignoring the rest?

Describe alternatives you've considered
The way I currently work around this limitation is to run LazyRegressor on only a portion of my dataset (often less than half) and use the Lazypredict results as a starting point. This works sometimes, but it fails to generalize if my datasets are not fairly diverse, and so does not use the full potential of Lazypredict.

lpantano commented 2 years ago

I had the same issue, and in the version I am using (0.2.9), passing a list of the estimator classes works.

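This will run the first 5 classifiers:

from lazypredict.Supervised import LazyClassifier
from sklearn.utils import all_estimators

# all_estimators() yields (name, class) pairs; keep only the classifier classes
keep = [est[1] for est in all_estimators(type_filter="classifier")]
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None, classifiers=keep[:5])
models, predictions = clf.fit(X_train, X_test, y_train, y_test)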

You can filter the all_estimators() list for the models you want.
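As a rough sketch of that filtering step, the helper below picks estimator classes by name from the (name, class) pairs that all_estimators() returns, then passes them to LazyRegressor. The names pick_by_name and run_selected are hypothetical and used only for illustration; the regressors= argument is assumed to accept a list of classes in the same way classifiers= does for LazyClassifier in 0.2.9.

```python
def pick_by_name(pairs, wanted):
    """Return estimator classes from (name, class) pairs, in the order of `wanted`."""
    by_name = dict(pairs)
    missing = [name for name in wanted if name not in by_name]
    if missing:
        # Fail loudly instead of silently skipping a misspelled model name.
        raise KeyError(f"Unknown estimator name(s): {missing}")
    return [by_name[name] for name in wanted]


def run_selected(X_train, X_test, y_train, y_test, wanted):
    """Fit only the regressors named in `wanted` (hypothetical wrapper)."""
    # Imported here so pick_by_name stays free of third-party dependencies.
    from lazypredict.Supervised import LazyRegressor
    from sklearn.utils import all_estimators

    # Assumption: `regressors=` takes a list of estimator classes.
    regressors = pick_by_name(all_estimators(type_filter="regressor"), wanted)
    reg = LazyRegressor(verbose=0, ignore_warnings=True, custom_metric=None,
                        regressors=regressors)
    return reg.fit(X_train, X_test, y_train, y_test)
```

For example, run_selected(X_train, X_test, y_train, y_test, ["RandomForestRegressor", "KNeighborsRegressor"]) would test just those two. XGBRegressor and LGBMRegressor are not part of scikit-learn, so they will not show up in all_estimators(); if xgboost and lightgbm are installed, their classes could presumably be appended to the list directly.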

shankarpandala commented 1 year ago

We already have that feature. I understand that the lack of documentation is the reason for the confusion.