Open qemtek opened 3 years ago
Hello,
I have the same issue with a training dataset of 125K rows. I'm training the models on Google Colaboratory with 12 GB of RAM available. The runtime crashes at 38%, reporting a huge amount of allocated memory. Did you find any workaround for this issue?
Thanks in advance.
A workaround is to filter the high-memory model architectures out of the default regressors / classifiers list and pass that custom list of models to LazyRegressor / LazyClassifier. For example:
import lazypredict
from lazypredict.Supervised import LazyRegressor
highmem_regressors = [
    "GammaRegressor", "GaussianProcessRegressor", "KernelRidge", "QuantileRegressor"
]
regressors = [reg for reg in lazypredict.Supervised.REGRESSORS if reg[0] not in highmem_regressors]
reg = LazyRegressor(regressors=regressors, verbose=1, ignore_warnings=True, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
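The filtering step above can also be wrapped in a small helper that reports excluded names matching nothing in the list, since a misspelled name in the exclusion list is otherwise silently ignored. This is only a sketch: `drop_estimators` is a hypothetical name, not part of lazypredict, and it assumes the entries are `(name, class)` pairs, as the `reg[0]` indexing above suggests. The stand-in classes exist only so the example runs without lazypredict installed.

```python
from typing import Iterable, List, Tuple


def drop_estimators(
    estimators: Iterable[Tuple[str, type]],
    excluded_names: Iterable[str],
) -> Tuple[List[Tuple[str, type]], List[str]]:
    """Hypothetical helper: return (kept, unmatched).

    kept      -> the (name, class) pairs whose name is not excluded
    unmatched -> excluded names that did not match any estimator
    """
    estimators = list(estimators)
    excluded = set(excluded_names)
    known = {name for name, _ in estimators}
    kept = [(name, cls) for name, cls in estimators if name not in excluded]
    unmatched = sorted(excluded - known)
    return kept, unmatched


# Stand-in classes (assumption: placeholders for the real estimator classes).
class KernelRidge:
    pass


class LinearRegression:
    pass


demo_estimators = [
    ("KernelRidge", KernelRidge),
    ("LinearRegression", LinearRegression),
]

kept, unmatched = drop_estimators(demo_estimators, ["KernelRidge", "NotARealModel"])
print([name for name, _ in kept])
print(unmatched)
```

With lazypredict installed, the same call would presumably look like `regressors, unmatched = drop_estimators(lazypredict.Supervised.REGRESSORS, highmem_regressors)`, with `regressors` then passed to LazyRegressor as shown above.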
This worked for me; I was using Google Colab with 8 GB of RAM.
import lazypredict
from lazypredict.Supervised import LazyClassifier
highmem_classifiers = [
    "LabelSpreading", "LabelPropagation", "BernoulliNB", "KNeighborsClassifier",
    "ElasticNetClassifier", "GradientBoostingClassifier", "HistGradientBoostingClassifier",
]
# Remove the high memory classifiers from the list
classifiers = [c for c in lazypredict.Supervised.CLASSIFIERS if c[0] not in highmem_classifiers]
clf = LazyClassifier(classifiers=classifiers, verbose=1, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)
model_dictionary = clf.provide_models(X_train, X_test, y_train, y_test)
models
Describe the bug
Using a dataset with 500k rows and 27 features, I ran into a huge memory issue on iteration 12/30. Screenshot included so you can see how much memory was being used.
Additional context
Other packages installed:
awswrangler==2.4.0
pandas==1.2.1
numpy==1.20.0
scikit-learn==0.23.1
sqlalchemy==1.3.23
psycopg2-binary==2.8.6
lazypredict==0.2.7
tqdm==4.56.0
xgboost==1.3.3
lightgbm==3.1.1
pytest==6.2.2
imblearn
shap==0.38.1
matplotlib==3.3.4
ipython