isdsucph / isds2022

Introduction to Social Data Science 2022 - a summer school course https://isdsucph.github.io/isds2022/
MIT License

SHAP value memory problem #29

Open AlexsandarB opened 2 years ago

AlexsandarB commented 2 years ago

Hi, I am having a problem creating SHAP values with the code below. The program stops running because it is unable to allocate enough memory; the amount it needs to allocate is over 35 GB. [image]

Magnus-Nielsen commented 2 years ago

Hi

You could try the explainer for linear models (`shap.LinearExplainer`), which might be more memory-efficient than the general `Explainer` because it is specialised to a narrower class of models.

/Magnus

Magnus-Nielsen commented 2 years ago

Alternatively, you might benefit from experimenting with the `min_df` parameter of `TfidfVectorizer`. It drops terms that appear in fewer than the given number (or fraction) of documents, which can drastically reduce the number of features the vectorizer returns.
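To get a feel for the effect, here is a small sketch on a made-up corpus. The documents, the `min_df` values, and the helper `dense_megabytes` are hypothetical and only there to show the trend. The memory figure assumes the matrix ends up as a dense float64 array somewhere in the pipeline, which may or may not be what happens in your case.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus; a real corpus will have many more rare terms.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
    "a bird sang on the fence",
]


def dense_megabytes(n_rows, n_cols, bytes_per_value=8):
    """Size of a dense float64 array of this shape, in megabytes."""
    return n_rows * n_cols * bytes_per_value / 1e6


feature_counts = {}
vocabularies = {}
for min_df in (1, 2):
    # An integer min_df keeps only terms found in at least that many documents.
    vectorizer = TfidfVectorizer(min_df=min_df)
    X = vectorizer.fit_transform(docs)
    feature_counts[min_df] = X.shape[1]
    vocabularies[min_df] = sorted(vectorizer.vocabulary_)
    print(
        f"min_df={min_df}: {X.shape[1]} features, "
        f"about {dense_megabytes(*X.shape):.6f} MB if stored densely"
    )
```

On real data, checking `X.shape` for a few `min_df` values before calling the explainer gives a quick estimate of how much memory a dense array of that shape would need.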

/Magnus