Open sebastian-lapuschkin opened 3 years ago
Dear all,
I had to make a few small changes to get the code to run: some while setting up (1) the Python environment, and some in (2) the notebook itself. The changes are summarized below:
(1) changes during setup:
export CONDA_ALWAYS_YES="true"
# note: this is a fresh conda install.
conda create -n shap
conda activate shap
conda install -c conda-forge shap
# install further required packages and software
# packages installed via pip could not be resolved via conda
conda install jupyter
pip install xgboost
conda install keras
pip install lifelines
conda install mpld3 # relevant for NHANES Nonlinearity
conda install statsmodels # relevant for NHANES Nonlinearity
# version mismatch between notebook and shap env
# conda remove scikit-learn
# conda install scikit-learn=0.19.1
# was in the end resolved by adapting the notebook code.
unset CONDA_ALWAYS_YES
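To see which versions actually ended up in the env after the steps above, a small check along these lines can be run before opening the notebook. This is only a sketch: the helper name installed_versions is made up for illustration, and the distribution names in PACKAGES are assumed to match the packages installed above.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed distribution names for the packages installed in step (1).
PACKAGES = ["shap", "scikit-learn", "xgboost", "keras",
            "lifelines", "mpld3", "statsmodels"]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing the reported scikit-learn version against the one the notebook was written for (0.19.1) makes the mismatch visible right away.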
(2) changes in the notebook:
#from sklearn.preprocessing import StandardScaler, Imputer #deprecated: v0.19
from sklearn.preprocessing import StandardScaler # v0.24.1
from sklearn.impute import SimpleImputer # v0.24.1 replaces Imputer from v0.19
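If the notebook should keep running on both old and new scikit-learn releases, the import could also be resolved at runtime instead of being hard-coded. The following is a minimal sketch, not what the fixed notebook does: first_available is a hypothetical helper, and the scikit-learn lookup is left as a comment so the snippet also runs where scikit-learn is not installed.

```python
import importlib


def first_available(candidates):
    """Return the first attribute that can be imported from (module, name) pairs."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module does not exist in this environment; try the next candidate.
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError(f"none of the candidates could be imported: {candidates}")


# Possible use in the notebook (requires scikit-learn):
# ImputerClass = first_available([
#     ("sklearn.impute", "SimpleImputer"),   # newer releases, e.g. v0.24.1
#     ("sklearn.preprocessing", "Imputer"),  # older releases, e.g. v0.19
# ])
```

Note that the two classes are not drop-in replacements: the old Imputer accepted an axis argument that SimpleImputer does not have, so call sites passing axis would still need adapting.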
I will upload the updated notebook(s) here once I have verified that everything works. That said, is there a rough estimate of how long the TreeExplainer takes to produce results (with respect to the number of cores / CPU clock speed)? I am currently running the code on a Xeon server CPU with 20 (logical) cores.
best,
FYI: the fixed notebook. Some weirdness in the order of cells remains, which should be resolvable by following the order of execution in the original notebook. Note that the results diverge marginally.
... and version mismatches.
I installed the listed packages into a conda env called "shap" using
conda install -c conda-forge shap
After cloning https://github.com/suinleelab/treeexplainer-study, installing and running jupyter lab / jupyter notebook, I receive the following error in the first (import) block of notebooks/mortality/NHANES I Analysis.ipynb
This seems to be caused by the notebook relying on scikit-learn release 0.19.1 (or older), while the described installation routine has changed in the meantime and now installs scikit-learn release 0.24. Is there an easy and convenient way to install the required packages to run the experiments in this repo?
Thank you for making your code publicly available.