SauceCat / PDPbox

python partial dependence plot toolbox
http://pdpbox.readthedocs.io/en/latest/
MIT License

Error on running example #46

Closed: janvanrijn closed this issue 3 years ago

janvanrijn commented 5 years ago

When trying to run an example from the docs (https://pdpbox.readthedocs.io/en/latest/pdp_plot.html), I get the following error:

/home/janvanrijn/anaconda3/envs/openml-defaults/bin/python /home/janvanrijn/projects/openml-defaults/test2.py
Traceback (most recent call last):
  File "/home/janvanrijn/projects/openml-defaults/test2.py", line 14, in <module>
    feature='Sex')
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/pdpbox/pdp.py", line 159, in pdp_isolate
    for feature_grid in feature_grids)
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/parallel.py", line 983, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/parallel.py", line 825, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/parallel.py", line 782, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 545, in __init__
    self.results = batch()
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/parallel.py", line 261, in __call__
    for func, args, kwargs in self.items]
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/joblib/parallel.py", line 261, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/pdpbox/pdp_calc_utils.py", line 44, in _calc_ice_lines
    preds = predict(_data[model_features], **predict_kwds)
  File "/home/janvanrijn/anaconda3/envs/openml-defaults/lib/python3.6/site-packages/xgboost/sklearn.py", line 797, in predict_proba
    test_dmatrix = DMatrix(data, missing=self.missing, nthread=self.n_jobs)
AttributeError: 'XGBClassifier' object has no attribute 'n_jobs'

My pip freeze:

AnyQt==0.0.8
asn1crypto==0.24.0
Babel==2.6.0
Bottleneck==1.2.1
certifi==2018.8.24
cffi==1.11.5
chardet==3.0.4
click==6.7
cloudpickle==0.5.3
commonmark==0.8.0
ConfigSpace==0.4.7
cryptography==2.3.1
cycler==0.10.0
Cython==0.28.2
dask==0.18.1
debtcollector==1.19.0
decorator==4.3.0
distributed==1.22.0
docutils==0.14
entrypoints==0.2.3
fasteners==0.14.1
feather-format==0.4.0
future==0.16.0
HeapDict==1.0.0
holoviews==1.10.7
idna==2.7
iso8601==0.1.12
jeepney==0.3.1
joblib==0.12.3
keyring==13.2.1
keyrings.alt==3.1
kiwisolver==1.0.1
liac-arff==2.2.2
matplotlib==2.2.2
mkl-fft==1.0.0
mkl-random==1.0.1
monotonic==1.5
msgpack==0.5.6
netaddr==0.7.19
netifaces==0.10.7
networkx==2.1
numpy==1.14.3
Orange3==3.15.0
oslo.concurrency==3.27.0
oslo.config==6.2.1
oslo.i18n==3.20.0
oslo.utils==3.36.2
pandas==0.24.0.dev0+997.ga197837
param==1.7.0
pbr==4.0.4
PDPbox==0.2.0
psutil==5.4.6
PuLP==1.6.8
pyarrow==0.9.0
pycparser==2.18
pyparsing==2.2.0
pyqtgraph==0.10.0
python-dateutil==2.7.2
python-louvain==0.11
pytz==2018.4
pyviz-comms==0.6.0
PyYAML==3.12
requests==2.19.1
rfc3986==1.1.0
scikit-learn==0.20.0
scikit-optimize==0.5.2
scipy==0.19.1
seaborn==0.9.0
SecretStorage==3.1.0
serverfiles==0.2.1
six==1.11.0
sortedcontainers==2.0.4
stevedore==1.28.0
tblib==1.3.2
toolz==0.9.0
tornado==5.0.2
typing==3.6.4
urllib3==1.23
wrapt==1.10.11
xgboost==0.81
xlrd==1.1.0
xmltodict==0.11.0
zict==0.1.3

Code:


from pdpbox import pdp, get_dataset

test_titanic = get_dataset.titanic()
titanic_data = test_titanic['data']
titanic_target = test_titanic['target']
titanic_features = test_titanic['features']
titanic_model = test_titanic['xgb_model']

pdp_sex = pdp.pdp_isolate(model=titanic_model,
                          dataset=titanic_data,
                          model_features=titanic_features,
                          feature='Sex')
fig, axes = pdp.pdp_plot(pdp_isolate_out=pdp_sex, feature_name='sex')

Which XGBoost version do I need?

cc @prerna135

tom-dwyer commented 5 years ago

Regarding the first issue, just add this line to your code before calling pdp.pdp_isolate: titanic_model.n_jobs = 1
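For anyone who wants to see why that one-liner helps: the traceback shows predict_proba reading self.n_jobs, which the loaded model object does not have (possibly because the bundled model was pickled with an older xgboost, though that is an assumption and not confirmed in this thread). Below is a minimal sketch of the same workaround written defensively, so it only sets the attribute when it is missing. `ensure_attr` and `LegacyModel` are hypothetical names made up for illustration; `LegacyModel` is a stand-in so the snippet runs without xgboost or PDPbox installed, it is not the real XGBClassifier.

```python
def ensure_attr(obj, name, default):
    """Set obj.<name> to `default` only when the attribute is missing."""
    if not hasattr(obj, name):
        setattr(obj, name, default)
    return obj


class LegacyModel:
    """Hypothetical stand-in for a loaded model that lacks `n_jobs`."""

    def predict_proba(self, data):
        # Like the failing line in the traceback, this reads self.n_jobs.
        return [self.n_jobs for _ in data]


model = LegacyModel()

# Without the attribute, prediction fails the same way as in the report.
try:
    model.predict_proba([0, 1])
except AttributeError as exc:
    print("before:", exc)

# Apply the workaround, then prediction goes through.
ensure_attr(model, "n_jobs", 1)
print("after:", model.predict_proba([0, 1]))
```

In the script from the report, the equivalent would be calling ensure_attr(titanic_model, "n_jobs", 1) right after loading the model and before pdp.pdp_isolate. Checking with hasattr first means a model that already carries its own n_jobs value is left untouched.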