Closed Chris-hughes10 closed 3 years ago
Thanks for opening this @Chris-hughes10.
My initial thought is that you do not have xgboost installed on the cluster if you are using the default daskdev/dask:2021.1.0 image.
You should always ensure your remote Python environment matches your local one.
The error you shared also seems to be coming directly from xgboost. Have you raised an issue there?
The problem here could well be a Dask issue, but we may need better error handling in xgboost to identify what it is.
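One way to check that the client, scheduler, and workers agree is to compare what `Client.get_versions` reports. The sketch below is only illustrative: `find_mismatches` and `report` are hypothetical helper names, and the nested `{"client": ..., "scheduler": ..., "workers": {address: ...}}` layout with a `"packages"` mapping is an assumption about the returned dictionary that may differ between distributed releases.

```python
from collections import defaultdict


def find_mismatches(versions, packages):
    """Return {package: {node: version}} for packages whose version differs between nodes."""
    # Assumed layout: versions["client"], versions["scheduler"], and
    # versions["workers"][address] each hold a "packages" mapping.
    nodes = {}
    for role in ("client", "scheduler"):
        if role in versions:
            nodes[role] = versions[role]
    nodes.update(versions.get("workers", {}))

    seen = defaultdict(dict)
    for node, info in nodes.items():
        pkgs = info.get("packages", {})
        for name in packages:
            # A package missing on a node is recorded as None.
            seen[name][node] = pkgs.get(name)

    return {
        name: by_node
        for name, by_node in seen.items()
        if len(set(by_node.values())) > 1
    }


def report(client, packages=("dask", "distributed", "toolz", "xgboost")):
    """Query a connected dask.distributed.Client and list mismatched packages."""
    versions = client.get_versions(packages=list(packages))
    return find_mismatches(versions, packages)
```

Calling `report(client)` with a client connected to the Helm-deployed cluster should return an empty dictionary when the listed packages match everywhere. Passing `check=True` to `get_versions` instead asks distributed to raise an error on mismatches in the packages it tracks.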
Hi @jacobtomlinson, I can confirm that I did install XGBoost on both the workers and the scheduler by adding the additional pip packages as chart values, and that the environments are consistent.
I was unsure where to raise the issue, to be honest, as any one of several components could be the root cause. I raised it here because it worked using LocalCluster, but I am happy to raise it elsewhere.
After further experimentation this morning, I found that the issue does not occur, and training completes successfully, when using dask 2.30.0 and distributed 2.30.1. So perhaps it is better to raise this as an issue with Dask.
Ah fair enough!
Could you share your full helm config so that I can try and reproduce it?
Are you able to reproduce the error with LocalCluster using more recent Dask versions?
Sure thing. I am using the latest version of the dask chart from helm, and installing it with:
helm install dask dask/dask --set worker.env[0].name=EXTRA_APT_PACKAGES,worker.env[0].value='gcc' --set worker.env[1].name=EXTRA_PIP_PACKAGES,worker.env[1].value='dask-ml numpy==1.19.2 fastparquet pyarrow adlfs xgboost s3fs scikit-learn --upgrade' --set scheduler.env[0].name=EXTRA_PIP_PACKAGES,scheduler.env[0].value='xgboost scikit-learn --upgrade'
I wasn't able to reproduce the error locally; training completed successfully on the local machine.
Which dask-kubernetes version are you using?
I am using dask-kubernetes==0.11.0
Hrm I don't think I have enough information. I'm not getting the same error, instead I'm getting some more general dask issues. Could you share your full conda environment?
Hi @jacobtomlinson, I continued to investigate this and I think I have found the issue. The problem appears to depend on the version of toolz installed. Here are the envs that I used, for both successful and unsuccessful runs:
name: dask-new
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- argon2-cffi=20.1.0=py38h27cfd23_1
- async_generator=1.10=pyhd3eb1b0_0
- attrs=20.3.0=pyhd3eb1b0_0
- backcall=0.2.0=pyhd3eb1b0_0
- bleach=3.3.0=pyhd3eb1b0_0
- ca-certificates=2021.1.19=h06a4308_0
- certifi=2020.12.5=py38h06a4308_0
- cffi=1.14.5=py38h261ae71_0
- dbus=1.13.18=hb2f20db_0
- decorator=4.4.2=pyhd3eb1b0_0
- defusedxml=0.6.0=pyhd3eb1b0_0
- entrypoints=0.3=py38_0
- expat=2.2.10=he6710b0_2
- fontconfig=2.13.1=h6c09931_0
- freetype=2.10.4=h5ab3b9f_0
- glib=2.67.4=h36276a3_1
- gst-plugins-base=1.14.0=h8213a91_2
- gstreamer=1.14.0=h28cd5cc_2
- icu=58.2=he6710b0_3
- importlib-metadata=2.0.0=py_1
- importlib_metadata=2.0.0=1
- ipykernel=5.3.4=py38h5ca1d4c_0
- ipython=7.21.0=py38hb070fc8_0
- ipython_genutils=0.2.0=pyhd3eb1b0_1
- ipywidgets=7.6.3=pyhd3eb1b0_1
- jedi=0.17.0=py38_0
- jinja2=2.11.3=pyhd3eb1b0_0
- jpeg=9b=h024ee3a_2
- jsonschema=3.2.0=py_2
- jupyter=1.0.0=py38_7
- jupyter_client=6.1.7=py_0
- jupyter_console=6.2.0=py_0
- jupyter_core=4.7.1=py38h06a4308_0
- jupyterlab_pygments=0.1.2=py_0
- jupyterlab_widgets=1.0.0=pyhd3eb1b0_1
- ld_impl_linux-64=2.33.1=h53a641e_7
- libedit=3.1.20191231=h14c3975_1
- libffi=3.3=he6710b0_2
- libgcc-ng=9.1.0=hdf63c60_0
- libpng=1.6.37=hbc83047_0
- libsodium=1.0.18=h7b6447c_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- libuuid=1.0.3=h1bed415_2
- libxcb=1.14=h7b6447c_0
- libxml2=2.9.10=hb55368b_3
- markupsafe=1.1.1=py38h7b6447c_0
- mistune=0.8.4=py38h7b6447c_1000
- nbclient=0.5.3=pyhd3eb1b0_0
- nbconvert=6.0.7=py38_0
- nbformat=5.1.2=pyhd3eb1b0_1
- ncurses=6.2=he6710b0_1
- nest-asyncio=1.5.1=pyhd3eb1b0_0
- notebook=6.2.0=py38h06a4308_0
- openssl=1.1.1j=h27cfd23_0
- packaging=20.9=pyhd3eb1b0_0
- pandoc=2.11=hb0f4dca_0
- pandocfilters=1.4.3=py38h06a4308_1
- parso=0.8.1=pyhd3eb1b0_0
- pcre=8.44=he6710b0_0
- pexpect=4.8.0=pyhd3eb1b0_3
- pickleshare=0.7.5=pyhd3eb1b0_1003
- pip=21.0.1=py38h06a4308_0
- prometheus_client=0.9.0=pyhd3eb1b0_0
- prompt-toolkit=3.0.8=py_0
- prompt_toolkit=3.0.8=0
- ptyprocess=0.7.0=pyhd3eb1b0_2
- pycparser=2.20=py_2
- pygments=2.8.0=pyhd3eb1b0_0
- pyparsing=2.4.7=pyhd3eb1b0_0
- pyqt=5.9.2=py38h05f1152_4
- pyrsistent=0.17.3=py38h7b6447c_0
- python=3.8.8=hdb3f193_4
- python-dateutil=2.8.1=pyhd3eb1b0_0
- pyzmq=20.0.0=py38h2531618_1
- qt=5.9.7=h5867ecd_1
- qtconsole=5.0.2=pyhd3eb1b0_0
- qtpy=1.9.0=py_0
- readline=8.1=h27cfd23_0
- send2trash=1.5.0=pyhd3eb1b0_1
- setuptools=52.0.0=py38h06a4308_0
- sip=4.19.13=py38he6710b0_0
- six=1.15.0=py38h06a4308_0
- sqlite=3.33.0=h62c20be_0
- terminado=0.9.2=py38h06a4308_0
- testpath=0.4.4=pyhd3eb1b0_0
- tk=8.6.10=hbc83047_0
- tornado=6.1=py38h27cfd23_0
- traitlets=5.0.5=pyhd3eb1b0_0
- wcwidth=0.2.5=py_0
- webencodings=0.5.1=py38_1
- wheel=0.36.2=pyhd3eb1b0_0
- widgetsnbextension=3.5.1=py38_0
- xz=5.2.5=h7b6447c_0
- zeromq=4.3.3=he6710b0_3
- zipp=3.4.0=pyhd3eb1b0_0
- zlib=1.2.11=h7b6447c_3
- pip:
- aiobotocore==1.2.1
- aiohttp==3.7.4
- aioitertools==0.7.1
- async-timeout==3.0.1
- blosc==1.9.2
- botocore==1.19.52
- cachetools==4.2.1
- chardet==3.0.4
- click==7.1.2
- cloudpickle==1.6.0
- dask==2021.2.0
- dask-glm==0.2.0
- dask-kubernetes==0.11.0
- dask-ml==1.8.0
- distributed==2021.2.0
- fsspec==0.8.7
- google-auth==1.27.0
- heapdict==1.0.1
- idna==2.10
- jmespath==0.10.0
- joblib==1.0.1
- kubernetes==12.0.1
- kubernetes-asyncio==12.0.1
- llvmlite==0.35.0
- locket==0.2.1
- lz4==3.1.1
- msgpack==1.0.2
- multidict==5.1.0
- multipledispatch==0.6.0
- numba==0.52.0
- numpy==1.20.1
- oauthlib==3.1.0
- pandas==1.2.3
- partd==1.1.0
- psutil==5.8.0
- pyarrow==3.0.0
- pyasn1==0.4.8
- pyasn1-modules==0.2.8
- pytz==2021.1
- pyyaml==5.4.1
- requests==2.25.1
- requests-oauthlib==1.3.0
- rsa==4.7.2
- s3fs==0.5.2
- scikit-learn==0.24.1
- scipy==1.6.1
- sortedcontainers==2.3.0
- tblib==1.7.0
- threadpoolctl==2.1.0
- toolz==0.11.1
- typing-extensions==3.7.4.3
- urllib3==1.26.3
- websocket-client==0.58.0
- wrapt==1.12.1
- xgboost==1.3.3
- yarl==1.6.3
- zict==2.0.0
prefix: /anaconda/envs/dask-new
name: base
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- blosc=1.20.1=he1b5a44_0
- bokeh=2.1.1=py38h32f6830_0
- brotlipy=0.7.0=py38h8df0ef7_1001
- ca-certificates=2020.12.5=ha878542_0
- certifi=2020.12.5=py38h578d9bd_1
- cffi=1.14.4=py38ha312104_0
- click=7.1.2=pyh9f0ad1d_0
- cloudpickle=1.6.0=py_0
- conda=4.9.2=py38h578d9bd_0
- conda-package-handling=1.7.2=py38h8df0ef7_0
- cryptography=3.2.1=py38h7699a38_0
- cytoolz=0.11.0=py38h25fe258_1
- freetype=2.10.4=h7ca028e_0
- fsspec=0.8.5=pyhd8ed1ab_0
- heapdict=1.0.1=py_0
- idna=2.10=pyh9f0ad1d_0
- jinja2=2.11.2=pyh9f0ad1d_0
- jpeg=9d=h36c2ea0_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libblas=3.9.0=7_openblas
- libcblas=3.9.0=7_openblas
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.5.0=h14aa051_18
- libgfortran4=7.5.0=h14aa051_18
- liblapack=3.9.0=7_openblas
- libopenblas=0.3.12=pthreads_hb3c22a3_1
- libpng=1.6.37=h21135ba_2
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.0.10=h9022e91_1002
- locket=0.2.0=py_2
- lz4=3.1.1=py38h87b837d_0
- lz4-c=1.9.2=he1b5a44_3
- markupsafe=1.1.1=py38h8df0ef7_2
- msgpack-python=1.0.0=py38h82cb98a_2
- ncurses=6.2=he6710b0_0
- nomkl=1.0=h5ca1d4c_0
- numpy=1.18.1=py38h8854b6b_1
- olefile=0.46=pyh9f0ad1d_1
- openssl=1.1.1h=h516909a_0
- packaging=20.8=pyhd3deb0d_0
- partd=1.1.0=py_0
- pillow=6.2.1=py38h34e0f95_0
- pip=20.3.3=pyhd8ed1ab_0
- psutil=5.7.3=py38h8df0ef7_0
- pycosat=0.6.3=py38h8df0ef7_1005
- pycparser=2.20=pyh9f0ad1d_2
- pyopenssl=20.0.1=pyhd8ed1ab_0
- pyparsing=2.4.7=pyh9f0ad1d_0
- pysocks=1.7.1=py38h578d9bd_3
- python=3.8.0=h0371630_2
- python-blosc=1.9.2=py38h0ef3d22_3
- python-dateutil=2.8.1=py_0
- python_abi=3.8=1_cp38
- pytz=2020.5=pyhd8ed1ab_0
- pyyaml=5.1.2=py38h516909a_0
- readline=7.0=h7b6447c_5
- requests=2.25.1=pyhd3deb0d_0
- ruamel_yaml=0.15.87=py38h7b6447c_0
- setuptools=49.6.0=py38h578d9bd_3
- six=1.15.0=pyh9f0ad1d_0
- sortedcontainers=2.3.0=pyhd8ed1ab_0
- sqlite=3.31.1=h7b6447c_0
- tblib=1.6.0=py_0
- tini=0.18.0=h14c3975_1001
- tk=8.6.8=hbc83047_0
- toolz=0.11.1=py_0
- tornado=6.1=py38h25fe258_0
- tqdm=4.42.1=py_0
- typing_extensions=3.7.4.3=py_0
- urllib3=1.26.2=pyhd8ed1ab_0
- wheel=0.36.2=pyhd3deb0d_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zict=2.0.0=py_0
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.3=1
- pip:
- aiobotocore==1.2.1
- aiohttp==3.7.4
- aioitertools==0.7.1
- async-timeout==3.0.1
- attrs==20.3.0
- botocore==1.19.52
- chardet==3.0.4
- dask==2021.2.0
- dask-glm==0.2.0
- dask-ml==1.8.0
- distributed==2021.2.0
- fastparquet==0.5.0
- jmespath==0.10.0
- joblib==1.0.1
- llvmlite==0.35.0
- multidict==5.1.0
- multipledispatch==0.6.0
- numba==0.52.0
- pandas==1.2.3
- pyarrow==3.0.0
- s3fs==0.5.2
- scikit-learn==0.24.1
- scipy==1.6.1
- threadpoolctl==2.1.0
- thrift==0.13.0
- wrapt==1.12.1
- xgboost==1.3.3
- yarl==1.6.3
prefix: /opt/conda
name: base
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- blosc=1.20.1=he1b5a44_0
- bokeh=2.1.1=py38h32f6830_0
- brotlipy=0.7.0=py38h8df0ef7_1001
- ca-certificates=2020.12.5=ha878542_0
- certifi=2020.12.5=py38h578d9bd_1
- cffi=1.14.4=py38ha312104_0
- chardet=4.0.0=py38h578d9bd_1
- click=7.1.2=pyh9f0ad1d_0
- cloudpickle=1.6.0=py_0
- conda=4.9.2=py38h578d9bd_0
- conda-package-handling=1.7.2=py38h8df0ef7_0
- cryptography=3.2.1=py38h7699a38_0
- cytoolz=0.11.0=py38h25fe258_1
- freetype=2.10.4=h7ca028e_0
- fsspec=0.8.5=pyhd8ed1ab_0
- heapdict=1.0.1=py_0
- idna=2.10=pyh9f0ad1d_0
- jinja2=2.11.2=pyh9f0ad1d_0
- jpeg=9d=h36c2ea0_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libblas=3.9.0=7_openblas
- libcblas=3.9.0=7_openblas
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.5.0=h14aa051_18
- libgfortran4=7.5.0=h14aa051_18
- liblapack=3.9.0=7_openblas
- libopenblas=0.3.12=pthreads_hb3c22a3_1
- libpng=1.6.37=h21135ba_2
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.0.10=h9022e91_1002
- locket=0.2.0=py_2
- lz4=3.1.1=py38h87b837d_0
- lz4-c=1.9.2=he1b5a44_3
- markupsafe=1.1.1=py38h8df0ef7_2
- msgpack-python=1.0.0=py38h82cb98a_2
- ncurses=6.2=he6710b0_0
- nomkl=1.0=h5ca1d4c_0
- numpy=1.18.1=py38h8854b6b_1
- olefile=0.46=pyh9f0ad1d_1
- openssl=1.1.1h=h516909a_0
- packaging=20.8=pyhd3deb0d_0
- pandas=1.0.1=py38hb3f55d8_0
- partd=1.1.0=py_0
- pillow=6.2.1=py38h34e0f95_0
- pip=20.3.3=pyhd8ed1ab_0
- psutil=5.7.3=py38h8df0ef7_0
- pycosat=0.6.3=py38h8df0ef7_1005
- pycparser=2.20=pyh9f0ad1d_2
- pyopenssl=20.0.1=pyhd8ed1ab_0
- pyparsing=2.4.7=pyh9f0ad1d_0
- pysocks=1.7.1=py38h578d9bd_3
- python=3.8.0=h0371630_2
- python-blosc=1.9.2=py38h0ef3d22_3
- python-dateutil=2.8.1=py_0
- python_abi=3.8=1_cp38
- pytz=2020.5=pyhd8ed1ab_0
- pyyaml=5.1.2=py38h516909a_0
- readline=7.0=h7b6447c_5
- requests=2.25.1=pyhd3deb0d_0
- ruamel_yaml=0.15.87=py38h7b6447c_0
- setuptools=49.6.0=py38h578d9bd_3
- six=1.15.0=pyh9f0ad1d_0
- sortedcontainers=2.3.0=pyhd8ed1ab_0
- sqlite=3.31.1=h7b6447c_0
- tblib=1.6.0=py_0
- tini=0.18.0=h14c3975_1001
- tk=8.6.8=hbc83047_0
- toolz=0.11.1=py_0
- tornado=6.1=py38h25fe258_0
- tqdm=4.42.1=py_0
- typing_extensions=3.7.4.3=py_0
- urllib3=1.26.2=pyhd8ed1ab_0
- wheel=0.36.2=pyhd3deb0d_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zict=2.0.0=py_0
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.3=1
- pip:
- dask==2021.2.0
- dask-glm==0.2.0
- dask-ml==1.8.0
- distributed==2021.2.0
- joblib==1.0.1
- llvmlite==0.35.0
- multipledispatch==0.6.0
- numba==0.52.0
- scikit-learn==0.24.1
- scipy==1.6.1
- threadpoolctl==2.1.0
- xgboost==1.3.3
name: dask-new
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- argon2-cffi=20.1.0=py38h27cfd23_1
- async_generator=1.10=pyhd3eb1b0_0
- attrs=20.3.0=pyhd3eb1b0_0
- backcall=0.2.0=pyhd3eb1b0_0
- bleach=3.3.0=pyhd3eb1b0_0
- ca-certificates=2021.1.19=h06a4308_0
- certifi=2020.12.5=py38h06a4308_0
- cffi=1.14.5=py38h261ae71_0
- dbus=1.13.18=hb2f20db_0
- decorator=4.4.2=pyhd3eb1b0_0
- defusedxml=0.6.0=pyhd3eb1b0_0
- entrypoints=0.3=py38_0
- expat=2.2.10=he6710b0_2
- fontconfig=2.13.1=h6c09931_0
- freetype=2.10.4=h5ab3b9f_0
- glib=2.67.4=h36276a3_1
- gst-plugins-base=1.14.0=h8213a91_2
- gstreamer=1.14.0=h28cd5cc_2
- icu=58.2=he6710b0_3
- importlib-metadata=2.0.0=py_1
- importlib_metadata=2.0.0=1
- ipykernel=5.3.4=py38h5ca1d4c_0
- ipython=7.21.0=py38hb070fc8_0
- ipython_genutils=0.2.0=pyhd3eb1b0_1
- ipywidgets=7.6.3=pyhd3eb1b0_1
- jedi=0.17.0=py38_0
- jinja2=2.11.3=pyhd3eb1b0_0
- jpeg=9b=h024ee3a_2
- jsonschema=3.2.0=py_2
- jupyter=1.0.0=py38_7
- jupyter_client=6.1.7=py_0
- jupyter_console=6.2.0=py_0
- jupyter_core=4.7.1=py38h06a4308_0
- jupyterlab_pygments=0.1.2=py_0
- jupyterlab_widgets=1.0.0=pyhd3eb1b0_1
- ld_impl_linux-64=2.33.1=h53a641e_7
- libedit=3.1.20191231=h14c3975_1
- libffi=3.3=he6710b0_2
- libgcc-ng=9.1.0=hdf63c60_0
- libpng=1.6.37=hbc83047_0
- libsodium=1.0.18=h7b6447c_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- libuuid=1.0.3=h1bed415_2
- libxcb=1.14=h7b6447c_0
- libxml2=2.9.10=hb55368b_3
- markupsafe=1.1.1=py38h7b6447c_0
- mistune=0.8.4=py38h7b6447c_1000
- nbclient=0.5.3=pyhd3eb1b0_0
- nbconvert=6.0.7=py38_0
- nbformat=5.1.2=pyhd3eb1b0_1
- ncurses=6.2=he6710b0_1
- nest-asyncio=1.5.1=pyhd3eb1b0_0
- notebook=6.2.0=py38h06a4308_0
- openssl=1.1.1j=h27cfd23_0
- packaging=20.9=pyhd3eb1b0_0
- pandoc=2.11=hb0f4dca_0
- pandocfilters=1.4.3=py38h06a4308_1
- parso=0.8.1=pyhd3eb1b0_0
- pcre=8.44=he6710b0_0
- pexpect=4.8.0=pyhd3eb1b0_3
- pickleshare=0.7.5=pyhd3eb1b0_1003
- pip=21.0.1=py38h06a4308_0
- prometheus_client=0.9.0=pyhd3eb1b0_0
- prompt-toolkit=3.0.8=py_0
- prompt_toolkit=3.0.8=0
- ptyprocess=0.7.0=pyhd3eb1b0_2
- pycparser=2.20=py_2
- pygments=2.8.0=pyhd3eb1b0_0
- pyparsing=2.4.7=pyhd3eb1b0_0
- pyqt=5.9.2=py38h05f1152_4
- pyrsistent=0.17.3=py38h7b6447c_0
- python=3.8.8=hdb3f193_4
- python-dateutil=2.8.1=pyhd3eb1b0_0
- pyzmq=20.0.0=py38h2531618_1
- qt=5.9.7=h5867ecd_1
- qtconsole=5.0.2=pyhd3eb1b0_0
- qtpy=1.9.0=py_0
- readline=8.1=h27cfd23_0
- send2trash=1.5.0=pyhd3eb1b0_1
- setuptools=52.0.0=py38h06a4308_0
- sip=4.19.13=py38he6710b0_0
- six=1.15.0=py38h06a4308_0
- sqlite=3.33.0=h62c20be_0
- terminado=0.9.2=py38h06a4308_0
- testpath=0.4.4=pyhd3eb1b0_0
- tk=8.6.10=hbc83047_0
- tornado=6.1=py38h27cfd23_0
- traitlets=5.0.5=pyhd3eb1b0_0
- wcwidth=0.2.5=py_0
- webencodings=0.5.1=py38_1
- wheel=0.36.2=pyhd3eb1b0_0
- widgetsnbextension=3.5.1=py38_0
- xz=5.2.5=h7b6447c_0
- zeromq=4.3.3=he6710b0_3
- zipp=3.4.0=pyhd3eb1b0_0
- zlib=1.2.11=h7b6447c_3
- pip:
- aiobotocore==1.2.1
- aiohttp==3.7.4
- aioitertools==0.7.1
- async-timeout==3.0.1
- blosc==1.9.2
- botocore==1.19.52
- cachetools==4.2.1
- chardet==3.0.4
- click==7.1.2
- cloudpickle==1.6.0
- dask==2021.2.0
- dask-glm==0.2.0
- dask-kubernetes==0.11.0
- dask-ml==1.8.0
- distributed==2021.2.0
- fsspec==0.8.7
- google-auth==1.27.0
- heapdict==1.0.1
- idna==2.10
- jmespath==0.10.0
- joblib==1.0.1
- kubernetes==12.0.1
- kubernetes-asyncio==12.0.1
- llvmlite==0.35.0
- locket==0.2.1
- lz4==3.1.1
- msgpack==1.0.2
- multidict==5.1.0
- multipledispatch==0.6.0
- numba==0.52.0
- numpy==1.20.1
- oauthlib==3.1.0
- pandas==1.2.3
- partd==1.1.0
- psutil==5.8.0
- pyarrow==3.0.0
- pyasn1==0.4.8
- pyasn1-modules==0.2.8
- pytz==2021.1
- pyyaml==5.4.1
- requests==2.25.1
- requests-oauthlib==1.3.0
- rsa==4.7.2
- s3fs==0.5.2
- scikit-learn==0.24.1
- scipy==1.6.1
- sortedcontainers==2.3.0
- tblib==1.7.0
- threadpoolctl==2.1.0
- toolz==0.10.0
- typing-extensions==3.7.4.3
- urllib3==1.26.3
- websocket-client==0.58.0
- wrapt==1.12.1
- xgboost==1.3.3
- yarl==1.6.3
- zict==2.0.0
prefix: /anaconda/envs/dask-new
name: base
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- blosc=1.20.1=he1b5a44_0
- bokeh=2.1.1=py38h32f6830_0
- brotlipy=0.7.0=py38h8df0ef7_1001
- ca-certificates=2020.12.5=ha878542_0
- certifi=2020.12.5=py38h578d9bd_1
- cffi=1.14.4=py38ha312104_0
- click=7.1.2=pyh9f0ad1d_0
- cloudpickle=1.6.0=py_0
- conda=4.9.2=py38h578d9bd_0
- conda-package-handling=1.7.2=py38h8df0ef7_0
- cryptography=3.2.1=py38h7699a38_0
- cytoolz=0.11.0=py38h25fe258_1
- freetype=2.10.4=h7ca028e_0
- fsspec=0.8.5=pyhd8ed1ab_0
- heapdict=1.0.1=py_0
- idna=2.10=pyh9f0ad1d_0
- jinja2=2.11.2=pyh9f0ad1d_0
- jpeg=9d=h36c2ea0_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libblas=3.9.0=7_openblas
- libcblas=3.9.0=7_openblas
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.5.0=h14aa051_18
- libgfortran4=7.5.0=h14aa051_18
- liblapack=3.9.0=7_openblas
- libopenblas=0.3.12=pthreads_hb3c22a3_1
- libpng=1.6.37=h21135ba_2
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.0.10=h9022e91_1002
- locket=0.2.0=py_2
- lz4=3.1.1=py38h87b837d_0
- lz4-c=1.9.2=he1b5a44_3
- markupsafe=1.1.1=py38h8df0ef7_2
- msgpack-python=1.0.0=py38h82cb98a_2
- ncurses=6.2=he6710b0_0
- nomkl=1.0=h5ca1d4c_0
- numpy=1.18.1=py38h8854b6b_1
- olefile=0.46=pyh9f0ad1d_1
- openssl=1.1.1h=h516909a_0
- packaging=20.8=pyhd3deb0d_0
- partd=1.1.0=py_0
- pillow=6.2.1=py38h34e0f95_0
- pip=20.3.3=pyhd8ed1ab_0
- psutil=5.7.3=py38h8df0ef7_0
- pycosat=0.6.3=py38h8df0ef7_1005
- pycparser=2.20=pyh9f0ad1d_2
- pyopenssl=20.0.1=pyhd8ed1ab_0
- pyparsing=2.4.7=pyh9f0ad1d_0
- pysocks=1.7.1=py38h578d9bd_3
- python=3.8.0=h0371630_2
- python-blosc=1.9.2=py38h0ef3d22_3
- python-dateutil=2.8.1=py_0
- python_abi=3.8=1_cp38
- pytz=2020.5=pyhd8ed1ab_0
- pyyaml=5.1.2=py38h516909a_0
- readline=7.0=h7b6447c_5
- requests=2.25.1=pyhd3deb0d_0
- ruamel_yaml=0.15.87=py38h7b6447c_0
- setuptools=49.6.0=py38h578d9bd_3
- six=1.15.0=pyh9f0ad1d_0
- sortedcontainers=2.3.0=pyhd8ed1ab_0
- sqlite=3.31.1=h7b6447c_0
- tblib=1.6.0=py_0
- tini=0.18.0=h14c3975_1001
- tk=8.6.8=hbc83047_0
- tornado=6.1=py38h25fe258_0
- tqdm=4.42.1=py_0
- typing_extensions=3.7.4.3=py_0
- urllib3=1.26.2=pyhd8ed1ab_0
- wheel=0.36.2=pyhd3deb0d_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zict=2.0.0=py_0
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.3=1
- pip:
- aiobotocore==1.2.1
- aiohttp==3.7.4
- aioitertools==0.7.1
- async-timeout==3.0.1
- attrs==20.3.0
- botocore==1.19.52
- chardet==3.0.4
- dask==2021.2.0
- dask-glm==0.2.0
- dask-ml==1.8.0
- distributed==2021.2.0
- fastparquet==0.5.0
- jmespath==0.10.0
- joblib==1.0.1
- llvmlite==0.35.0
- multidict==5.1.0
- multipledispatch==0.6.0
- numba==0.52.0
- pandas==1.2.3
- pyarrow==3.0.0
- s3fs==0.5.2
- scikit-learn==0.24.1
- scipy==1.6.1
- threadpoolctl==2.1.0
- thrift==0.13.0
- toolz==0.10.0
- wrapt==1.12.1
- xgboost==1.3.3
- yarl==1.6.3
name: base
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- blosc=1.20.1=he1b5a44_0
- bokeh=2.1.1=py38h32f6830_0
- brotlipy=0.7.0=py38h8df0ef7_1001
- ca-certificates=2020.12.5=ha878542_0
- certifi=2020.12.5=py38h578d9bd_1
- cffi=1.14.4=py38ha312104_0
- chardet=4.0.0=py38h578d9bd_1
- click=7.1.2=pyh9f0ad1d_0
- cloudpickle=1.6.0=py_0
- conda=4.9.2=py38h578d9bd_0
- conda-package-handling=1.7.2=py38h8df0ef7_0
- cryptography=3.2.1=py38h7699a38_0
- cytoolz=0.11.0=py38h25fe258_1
- freetype=2.10.4=h7ca028e_0
- fsspec=0.8.5=pyhd8ed1ab_0
- heapdict=1.0.1=py_0
- idna=2.10=pyh9f0ad1d_0
- jinja2=2.11.2=pyh9f0ad1d_0
- jpeg=9d=h36c2ea0_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libblas=3.9.0=7_openblas
- libcblas=3.9.0=7_openblas
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.5.0=h14aa051_18
- libgfortran4=7.5.0=h14aa051_18
- liblapack=3.9.0=7_openblas
- libopenblas=0.3.12=pthreads_hb3c22a3_1
- libpng=1.6.37=h21135ba_2
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.0.10=h9022e91_1002
- locket=0.2.0=py_2
- lz4=3.1.1=py38h87b837d_0
- lz4-c=1.9.2=he1b5a44_3
- markupsafe=1.1.1=py38h8df0ef7_2
- msgpack-python=1.0.0=py38h82cb98a_2
- ncurses=6.2=he6710b0_0
- nomkl=1.0=h5ca1d4c_0
- numpy=1.18.1=py38h8854b6b_1
- olefile=0.46=pyh9f0ad1d_1
- openssl=1.1.1h=h516909a_0
- packaging=20.8=pyhd3deb0d_0
- pandas=1.0.1=py38hb3f55d8_0
- partd=1.1.0=py_0
- pillow=6.2.1=py38h34e0f95_0
- pip=20.3.3=pyhd8ed1ab_0
- psutil=5.7.3=py38h8df0ef7_0
- pycosat=0.6.3=py38h8df0ef7_1005
- pycparser=2.20=pyh9f0ad1d_2
- pyopenssl=20.0.1=pyhd8ed1ab_0
- pyparsing=2.4.7=pyh9f0ad1d_0
- pysocks=1.7.1=py38h578d9bd_3
- python=3.8.0=h0371630_2
- python-blosc=1.9.2=py38h0ef3d22_3
- python-dateutil=2.8.1=py_0
- python_abi=3.8=1_cp38
- pytz=2020.5=pyhd8ed1ab_0
- pyyaml=5.1.2=py38h516909a_0
- readline=7.0=h7b6447c_5
- requests=2.25.1=pyhd3deb0d_0
- ruamel_yaml=0.15.87=py38h7b6447c_0
- setuptools=49.6.0=py38h578d9bd_3
- six=1.15.0=pyh9f0ad1d_0
- sortedcontainers=2.3.0=pyhd8ed1ab_0
- sqlite=3.31.1=h7b6447c_0
- tblib=1.6.0=py_0
- tini=0.18.0=h14c3975_1001
- tk=8.6.8=hbc83047_0
- tornado=6.1=py38h25fe258_0
- tqdm=4.42.1=py_0
- typing_extensions=3.7.4.3=py_0
- urllib3=1.26.2=pyhd8ed1ab_0
- wheel=0.36.2=pyhd3deb0d_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zict=2.0.0=py_0
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.3=1
- pip:
- dask==2021.2.0
- dask-glm==0.2.0
- dask-ml==1.8.0
- distributed==2021.2.0
- joblib==1.0.1
- llvmlite==0.35.0
- multipledispatch==0.6.0
- numba==0.52.0
- scikit-learn==0.24.1
- scipy==1.6.1
- threadpoolctl==2.1.0
- toolz==0.10.0
- xgboost==1.3.3
prefix: /opt/conda
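Because exports like the ones above are long, a small script can help surface the pins that differ between any two of them. This is a rough sketch rather than a full parser: `parse_pins` and `diff_pins` are hypothetical helper names, and it assumes each dependency sits on its own line as either a conda pin (`- name=version=build`) or a pip pin (`- name==version`).

```python
import re

# Matches pip pins ("- toolz==0.10.0") and conda pins ("- toolz=0.11.1=py_0").
# Lines without "=" (channel names, "- pip:") are ignored.
_PIN = re.compile(r"^\s*-\s*([A-Za-z0-9_.\-]+)={1,2}([^=\s]+)")


def parse_pins(export_text):
    """Return {package_name: version} from the text of a conda env export."""
    pins = {}
    for line in export_text.splitlines():
        match = _PIN.match(line)
        if match:
            # If a package is listed twice, the later entry wins
            # (pip entries come after conda entries in an export).
            pins[match.group(1).lower()] = match.group(2)
    return pins


def diff_pins(a, b):
    """Return {name: (version_in_a, version_in_b)} where they differ; None means absent."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

For example, `diff_pins(parse_pins(export_a), parse_pins(export_b))` on two exports of the same machine, where `export_a` and `export_b` are placeholder strings holding the export text, lists only the packages whose versions changed between them.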
Additionally, it is only successful when these lines are called:
X_train = X_train.persist()
y_train = y_train.persist()
otherwise the error still occurs.
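For reference, the order of operations described here might look like the sketch below. It is only an illustration under assumptions: `persist_all` and `train_after_persist` are hypothetical helper names, `params` and `num_boost_round` are placeholders, and the explicit `wait` call is an addition that was not part of the runs reported above.

```python
def persist_all(*collections):
    """Call .persist() on each Dask collection and return the persisted handles."""
    return tuple(c.persist() for c in collections)


def train_after_persist(client, X_train, y_train, params, num_boost_round=100):
    """Persist the training data on the cluster, then build the DMatrix and train."""
    # Imported lazily so persist_all can be used without these packages installed.
    from dask.distributed import wait
    from xgboost import dask as dxgb

    X_train, y_train = persist_all(X_train, y_train)
    # Assumption: block until the partitions are held in worker memory.
    wait([X_train, y_train])

    dtrain = dxgb.DaskDMatrix(client, X_train, y_train)
    # xgboost's Dask train function returns a dict with the booster and eval history.
    return dxgb.train(client, params, dtrain, num_boost_round=num_boost_round)
```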
Thanks for the info.
It definitely sounds like this is an environment issue and some versions are not playing nicely together. Additionally I think the fact you have to persist the data could be an issue in XGBoost.
I'm not sure there is anything we can change in this project (dask-kubernetes) to resolve this for you.
What happened: I was following the tutorial at https://coiled.io/blog/xgboost-frictionless-training/ but keep hitting an error when creating an xgb.dask.DaskDMatrix on a HelmCluster; there are no issues when running the same code using LocalCluster. The cluster is generally working: I can manually scale it and track tasks using the dashboard. The error I am experiencing is:
I'm not sure if this is due to XGBoost or dask-kubernetes, I decided to post here as it works fine locally.
What you expected to happen:
I would expect the same behaviour as when running on the local cluster.
Minimal Complete Verifiable Example:
Anything else we need to know?:
Environment: