rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[BUG] CUDA error using GLOBAL_QUANTILE for split_algo (experimental RF backend) #3948

Open Oleg-dM opened 3 years ago

Oleg-dM commented 3 years ago

Describe the bug

Following the example in the RAPIDS documentation, fitting a RandomForestClassifier on a synthetic dataset raises a CUDA error as soon as n_rows is set above 4864.

It works fine with split_algo = 0 (HIST), but that is 3 times slower.

Simplest working example

import numpy as np
from cuml.ensemble import RandomForestClassifier as cuRFC

n_rows = 4864 # FAILS ABOVE 4864 -> this looks very much like a bug

X = np.random.normal(size=(n_rows,100)).astype(np.float32)
y = np.asarray([0,1]*(n_rows//2), dtype=np.int32)

cuml_model = cuRFC(max_features=35,
                   n_bins=8,
                   n_estimators=400)

%time cuml_model.fit(X,y)

cuml_predict = cuml_model.predict(X)

RuntimeError                              Traceback (most recent call last)
<timed exec> in <module>

~/anaconda3/envs/rapids-0.19/lib/python3.8/site-packages/cuml/internals/api_decorators.py in inner_with_setters(*args, **kwargs)
    407                                 target_val=target_val)
    408 
--> 409                 return func(*args, **kwargs)
    410 
    411         @ wraps(func)

cuml/ensemble/randomforestclassifier.pyx in cuml.ensemble.randomforestclassifier.RandomForestClassifier.fit()

RuntimeError: CUDA error encountered at: file=../src/decisiontree/quantile/quantile.cuh line=236: call='cub::DeviceRadixSort::SortKeys( (void *)d_temp_storage->data(), temp_storage_bytes, &data[col_offset], single_column_sorted->data(), n_rows, 0, 8 * sizeof(T), stream)', Reason=cudaErrorInvalidValue:invalid argument
Obtained 64 stack frames
#0 in /home/oleg/anaconda3/envs/rapids-0.19/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN4raft9exception18collect_call_stackEv+0x46) [0x7f45884d5076]
#1 in /home/oleg/anaconda3/envs/rapids-0.19/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN4raft10cuda_errorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x69) [0x7f45884d57d9]
#2 in /home/oleg/anaconda3/envs/rapids-0.19/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML12DecisionTree16computeQuantilesIfEEvPT_iPKS2_iiSt10shared_ptrIN4raft2mr6device9allocatorEEP11CUstream_st+0x778) [0x7f458889f0d8]

Steps/Code to reproduce bug

Run the documentation example referenced above with n_samples > 4864.
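If the exact threshold matters for a report like this one, it can be located by bisection instead of trial and error. The sketch below is only an illustration: `first_failing` is a hypothetical helper, and the `fails` predicate is an assumption that stands in for "fitting with this many rows raises the CUDA error". It also assumes failures are monotonic in the row count (every size above the threshold fails), which this thread suggests but does not prove.

```python
def first_failing(fails, lo, hi):
    """Return the smallest n in (lo, hi] for which fails(n) is True.

    Assumes fails(lo) is False, fails(hi) is True, and that failures are
    monotonic in n (once a size fails, every larger size fails too).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # threshold is at mid or below
        else:
            lo = mid  # threshold is above mid
    return hi


# Stand-in predicate mirroring the behaviour reported in this issue.
# A real predicate would run the fit, ideally in a fresh process per probe.
def reported_behaviour(n_rows):
    return n_rows > 4864
```

With the stand-in predicate, `first_failing(reported_behaviour, 1000, 10000)` narrows the range down to the first failing size. Note that the repro script builds `y` from `n_rows // 2` label pairs, so a real predicate should only be probed with even `n_rows` (or bisect over `n_rows // 2` instead).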

Expected behavior

The model fits using the new backend (split_algo = 1).

Environment details (please complete the following information):

If method of install is [conda], run `conda list` and include results here: ``` # Name Version Build Channel _ipyw_jlab_nb_ext_conf 0.1.0 py38_0 _libgcc_mutex 0.1 main alabaster 0.7.12 pyhd3eb1b0_0 anaconda 2021.05 py38_0 anaconda-client 1.7.2 py38_0 anaconda-navigator 2.0.3 py38_0 anaconda-project 0.9.1 pyhd3eb1b0_1 anyio 2.2.0 py38h06a4308_1 appdirs 1.4.4 py_0 argh 0.26.2 py38_0 argon2-cffi 20.1.0 py38h27cfd23_1 asn1crypto 1.4.0 py_0 astroid 2.5 py38h06a4308_1 astropy 4.2.1 py38h27cfd23_1 async_generator 1.10 pyhd3eb1b0_0 atomicwrites 1.4.0 py_0 attrs 20.3.0 pyhd3eb1b0_0 autopep8 1.5.6 pyhd3eb1b0_0 babel 2.9.0 pyhd3eb1b0_0 backcall 0.2.0 pyhd3eb1b0_0 backports 1.0 pyhd3eb1b0_2 backports.functools_lru_cache 1.6.4 pyhd3eb1b0_0 backports.shutil_get_terminal_size 1.0.0 pyhd3eb1b0_3 backports.tempfile 1.0 pyhd3eb1b0_1 backports.weakref 1.0.post1 py_1 beautifulsoup4 4.9.3 pyha847dfd_0 bitarray 2.1.0 py38h27cfd23_1 bkcharts 0.2 py38_0 black 19.10b0 py_0 blas 1.0 mkl bleach 3.3.0 pyhd3eb1b0_0 blosc 1.21.0 h8c45485_0 bokeh 2.3.2 py38h06a4308_0 boto 2.49.0 py38_0 bottleneck 1.3.2 py38heb32a55_1 brotlipy 0.7.0 py38h27cfd23_1003 bzip2 1.0.8 h7b6447c_0 c-ares 1.17.1 h27cfd23_0 ca-certificates 2021.4.13 h06a4308_1 cairo 1.16.0 hf32fb01_1 certifi 2020.12.5 py38h06a4308_0 cffi 1.14.5 py38h261ae71_0 chardet 4.0.0 py38h06a4308_1003 click 7.1.2 pyhd3eb1b0_0 cloudpickle 1.6.0 py_0 clyent 1.2.2 py38_1 colorama 0.4.4 pyhd3eb1b0_0 conda 4.10.1 py38h06a4308_1 conda-build 3.21.4 py38h06a4308_0 conda-content-trust 0.1.1 pyhd3eb1b0_0 conda-env 2.6.0 1 conda-package-handling 1.7.3 py38h27cfd23_1 conda-repo-cli 1.0.4 pyhd3eb1b0_0 conda-token 0.3.0 pyhd3eb1b0_0 conda-verify 3.4.2 py_1 contextlib2 0.6.0.post1 py_0 cryptography 3.4.7 py38hd23ed53_0 curl 7.71.1 hbc83047_1 cycler 0.10.0 py38_0 cython 0.29.23 py38h2531618_0 cytoolz 0.11.0 py38h7b6447c_0 dask 2021.4.0 pyhd3eb1b0_0 dask-core 2021.4.0 pyhd3eb1b0_0 dbus 1.13.18 hb2f20db_0 decorator 5.0.6 pyhd3eb1b0_0 defusedxml 0.7.1 pyhd3eb1b0_0 
diff-match-patch 20200713 py_0 distributed 2021.4.1 py38h06a4308_0 docutils 0.17.1 py38h06a4308_1 entrypoints 0.3 py38_0 et_xmlfile 1.0.1 py_1001 expat 2.3.0 h2531618_2 fastcache 1.1.0 py38h7b6447c_0 filelock 3.0.12 pyhd3eb1b0_1 flake8 3.9.0 pyhd3eb1b0_0 flask 1.1.2 pyhd3eb1b0_0 fontconfig 2.13.1 h6c09931_0 freetype 2.10.4 h5ab3b9f_0 fribidi 1.0.10 h7b6447c_0 fsspec 0.9.0 pyhd3eb1b0_0 future 0.18.2 py38_1 get_terminal_size 1.0.0 haa9412d_0 gevent 21.1.2 py38h27cfd23_1 glib 2.68.1 h36276a3_0 glob2 0.7 pyhd3eb1b0_0 gmp 6.2.1 h2531618_2 gmpy2 2.0.8 py38hd5f6e3b_3 graphite2 1.3.14 h23475e2_0 greenlet 1.0.0 py38h2531618_2 gst-plugins-base 1.14.0 h8213a91_2 gstreamer 1.14.0 h28cd5cc_2 h5py 2.10.0 py38h7918eee_0 harfbuzz 2.8.0 h6f93f22_0 hdf5 1.10.4 hb1b8bf9_0 heapdict 1.0.1 py_0 html5lib 1.1 py_0 icu 58.2 he6710b0_3 idna 2.10 pyhd3eb1b0_0 imageio 2.9.0 pyhd3eb1b0_0 imagesize 1.2.0 pyhd3eb1b0_0 importlib-metadata 3.10.0 py38h06a4308_0 importlib_metadata 3.10.0 hd3eb1b0_0 iniconfig 1.1.1 pyhd3eb1b0_0 intel-openmp 2021.2.0 h06a4308_610 intervaltree 3.1.0 py_0 ipykernel 5.3.4 py38h5ca1d4c_0 ipython 7.22.0 py38hb070fc8_0 ipython_genutils 0.2.0 pyhd3eb1b0_1 ipywidgets 7.6.3 pyhd3eb1b0_1 isort 5.8.0 pyhd3eb1b0_0 itsdangerous 1.1.0 pyhd3eb1b0_0 jbig 2.1 hdba287a_0 jdcal 1.4.1 py_0 jedi 0.17.2 py38h06a4308_1 jeepney 0.6.0 pyhd3eb1b0_0 jinja2 2.11.3 pyhd3eb1b0_0 joblib 1.0.1 pyhd3eb1b0_0 jpeg 9b h024ee3a_2 json5 0.9.5 py_0 jsonschema 3.2.0 py_2 jupyter 1.0.0 py38_7 jupyter-packaging 0.7.12 pyhd3eb1b0_0 jupyter_client 6.1.12 pyhd3eb1b0_0 jupyter_console 6.4.0 pyhd3eb1b0_0 jupyter_core 4.7.1 py38h06a4308_0 jupyter_server 1.4.1 py38h06a4308_0 jupyterlab 3.0.14 pyhd3eb1b0_1 jupyterlab_pygments 0.1.2 py_0 jupyterlab_server 2.4.0 pyhd3eb1b0_0 jupyterlab_widgets 1.0.0 pyhd3eb1b0_1 keyring 22.3.0 py38h06a4308_0 kiwisolver 1.3.1 py38h2531618_0 krb5 1.18.2 h173b8e3_0 lazy-object-proxy 1.6.0 py38h27cfd23_0 lcms2 2.12 h3be6417_0 ld_impl_linux-64 2.33.1 h53a641e_7 libarchive 3.4.2 h62408e4_0 
libcurl 7.71.1 h20c2e04_1 libedit 3.1.20210216 h27cfd23_1 libev 4.33 h7b6447c_0 libffi 3.3 he6710b0_2 libgcc-ng 9.1.0 hdf63c60_0 libgfortran-ng 7.3.0 hdf63c60_0 liblief 0.10.1 he6710b0_0 libllvm10 10.0.1 hbcb73fb_5 libpng 1.6.37 hbc83047_0 libsodium 1.0.18 h7b6447c_0 libspatialindex 1.9.3 h2531618_0 libssh2 1.9.0 h1ba5d50_1 libstdcxx-ng 9.1.0 hdf63c60_0 libtiff 4.2.0 h85742a9_0 libtool 2.4.6 h7b6447c_1005 libuuid 1.0.3 h1bed415_2 libuv 1.40.0 h7b6447c_0 libwebp-base 1.2.0 h27cfd23_0 libxcb 1.14 h7b6447c_0 libxml2 2.9.10 hb55368b_3 libxslt 1.1.34 hc22bd24_0 llvmlite 0.36.0 py38h612dafd_4 locket 0.2.1 py38h06a4308_1 lxml 4.6.3 py38h9120a33_0 lz4-c 1.9.3 h2531618_0 lzo 2.10 h7b6447c_2 markupsafe 1.1.1 py38h7b6447c_0 matplotlib 3.3.4 py38h06a4308_0 matplotlib-base 3.3.4 py38h62a2d02_0 mccabe 0.6.1 py38_1 mistune 0.8.4 py38h7b6447c_1000 mkl 2021.2.0 h06a4308_296 mkl-service 2.3.0 py38h27cfd23_1 mkl_fft 1.3.0 py38h42c9631_2 mkl_random 1.2.1 py38ha9443f7_2 mock 4.0.3 pyhd3eb1b0_0 more-itertools 8.7.0 pyhd3eb1b0_0 mpc 1.1.0 h10f8cd9_1 mpfr 4.0.2 hb69a4c5_1 mpmath 1.2.1 py38h06a4308_0 msgpack-python 1.0.2 py38hff7bd54_1 multipledispatch 0.6.0 py38_0 mypy_extensions 0.4.3 py38_0 navigator-updater 0.2.1 py38_0 nbclassic 0.2.6 pyhd3eb1b0_0 nbclient 0.5.3 pyhd3eb1b0_0 nbconvert 6.0.7 py38_0 nbformat 5.1.3 pyhd3eb1b0_0 ncurses 6.2 he6710b0_1 nest-asyncio 1.5.1 pyhd3eb1b0_0 networkx 2.5 py_0 nltk 3.6.1 pyhd3eb1b0_0 nose 1.3.7 pyhd3eb1b0_1006 notebook 6.3.0 py38h06a4308_0 numba 0.53.1 py38ha9443f7_0 numexpr 2.7.3 py38h22e1b3c_1 numpy 1.20.1 py38h93e21f0_0 numpy-base 1.20.1 py38h7d8b39e_0 numpydoc 1.1.0 pyhd3eb1b0_1 olefile 0.46 py_0 openpyxl 3.0.7 pyhd3eb1b0_0 openssl 1.1.1k h27cfd23_0 packaging 20.9 pyhd3eb1b0_0 pandas 1.2.4 py38h2531618_0 pandoc 2.12 h06a4308_0 pandocfilters 1.4.3 py38h06a4308_1 pango 1.45.3 hd140c19_0 parso 0.7.0 py_0 partd 1.2.0 pyhd3eb1b0_0 patchelf 0.12 h2531618_1 path 15.1.2 py38h06a4308_0 path.py 12.5.0 0 pathlib2 2.3.5 py38h06a4308_2 pathspec 0.7.0 py_0 
patsy 0.5.1 py38_0 pcre 8.44 he6710b0_0 pep8 1.7.1 py38_0 pexpect 4.8.0 pyhd3eb1b0_3 pickleshare 0.7.5 pyhd3eb1b0_1003 pillow 8.2.0 py38he98fc37_0 pip 21.0.1 py38h06a4308_0 pixman 0.40.0 h7b6447c_0 pkginfo 1.7.0 py38h06a4308_0 pluggy 0.13.1 py38h06a4308_0 ply 3.11 py38_0 prometheus_client 0.10.1 pyhd3eb1b0_0 prompt-toolkit 3.0.17 pyh06a4308_0 prompt_toolkit 3.0.17 hd3eb1b0_0 psutil 5.8.0 py38h27cfd23_1 ptyprocess 0.7.0 pyhd3eb1b0_2 py 1.10.0 pyhd3eb1b0_0 py-lief 0.10.1 py38h403a769_0 pycodestyle 2.6.0 pyhd3eb1b0_0 pycosat 0.6.3 py38h7b6447c_1 pycparser 2.20 py_2 pycurl 7.43.0.6 py38h1ba5d50_0 pydocstyle 6.0.0 pyhd3eb1b0_0 pyerfa 1.7.3 py38h27cfd23_0 pyflakes 2.2.0 pyhd3eb1b0_0 pygments 2.8.1 pyhd3eb1b0_0 pylint 2.7.4 py38h06a4308_1 pyls-black 0.4.6 hd3eb1b0_0 pyls-spyder 0.3.2 pyhd3eb1b0_0 pyodbc 4.0.30 py38he6710b0_0 pyopenssl 20.0.1 pyhd3eb1b0_1 pyparsing 2.4.7 pyhd3eb1b0_0 pyqt 5.9.2 py38h05f1152_4 pyrsistent 0.17.3 py38h7b6447c_0 pysocks 1.7.1 py38h06a4308_0 pytables 3.6.1 py38h9fd0a39_0 pytest 6.2.3 py38h06a4308_2 python 3.8.8 hdb3f193_5 python-dateutil 2.8.1 pyhd3eb1b0_0 python-jsonrpc-server 0.4.0 py_0 python-language-server 0.36.2 pyhd3eb1b0_0 python-libarchive-c 2.9 pyhd3eb1b0_1 pytz 2021.1 pyhd3eb1b0_0 pywavelets 1.1.1 py38h7b6447c_2 pyxdg 0.27 pyhd3eb1b0_0 pyyaml 5.4.1 py38h27cfd23_1 pyzmq 20.0.0 py38h2531618_1 qdarkstyle 2.8.1 py_0 qt 5.9.7 h5867ecd_1 qtawesome 1.0.2 pyhd3eb1b0_0 qtconsole 5.0.3 pyhd3eb1b0_0 qtpy 1.9.0 py_0 readline 8.1 h27cfd23_0 regex 2021.4.4 py38h27cfd23_0 requests 2.25.1 pyhd3eb1b0_0 ripgrep 12.1.1 0 rope 0.18.0 py_0 rtree 0.9.7 py38h06a4308_1 ruamel_yaml 0.15.100 py38h27cfd23_0 scikit-image 0.18.1 py38ha9443f7_0 scikit-learn 0.24.1 py38ha9443f7_0 scipy 1.6.2 py38had2a1c9_1 seaborn 0.11.1 pyhd3eb1b0_0 secretstorage 3.3.1 py38h06a4308_0 send2trash 1.5.0 pyhd3eb1b0_1 setuptools 52.0.0 py38h06a4308_0 simplegeneric 0.8.1 py38_2 singledispatch 3.6.1 pyhd3eb1b0_1001 sip 4.19.13 py38he6710b0_0 six 1.15.0 py38h06a4308_0 sniffio 1.2.0 
py38h06a4308_1 snowballstemmer 2.1.0 pyhd3eb1b0_0 sortedcollections 2.1.0 pyhd3eb1b0_0 sortedcontainers 2.3.0 pyhd3eb1b0_0 soupsieve 2.2.1 pyhd3eb1b0_0 sphinx 4.0.1 pyhd3eb1b0_0 sphinxcontrib 1.0 py38_1 sphinxcontrib-applehelp 1.0.2 pyhd3eb1b0_0 sphinxcontrib-devhelp 1.0.2 pyhd3eb1b0_0 sphinxcontrib-htmlhelp 1.0.3 pyhd3eb1b0_0 sphinxcontrib-jsmath 1.0.1 pyhd3eb1b0_0 sphinxcontrib-qthelp 1.0.3 pyhd3eb1b0_0 sphinxcontrib-serializinghtml 1.1.4 pyhd3eb1b0_0 sphinxcontrib-websupport 1.2.4 py_0 spyder 4.2.5 py38h06a4308_0 spyder-kernels 1.10.2 py38h06a4308_0 sqlalchemy 1.4.15 py38h27cfd23_0 sqlite 3.35.4 hdfb4753_0 statsmodels 0.12.2 py38h27cfd23_0 sympy 1.8 py38h06a4308_0 tbb 2020.3 hfd86e86_0 tblib 1.7.0 py_0 terminado 0.9.4 py38h06a4308_0 testpath 0.4.4 pyhd3eb1b0_0 textdistance 4.2.1 pyhd3eb1b0_0 threadpoolctl 2.1.0 pyh5ca1d4c_0 three-merge 0.1.1 pyhd3eb1b0_0 tifffile 2020.10.1 py38hdd07704_2 tk 8.6.10 hbc83047_0 toml 0.10.2 pyhd3eb1b0_0 toolz 0.11.1 pyhd3eb1b0_0 tornado 6.1 py38h27cfd23_0 tqdm 4.59.0 pyhd3eb1b0_1 traitlets 5.0.5 pyhd3eb1b0_0 typed-ast 1.4.2 py38h27cfd23_1 typing_extensions 3.7.4.3 pyha847dfd_0 ujson 4.0.2 py38h2531618_0 unicodecsv 0.14.1 py38_0 unixodbc 2.3.9 h7b6447c_0 urllib3 1.26.4 pyhd3eb1b0_0 watchdog 1.0.2 py38h06a4308_1 wcwidth 0.2.5 py_0 webencodings 0.5.1 py38_1 werkzeug 1.0.1 pyhd3eb1b0_0 wheel 0.36.2 pyhd3eb1b0_0 widgetsnbextension 3.5.1 py38_0 wrapt 1.12.1 py38h7b6447c_1 wurlitzer 2.1.0 py38h06a4308_0 xlrd 2.0.1 pyhd3eb1b0_0 xlsxwriter 1.3.8 pyhd3eb1b0_0 xlwt 1.3.0 py38_0 xmltodict 0.12.0 py_0 xz 5.2.5 h7b6447c_0 yaml 0.2.5 h7b6447c_0 yapf 0.31.0 pyhd3eb1b0_0 zeromq 4.3.4 h2531618_0 zict 2.0.0 pyhd3eb1b0_0 zipp 3.4.1 pyhd3eb1b0_0 zlib 1.2.11 h7b6447c_3 zope 1.0 py38_1 zope.event 4.5.0 py38_0 zope.interface 5.3.0 py38h27cfd23_0 zstd 1.4.5 h9ceee32_0 ```

Installation procedure:

  1. Fresh Ubuntu 20.04 install
  2. Blacklist nouveau drivers
  3. sudo sh cuda_11.2.0_460.27.04_linux.run
  4. bash Anaconda3-2021.05-Linux-x86_64.sh
  5. conda create -n rapids-0.19 -c rapidsai -c nvidia -c conda-forge rapids-blazing=0.19 python=3.8 cudatoolkit=11.2

Nanthini10 commented 3 years ago

Is it working on Colab with n_samples > 6846?

I'm unable to reproduce this with 21.06. Can you try using the latest version of RAPIDS and seeing if the error still persists?

You can install it either from source or using Docker, as follows: docker pull rapidsai/rapidsai-core-dev-nightly:21.06-cuda11.2-devel-ubuntu18.04-py3.8

Oleg-dM commented 3 years ago

Thanks for the quick answer. I tried the Docker image and unfortunately still got the same issue (and the RF fitting is much slower than with 0.19).

Any other suggestions? I'm kinda losing hope...

hcho3 commented 3 years ago

@Oleg-dM It appears that the issue is specific to your desktop. I tried setting up RAPIDS fresh on an AWS EC2 virtual machine and the example ran successfully with n_samples=6486. Here is how I set it up.

  1. Create a new EC2 instance with type g4dn.2xlarge.
  2. Install CUDA 11.2 by following directions in https://developer.nvidia.com/cuda-11.2.2-download-archive.
  3. Install Miniconda from https://docs.conda.io/en/latest/miniconda.html
  4. Set up Conda environment with the RAPIDS package by running
    conda create -n rapids -c rapidsai-nightly -c nvidia -c conda-forge rapids=21.06 python=3.8 cudatoolkit=11.2

Oleg-dM commented 3 years ago

Issue identified: the error occurs with the experimental backend, but everything works well with split_algo = 0 (HIST), which relies on the default backend.

As a reminder: the error occurs in quantile.cuh, line 236, when calling cub::DeviceRadixSort::SortKeys (see the original post for details).

@hcho3 @Nanthini10

Also, I tested the documentation example on 2 distinct machines, each with the same fresh from-scratch install of Ubuntu 20, CUDA and conda, and the error persisted:

hcho3 commented 3 years ago

I just tried running the script using my workstation (Quadro RTX 8000, CUDA 11.0) and could not reproduce the error.

Maybe the error is specific to older generations of graphics cards?

Oleg-dM commented 3 years ago

Thank you @hcho3, do you know who could look into that specific issue? Should we put this into a backlog somehow?

Maybe @canonizer?

dumerrill commented 3 years ago

FWIW, usually a cudaErrorInvalidValue error when reported by a call to CUB (or Thrust) is just "coughing up" a latent CUDA Runtime errno left over from some previous operation (e.g., a bad cudaMemcpy()), and has nothing to do with the sort itself.
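To make that mechanism concrete, here is a toy Python model of a single "last error" slot. It is not the CUDA Runtime API: the names `ToyRuntime`, `bad_copy`, `sort_keys`, `get_last_error` and `run` are all hypothetical and only mimic the behaviour described above, where an earlier failure is recorded, nobody checks it, and the next call that does check ends up reporting it.

```python
class ToyRuntime:
    """Toy stand-in for a runtime that keeps one 'last error' slot."""

    SUCCESS = "success"

    def __init__(self):
        self._last_error = self.SUCCESS

    def bad_copy(self):
        # An earlier operation fails, but the caller never checks the result.
        self._last_error = "invalid argument"

    def get_last_error(self):
        # Return the recorded error and reset the slot, so a later check
        # only sees errors raised after this point.
        error, self._last_error = self._last_error, self.SUCCESS
        return error

    def sort_keys(self, keys):
        # The sort itself succeeds, but the library checks the error slot
        # afterwards and reports whatever it finds there.
        result = sorted(keys)
        error = self.get_last_error()
        if error != self.SUCCESS:
            raise RuntimeError(f"sort_keys reported: {error}")
        return result


def run(clear_before_sort):
    """Run a bad copy followed by a sort, optionally checking in between."""
    rt = ToyRuntime()
    rt.bad_copy()
    # Checking right after the copy attributes the failure to the copy.
    stale = rt.get_last_error() if clear_before_sort else None
    try:
        return stale, rt.sort_keys([3, 1, 2])
    except RuntimeError as exc:
        return stale, str(exc)
```

In this model, `run(False)` blames the sort for the copy's failure, while `run(True)` surfaces the error at the copy and lets the sort complete. Checking for errors immediately before the suspect call is the usual way to rule this scenario in or out.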

Oleg-dM commented 3 years ago

Thanks a lot Duane for jumping in. Are the RAPIDS guys understaffed? They don't seem to bother with the simplest stuff; it feels more and more like a marketing library to make you buy expensive NVIDIA GPUs.

Have a good week

-- Oleg Del Maschio

hcho3 commented 3 years ago

@Oleg-dM We apologize for the delay. We will follow up on this issue as soon as we can.

Oleg-dM commented 3 years ago

Thank you Philip - do you have an idea of a timeline? Days or weeks?

vinaydes commented 3 years ago

Hi @Oleg-dM, I had access to a sm_61 device thus I tried to debug the issue. Here are my observations:

  1. The issue is specific to sm_61 devices. Both your GPUs are sm_61, that's why you see this issue.
  2. Workaround: The issue appears only when you install the pre-built libcuml from the conda channel. If you build from source, the issue goes away. Building from source is not super complicated either. All it takes is creating a conda environment and invoking ./build.sh. You can find more here: https://github.com/rapidsai/cuml/blob/branch-21.08/BUILD.md.
  3. I am currently not sure what the reason is for such a difference between pre-built and built from source. A key difference between the two is which CUDA PTX objects are present in libcuml. The pre-built binary has PTX for sm_60, which should work for sm_61, so it should not really matter. However, when I built from source for sm_60 (just like the pre-built binary), the issue started appearing again. More investigation is needed to refine the root cause further.
  4. @dumerrill To eliminate stale errors, I added CUDA_CHECK(cudaDeviceSynchronize()) just before line quantile.cuh#L75l. The error was still raised by the cub::DeviceRadixSort::SortKeys function. Inside the function, the kernel cub::DeviceRadixSortDownsweepKernel seems to throw the error at launch.
  5. I could reproduce the error with the C++ benchmarking code from cuML, which speeds up the process of debugging. However, unlike the Python example, which fails every time, the C++ one fails intermittently.

In short: @Oleg-dM or anyone else affected by this issue can use the workaround described above while we continue to debug the issue further.
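For readers who want to check whether their own machine falls into the affected group, the GPU's compute capability can be queried from Python. This is a minimal sketch, assuming Numba is installed (it appears in the conda list in the original post); `is_sm_61` and `current_compute_capability` are hypothetical helper names, not part of cuML.

```python
def is_sm_61(compute_capability):
    """Return True if a (major, minor) pair denotes an sm_61 device."""
    major, minor = compute_capability
    return (int(major), int(minor)) == (6, 1)


def current_compute_capability():
    """Query the active GPU through Numba (needs a CUDA-capable device)."""
    # Imported lazily so that is_sm_61 stays usable on machines without a GPU.
    from numba import cuda

    return tuple(cuda.get_current_device().compute_capability)
```

On a machine with a GPU, `is_sm_61(current_compute_capability())` indicates whether the observations above are likely to apply.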

Oleg-dM commented 3 years ago

Amazing thank you Vinay - will give it a try asap

update: @vinaydes I keep running into the error "#error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this." Do you know where to define this flag? Wasn't able to - tried the build.sh way and the manual one but no luck.

vinaydes commented 3 years ago

@Oleg-dM It is probably not a good idea to ignore this error. Just to confirm, are these the steps you followed for building the code:

git clone https://github.com/rapidsai/cuml.git
cd cuml
git checkout branch-21.06 # last stable
conda env create --name cuml-dev-11.2 --file conda/environments/cuml_dev_cuda11.2.yml
conda activate cuml-dev-11.2
./build.sh clean
./build.sh

If yes, then can you share the list of packages installed in the environment? You can get the list by activating the environment and then executing conda list.

Oleg-dM commented 3 years ago

I followed these exact instructions (except that I downloaded the 21.06 sources zip and unzipped it manually) and ran into the same error as described above. The compilation starts with the error below (the installed packages are listed after it).

Any idea where the incompatibility could come from? CUB 1.11 should be OK with CUDA 11.2?

-- Configuring done
-- Generating done
-- Build files have been written to: /home/oleg/Downloads/cuml-branch-21.06/cpp/build
[1/226] Building CUDA object CMakeFiles/cuml++.dir/src/fil/infer.cu.o
FAILED: CMakeFiles/cuml++.dir/src/fil/infer.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DCUML_CPP_API -DDISABLE_CUSPARSE_DEPRECATED -DDMLC_CORE_USE_CMAKE -DDMLC_USE_CXX11=1 -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DUSEXOPEN2K8 -DcumlEXPORTS -I../include -I../src -I../src_prims -I/include -I/home/oleg/anaconda3/envs/rapids/include -I_deps/thrust-src -I_deps/thrust-src/dependencies/cub -I_deps/raft-src/cpp/include -isystem=/home/oleg/anaconda3/envs/cuml_dev/include -isystem=/usr/local/cuda/include -isystem=/home/oleg/anaconda3/envs/cuml_dev/include/cumlprims -O3 -DNDEBUG --generate-code=arch=compute_61,code=[sm_61] -Xcompiler=-fPIC --expt-extended-lambda --expt-relaxed-constexpr -Xcompiler=-Wno-deprecated-declarations -Xcompiler=-fopenmp -std=c++17 -MD -MT CMakeFiles/cuml++.dir/src/fil/infer.cu.o -MF CMakeFiles/cuml++.dir/src/fil/infer.cu.o.d -x cu -c ../src/fil/infer.cu -o CMakeFiles/cuml++.dir/src/fil/infer.cu.o
In file included from /home/oleg/anaconda3/envs/rapids/include/thrust/system/cuda/detail/execution_policy.h:33,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/iterator/detail/device_system_tag.h:23,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/iterator/iterator_traits.h:111,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/detail/type_traits/pointer_traits.h:23,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/detail/raw_pointer_cast.h:20,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/detail/raw_reference_cast.h:20,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/detail/functional/actor.h:33,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/detail/functional/placeholder.h:20,
                 from /home/oleg/anaconda3/envs/rapids/include/thrust/functional.h:26,
                 from ../src/fil/infer.cu:20:
/home/oleg/anaconda3/envs/rapids/include/thrust/system/cuda/config.h:78:2: error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.
   78 | #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.
      |  ^~~~~
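When reading a failing command like the one above, it can help to list the include directories in the order they were passed, since that order decides which copy of a header is found first. Below is a small sketch for doing that. `include_dirs` is a hypothetical helper written for this illustration, `EXAMPLE_COMMAND` is a shortened excerpt of the nvcc line from the log, and the claim about search order assumes the usual GCC-style behaviour (directories searched in the order given, with -I before -isystem).

```python
import shlex


def include_dirs(command):
    """Return (flag, directory) pairs from a compiler command, in order."""
    dirs = []
    tokens = shlex.split(command)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-I", "-isystem") and i + 1 < len(tokens):
            # Flag and directory given as two separate arguments.
            dirs.append((tok, tokens[i + 1]))
            i += 2
            continue
        if tok.startswith("-isystem="):
            dirs.append(("-isystem", tok[len("-isystem="):]))
        elif tok.startswith("-I") and len(tok) > 2:
            dirs.append(("-I", tok[2:]))
        i += 1
    return dirs


# Shortened excerpt of the nvcc invocation from the build log.
EXAMPLE_COMMAND = (
    "/usr/local/cuda/bin/nvcc -I../include "
    "-I/home/oleg/anaconda3/envs/rapids/include "
    "-I_deps/thrust-src -I_deps/thrust-src/dependencies/cub "
    "-isystem=/usr/local/cuda/include -c ../src/fil/infer.cu"
)
```

Applied to the excerpt, the helper shows the `envs/rapids/include` directory listed ahead of `_deps/thrust-src`, which matches the log: the Thrust header that raises the `#error` is the one under `envs/rapids/include`.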

conda list output below:

click here ``` (cuml_dev) oleg@oleg-B550M-DS3H:~/Downloads/cuml-branch-21.06$ conda list packages in environment at /home/oleg/anaconda3/envs/cuml_dev: Name Version Build Channel _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 1_gnu conda-forge abseil-cpp 20210324.1 h9c3ff4c_0 conda-forge aiobotocore 1.3.1 pyhd8ed1ab_0 conda-forge aiohttp 3.7.4.post0 py38h497a2fe_0 conda-forge aioitertools 0.7.1 pyhd8ed1ab_0 conda-forge alabaster 0.7.12 py_0 conda-forge anyio 3.2.0 py38h578d9bd_0 conda-forge appdirs 1.4.4 pyh9f0ad1d_0 conda-forge argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge arrow-cpp 1.0.1 py38h40c9144_40_cuda conda-forge arrow-cpp-proc 3.0.0 cuda conda-forge asn1crypto 1.4.0 pyh9f0ad1d_0 conda-forge asvdb 0.4.1 gd6cd8f2_36 rapidsai async-timeout 3.0.1 py_1000 conda-forge async_generator 1.10 py_0 conda-forge atk-1.0 2.36.0 h3371d22_4 conda-forge attrs 21.2.0 pyhd8ed1ab_0 conda-forge autoconf 2.69 pl5320h36c2ea0_10 conda-forge automake 1.16.2 pl5320ha770c72_3 conda-forge aws-c-cal 0.5.11 h95a6274_0 conda-forge aws-c-common 0.6.2 h7f98852_0 conda-forge aws-c-event-stream 0.2.7 h3541f99_13 conda-forge aws-c-io 0.10.5 hfb6a706_0 conda-forge aws-checksums 0.1.11 ha31a3da_7 conda-forge aws-sam-translator 1.36.0 pyhd8ed1ab_0 conda-forge aws-sdk-cpp 1.8.186 hb4091e7_3 conda-forge aws-xray-sdk 2.8.0 pyhd8ed1ab_0 conda-forge babel 2.9.1 pyh44b312d_0 conda-forge backcall 0.2.0 pyh9f0ad1d_0 conda-forge backports 1.0 py_2 conda-forge backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge beautifulsoup4 4.9.3 pyhb0f4dca_0 conda-forge benchmark 1.5.1 he1b5a44_2 conda-forge black 19.10b0 py_4 conda-forge blas 2.105 netlib conda-forge blas-devel 3.9.0 5_netlib conda-forge bleach 3.3.0 pyh44b312d_0 conda-forge blinker 1.4 py_1 conda-forge blosc 1.21.0 h9c3ff4c_0 conda-forge bokeh 2.2.3 py38h578d9bd_0 conda-forge boost 1.72.0 py38h1e42940_1 conda-forge boost-cpp 1.72.0 h9d3c048_4 conda-forge boto3 1.17.49 pyhd8ed1ab_0 conda-forge botocore 1.20.49 pyhd8ed1ab_0 
conda-forge brotli 1.0.9 h9c3ff4c_4 conda-forge brotlipy 0.7.0 py38h497a2fe_1001 conda-forge brunsli 0.1 h9c3ff4c_0 conda-forge bzip2 1.0.8 h7f98852_4 conda-forge c-ares 1.17.1 h7f98852_1 conda-forge ca-certificates 2021.5.30 ha878542_0 conda-forge cachetools 4.2.2 pyhd8ed1ab_0 conda-forge cairo 1.16.0 h6cf1ce9_1008 conda-forge certifi 2021.5.30 py38h578d9bd_0 conda-forge cffi 1.14.5 py38ha65f79e_0 conda-forge cfitsio 3.470 hb418390_7 conda-forge cfn-lint 0.51.0 py38h578d9bd_0 conda-forge chardet 4.0.0 py38h578d9bd_1 conda-forge charls 2.2.0 h9c3ff4c_0 conda-forge clang 8.0.1 hc9558a2_2 conda-forge clang-tools 8.0.1 hc9558a2_2 conda-forge clangxx 8.0.1 2 conda-forge click 7.1.2 pyh9f0ad1d_0 conda-forge click-plugins 1.1.1 py_0 conda-forge cligj 0.7.2 pyhd8ed1ab_0 conda-forge cloudpickle 1.6.0 py_0 conda-forge cmake 3.20.2 h541d2ed_0 conda-forge cmake-format 0.6.11 pyh9f0ad1d_0 conda-forge cmake_setuptools 0.1.3 py_0 rapidsai cmarkgfm 0.5.3 py38h497a2fe_0 conda-forge colorama 0.4.4 pyh9f0ad1d_0 conda-forge colorcet 2.0.6 pyhd8ed1ab_0 conda-forge commonmark 0.9.1 py_0 conda-forge conda 4.8.3 py38h32f6830_2 conda-forge conda-build 3.20.3 py38h32f6830_0 conda-forge conda-package-handling 1.7.3 py38h497a2fe_0 conda-forge conda-verify 3.1.1 py38h578d9bd_1003 conda-forge coverage 5.5 py38h497a2fe_0 conda-forge cryptography 3.4.7 py38ha5dfef3_0 conda-forge cub 1.11.0 ha770c72_1 conda-forge cudatoolkit 11.2.72 h2bc3f7f_0 nvidia cudf 21.06.01 cuda_11.2_py38_g101fc0fda4_2 rapidsai cupy 9.1.0 py38ha69542f_0 conda-forge curl 7.77.0 hea6ffbf_0 conda-forge cycler 0.10.0 py_2 conda-forge cyrus-sasl 2.1.27 h230043b_2 conda-forge cython 0.29.23 py38h709712a_1 conda-forge cytoolz 0.11.0 py38h497a2fe_3 conda-forge dask 2021.5.1 pypi_0 pypi dask-cuda 21.06.00 py38_0 rapidsai dask-cudf 21.06.01 py38_g101fc0fda4_2 rapidsai dask-glm 0.2.0 py_1 conda-forge dask-labextension 5.0.2 pyhd8ed1ab_0 conda-forge dask-ml 1.9.0 pyhd8ed1ab_0 conda-forge dataclasses 0.8 pyhc8e2a94_1 conda-forge 
datashader 0.11.1 pyh9f0ad1d_0 conda-forge datashape 0.5.4 py_1 conda-forge dbus 1.13.6 h48d8840_2 conda-forge decorator 4.4.2 py_0 conda-forge defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge distributed 2021.5.1 pypi_0 pypi dlpack 0.5 h9c3ff4c_0 conda-forge docker-py 5.0.0 py38h578d9bd_0 conda-forge docker-pycreds 0.4.0 py_0 conda-forge docutils 0.17.1 py38h578d9bd_0 conda-forge double-conversion 3.1.5 h9c3ff4c_2 conda-forge doxygen 1.8.20 had0d8f1_0 conda-forge ecdsa 0.17.0 pyhd8ed1ab_0 conda-forge entrypoints 0.3 pyhd8ed1ab_1003 conda-forge execnet 1.9.0 pyhd8ed1ab_0 conda-forge expat 2.4.1 h9c3ff4c_0 conda-forge fa2 0.3.5 py38h1e0a361_0 conda-forge faiss-proc 1.0.0 cuda rapidsai fastavro 1.4.1 py38h497a2fe_0 conda-forge fastrlock 0.6 py38h709712a_0 conda-forge feather-format 0.4.1 pyh9f0ad1d_0 conda-forge filelock 3.0.12 pyh9f0ad1d_0 conda-forge filterpy 1.4.5 py_1 conda-forge fiona 1.8.20 py38hdb5a769_0 conda-forge flake8 3.8.4 py_0 conda-forge flask 2.0.1 pyhd8ed1ab_0 conda-forge flask_cors 3.0.10 pyhd3deb0d_0 conda-forge flatbuffers 1.10.0 hf484d3e_1002 conda-forge font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge font-ttf-inconsolata 3.000 h77eed37_0 conda-forge font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge font-ttf-ubuntu 0.83 hab24e00_0 conda-forge fontconfig 2.13.1 hba837de_1005 conda-forge fonts-conda-ecosystem 1 0 conda-forge fonts-conda-forge 1 0 conda-forge freetype 2.10.4 h0708190_1 conda-forge freexl 1.0.6 h7f98852_0 conda-forge fribidi 1.0.10 h36c2ea0_0 conda-forge fsspec 2021.6.0 pyhd8ed1ab_0 conda-forge future 0.18.2 py38h578d9bd_3 conda-forge gcsfs 2021.6.0 pyhd8ed1ab_0 conda-forge gdal 3.2.2 py38h507a4fd_5 conda-forge gdk-pixbuf 2.42.6 h04a7f16_0 conda-forge geopandas 0.9.0 pyhd8ed1ab_1 conda-forge geopandas-base 0.9.0 pyhd8ed1ab_1 conda-forge geos 3.9.1 h9c3ff4c_2 conda-forge geotiff 1.6.0 hcf90da6_5 conda-forge gettext 0.19.8.1 h0b5b191_1005 conda-forge gflags 2.2.2 he1b5a44_1004 conda-forge giflib 5.2.1 h36c2ea0_2 conda-forge git 
2.30.2 pl5320h24fefe6_1 conda-forge glib 2.68.3 h9c3ff4c_0 conda-forge glib-tools 2.68.3 h9c3ff4c_0 conda-forge glob2 0.7 py_0 conda-forge glog 0.5.0 h48cff8f_0 conda-forge gmock 1.10.0 h4bd325d_7 conda-forge gmp 6.2.1 h58526e2_0 conda-forge google-auth 1.30.2 pyh6c4a22f_0 conda-forge google-auth-oauthlib 0.4.4 pyhd8ed1ab_0 conda-forge graphite2 1.3.13 h58526e2_1001 conda-forge graphviz 2.47.3 h85b4f2f_0 conda-forge grpc-cpp 1.38.0 h2519f57_0 conda-forge gtest 1.10.0 h4bd325d_7 conda-forge gtk2 2.24.33 h539f30e_1 conda-forge gts 0.7.6 h64030ff_2 conda-forge harfbuzz 2.8.1 h83ec7ef_0 conda-forge hdbscan 0.8.27 py38h5c078b8_0 conda-forge hdf4 4.2.15 h10796ff_3 conda-forge hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge heapdict 1.0.1 py_0 conda-forge holoviews 1.14.4 pyhd8ed1ab_0 conda-forge httpretty 1.1.3 pyhd8ed1ab_0 conda-forge huggingface_hub 0.0.12 pyhd8ed1ab_0 conda-forge hypothesis 6.14.0 pyhd8ed1ab_0 conda-forge icu 68.1 h58526e2_0 conda-forge idna 2.10 pyh9f0ad1d_0 conda-forge imagecodecs 2021.3.31 py38h1455ab2_0 conda-forge imageio 2.9.0 py_0 conda-forge imagesize 1.2.0 py_0 conda-forge importlib-metadata 4.5.0 py38h578d9bd_0 conda-forge importlib_metadata 4.5.0 hd8ed1ab_0 conda-forge inflection 0.5.1 pyh9f0ad1d_0 conda-forge iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge ipykernel 5.5.5 py38hd0cf306_0 conda-forge ipython 7.15.0 py38h32f6830_0 conda-forge ipython_genutils 0.2.0 py_1 conda-forge ipywidgets 7.6.3 pyhd3deb0d_0 conda-forge isort 5.0.9 py38h32f6830_0 conda-forge itsdangerous 2.0.1 pyhd8ed1ab_0 conda-forge jedi 0.17.2 py38h578d9bd_1 conda-forge jeepney 0.6.0 pyhd8ed1ab_0 conda-forge jinja2 3.0.1 pyhd8ed1ab_0 conda-forge jmespath 0.10.0 pyh9f0ad1d_0 conda-forge joblib 1.0.1 pyhd8ed1ab_0 conda-forge jpeg 9d h36c2ea0_0 conda-forge json-c 0.15 h98cffda_0 conda-forge json5 0.9.5 pyh9f0ad1d_0 conda-forge jsondiff 1.1.2 py_0 conda-forge jsonpatch 1.24 py_0 conda-forge jsonpointer 2.0 py_0 conda-forge jsonschema 3.2.0 pyhd8ed1ab_3 conda-forge junit-xml 1.9 
pyh9f0ad1d_0 conda-forge jupyter-packaging 0.7.12 pyhd8ed1ab_0 conda-forge jupyter-server-proxy 3.0.2 pyhd8ed1ab_0 conda-forge jupyter_client 6.1.12 pyhd8ed1ab_0 conda-forge jupyter_core 4.7.1 py38h578d9bd_0 conda-forge jupyter_server 1.8.0 pyhd8ed1ab_0 conda-forge jupyter_sphinx 0.3.1 py38h578d9bd_1 conda-forge jupyterlab 3.0.16 pyhd8ed1ab_0 conda-forge jupyterlab-nvdashboard 0.6.0 py_0 rapidsai jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge jupyterlab_server 2.6.0 pyhd8ed1ab_0 conda-forge jupyterlab_widgets 1.0.0 pyhd8ed1ab_1 conda-forge jxrlib 1.1 h7f98852_2 conda-forge kealib 1.4.14 hcc255d8_2 conda-forge keyring 23.0.1 py38h578d9bd_0 conda-forge kiwisolver 1.3.1 py38h1fd1430_1 conda-forge krb5 1.19.1 hcc1bbae_0 conda-forge lapack 3.9.0 netlib conda-forge lcms2 2.12 hddcbb42_0 conda-forge ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge lerc 2.2.1 h9c3ff4c_0 conda-forge libaec 1.0.5 h9c3ff4c_0 conda-forge libarchive 3.5.1 h3f442fb_1 conda-forge libblas 3.9.0 5_h92ddd45_netlib conda-forge libcblas 3.9.0 5_h92ddd45_netlib conda-forge libcudf 21.06.01 cuda11.2_g101fc0fda4_2 rapidsai libcumlprims 21.06.00 cuda11.2_gfda2e6c_0 nvidia libcurl 7.77.0 h2574ce0_0 conda-forge libcypher-parser 0.6.2 1 rapidsai libdap4 3.20.6 hd7c4107_2 conda-forge libdeflate 1.7 h7f98852_5 conda-forge libedit 3.1.20191231 he28a2e2_2 conda-forge libev 4.33 h516909a_1 conda-forge libevent 2.1.10 hcdb4288_3 conda-forge libfaiss 1.7.0 cuda112h5bea7ad_8_cuda conda-forge libffi 3.3 h58526e2_2 conda-forge libgcc-ng 9.3.0 h2828fa1_19 conda-forge libgcrypt 1.9.3 h7f98852_1 conda-forge libgd 2.3.2 h78a0170_0 conda-forge libgdal 3.2.2 h679344c_5 conda-forge libgfortran-ng 9.3.0 hff62375_19 conda-forge libgfortran5 9.3.0 hff62375_19 conda-forge libglib 2.68.3 h3e27bee_0 conda-forge libgomp 9.3.0 h2828fa1_19 conda-forge libgpg-error 1.42 h9c3ff4c_0 conda-forge libgsasl 1.8.0 2 conda-forge libhwloc 2.3.0 h5e5b7d1_1 conda-forge libiconv 1.16 h516909a_0 conda-forge libkml 1.3.0 hd79254b_1012 conda-forge 
liblapack 3.9.0 5_h92ddd45_netlib conda-forge liblapacke 3.9.0 5_h92ddd45_netlib conda-forge liblief 0.11.5 h9c3ff4c_0 conda-forge libllvm10 10.0.1 he513fc3_3 conda-forge libllvm8 8.0.1 hc9558a2_0 conda-forge libnetcdf 4.8.0 nompi_hcd642e3_103 conda-forge libnghttp2 1.43.0 h812cca2_0 conda-forge libntlm 1.4 h7f98852_1002 conda-forge libpng 1.6.37 h21135ba_2 conda-forge libpq 13.3 hd57d9b9_0 conda-forge libprotobuf 3.16.0 h780b84a_0 conda-forge librdkafka 1.5.3 hc49e61c_1 conda-forge librmm 21.06.00 cuda11.2_gee432a0_0 rapidsai librsvg 2.50.7 hc3c00ef_0 conda-forge librttopo 1.1.0 h1185371_6 conda-forge libsodium 1.0.18 h36c2ea0_1 conda-forge libspatialindex 1.9.3 h9c3ff4c_3 conda-forge libspatialite 5.0.1 h20cb978_4 conda-forge libssh2 1.9.0 ha56f1ee_6 conda-forge libstdcxx-ng 9.3.0 h6de172a_19 conda-forge libthrift 0.14.1 he6d91bd_2 conda-forge libtiff 4.2.0 hbd63e13_2 conda-forge libtmglib 3.9.0 5_h92ddd45_netlib conda-forge libtool 2.4.6 h58526e2_1007 conda-forge libutf8proc 2.6.1 h7f98852_0 conda-forge libuuid 2.32.1 h7f98852_1000 conda-forge libuv 1.41.0 h7f98852_0 conda-forge libwebp 1.2.0 h3452ae3_0 conda-forge libwebp-base 1.2.0 h7f98852_2 conda-forge libxcb 1.13 h7f98852_1003 conda-forge libxml2 2.9.12 h72842e0_0 conda-forge libxslt 1.1.33 h15afd5d_2 conda-forge libzip 1.8.0 h4de3113_0 conda-forge libzopfli 1.0.3 h9c3ff4c_0 conda-forge lightgbm 3.2.1 py38h709712a_0 conda-forge llvmlite 0.36.0 py38h4630a5e_0 conda-forge locket 0.2.0 py_2 conda-forge lxml 4.6.3 py38hf1fe3a4_0 conda-forge lz4-c 1.9.3 h9c3ff4c_0 conda-forge lzo 2.10 h516909a_1000 conda-forge m4 1.4.18 h516909a_1001 conda-forge make 4.3 hd18ef5c_1 conda-forge mapclassify 2.4.2 pyhd8ed1ab_0 conda-forge markdown 3.3.4 pyhd8ed1ab_0 conda-forge markupsafe 2.0.1 py38h497a2fe_0 conda-forge matplotlib-base 3.4.2 py38hcc49a3a_0 conda-forge mccabe 0.6.1 py_1 conda-forge mimesis 4.0.0 pyh9f0ad1d_0 conda-forge mistune 0.8.4 py38h497a2fe_1003 conda-forge mock 4.0.3 py38h578d9bd_1 conda-forge more-itertools 
8.8.0 pyhd8ed1ab_0 conda-forge moto 2.0.7 pyhd8ed1ab_0 conda-forge msgpack-python 1.0.2 py38h1fd1430_1 conda-forge multidict 5.1.0 py38h497a2fe_1 conda-forge multipledispatch 0.6.0 py_0 conda-forge munch 2.5.0 py_0 conda-forge mypy 0.782 py_0 conda-forge mypy_extensions 0.4.3 py38h578d9bd_3 conda-forge nbclassic 0.3.1 pyhd8ed1ab_1 conda-forge nbclient 0.5.3 pyhd8ed1ab_0 conda-forge nbconvert 6.1.0 py38h578d9bd_0 conda-forge nbformat 5.1.3 pyhd8ed1ab_0 conda-forge nbsphinx 0.8.6 pyhd8ed1ab_1 conda-forge nccl 2.9.9.1 hdc17891_0 conda-forge ncurses 6.2 h58526e2_4 conda-forge nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge networkx 2.5.1 pyhd8ed1ab_0 conda-forge ninja 1.10.2 h4bd325d_0 conda-forge nltk 3.6.2 pyhd8ed1ab_0 conda-forge nodejs 14.15.4 h92b4a50_1 conda-forge notebook 6.4.0 pyha770c72_0 conda-forge numba 0.53.1 py38h8b71fd7_1 conda-forge numpy 1.21.0 py38h9894fe3_0 conda-forge numpydoc 1.1.0 py_1 conda-forge nvtx 0.2.3 py38h497a2fe_0 conda-forge oauthlib 3.1.1 pyhd8ed1ab_0 conda-forge olefile 0.46 pyh9f0ad1d_1 conda-forge openjpeg 2.4.0 hb52868f_1 conda-forge openslide 3.4.1 h978ee9a_3 conda-forge openssl 1.1.1k h7f98852_0 conda-forge orc 1.6.7 h89a63ab_2 conda-forge packaging 20.9 pyh44b312d_0 conda-forge pandas 1.2.5 py38h1abd341_0 conda-forge pandoc 1.19.2 0 conda-forge pandocfilters 1.4.2 py_1 conda-forge panel 0.10.3 pyhd8ed1ab_0 conda-forge pango 1.48.5 hb8ff022_0 conda-forge param 1.10.1 pyhd3deb0d_0 conda-forge parquet-cpp 1.5.1 2 conda-forge parso 0.7.1 pyh9f0ad1d_0 conda-forge partd 1.2.0 pyhd8ed1ab_0 conda-forge patchelf 0.11 he1b5a44_0 conda-forge pathspec 0.8.1 pyhd3deb0d_0 conda-forge patsy 0.5.1 py_0 conda-forge pcre 8.45 h9c3ff4c_0 conda-forge perl 5.32.1 0_h7f98852_perl5 conda-forge pexpect 4.8.0 pyh9f0ad1d_2 conda-forge pickleshare 0.7.5 py_1003 conda-forge pillow 8.2.0 py38ha0e1e83_1 conda-forge pip 21.1.2 pyhd8ed1ab_0 conda-forge pixman 0.40.0 h36c2ea0_0 conda-forge pkg-config 0.29.2 h36c2ea0_1008 conda-forge pkginfo 1.7.0 pyhd8ed1ab_0 
conda-forge pluggy 0.13.1 py38h578d9bd_4 conda-forge pooch 1.4.0 pyhd8ed1ab_0 conda-forge poppler 21.03.0 h93df280_0 conda-forge poppler-data 0.4.10 0 conda-forge postgresql 13.3 h2510834_0 conda-forge proj 8.0.0 h277dcde_0 conda-forge prometheus_client 0.11.0 pyhd8ed1ab_0 conda-forge prompt-toolkit 3.0.19 pyha770c72_0 conda-forge prompt_toolkit 3.0.19 hd8ed1ab_0 conda-forge protobuf 3.16.0 py38h709712a_0 conda-forge psutil 5.8.0 py38h497a2fe_1 conda-forge pthread-stubs 0.4 h36c2ea0_1001 conda-forge ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge py 1.10.0 pyhd3deb0d_0 conda-forge py-cpuinfo 8.0.0 pyhd8ed1ab_0 conda-forge py-lief 0.11.5 py38h709712a_0 conda-forge pyarrow 1.0.1 py38hb53058b_40_cuda conda-forge pyasn1 0.4.8 py_0 conda-forge pyasn1-modules 0.2.7 py_0 conda-forge pycodestyle 2.6.0 pyh9f0ad1d_0 conda-forge pycosat 0.6.3 py38h497a2fe_1006 conda-forge pycparser 2.20 pyh9f0ad1d_2 conda-forge pyct 0.4.6 py_0 conda-forge pyct-core 0.4.6 py_0 conda-forge pydeck 0.5.0 pyh9f0ad1d_0 conda-forge pydocstyle 6.1.1 pyhd8ed1ab_0 conda-forge pyee 7.0.4 pyh9f0ad1d_0 conda-forge pyflakes 2.2.0 pyh9f0ad1d_0 conda-forge pygments 2.9.0 pyhd8ed1ab_0 conda-forge pyjwt 2.1.0 pyhd8ed1ab_0 conda-forge pynndescent 0.5.2 pyh44b312d_0 conda-forge pynvml 11.0.0 pyhd8ed1ab_0 conda-forge pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge pyorc 0.4.0 py38hec75c54_1 conda-forge pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge pyppeteer 0.2.2 py_1 conda-forge pyproj 3.0.1 py38hdf83a33_1 conda-forge pyrsistent 0.17.3 py38h497a2fe_2 conda-forge pysocks 1.7.1 py38h578d9bd_3 conda-forge pytest 6.2.4 py38h578d9bd_0 conda-forge pytest-asyncio 0.12.0 py38h32f6830_2 conda-forge pytest-benchmark 3.4.1 pyhd8ed1ab_0 conda-forge pytest-cov 2.12.1 pyhd8ed1ab_0 conda-forge pytest-forked 1.3.0 pyhd3deb0d_0 conda-forge pytest-timeout 1.4.2 pyh9f0ad1d_0 conda-forge pytest-xdist 2.3.0 pyhd8ed1ab_0 conda-forge python 3.8.10 h49503c6_1_cpython conda-forge python-confluent-kafka 1.5.0 py38h1e0a361_0 conda-forge python-dateutil 
2.8.1 py_0 conda-forge python-jose 3.3.0 pyhd8ed1ab_0 conda-forge python-libarchive-c 3.1 py38h578d9bd_0 conda-forge python-louvain 0.15 pyhd3deb0d_0 conda-forge python_abi 3.8 2_cp38 conda-forge pytz 2021.1 pyhd8ed1ab_0 conda-forge pyu2f 0.1.5 pyhd8ed1ab_0 conda-forge pyviz_comms 2.0.2 pyhd8ed1ab_0 conda-forge pywavelets 1.1.1 py38h5c078b8_3 conda-forge pyyaml 5.4.1 py38h497a2fe_0 conda-forge pyzmq 22.1.0 py38h2035c66_0 conda-forge rapidjson 1.1.0 he1b5a44_1002 conda-forge rapids-build-env 21.06.00 cuda11.2_py38_ge3c8282_427 rapidsai rapids-doc-env 21.06.00 py38_ge3c8282_427 rapidsai rapids-notebook-env 21.06.00 cuda11.2_py38_ge3c8282_427 rapidsai re2 2021.04.01 h9c3ff4c_0 conda-forge readline 8.1 h46c0cb4_0 conda-forge readme_renderer 27.0 pyh9f0ad1d_0 conda-forge recommonmark 0.7.1 pyhd8ed1ab_0 conda-forge regex 2021.4.4 py38h497a2fe_0 conda-forge requests 2.25.1 pyhd3deb0d_0 conda-forge requests-oauthlib 1.3.0 pyh9f0ad1d_0 conda-forge requests-toolbelt 0.9.1 py_0 conda-forge responses 0.13.3 pyhd8ed1ab_0 conda-forge rfc3986 1.5.0 pyhd8ed1ab_0 conda-forge rhash 1.4.1 h7f98852_0 conda-forge ripgrep 13.0.0 habb4d0f_0 conda-forge rmm 21.06.00 cuda_11.2_py38_gee432a0_0 rapidsai rsa 4.7.2 pyh44b312d_0 conda-forge rtree 0.9.7 py38h02d302b_1 conda-forge ruamel_yaml 0.15.80 py38h497a2fe_1004 conda-forge s2n 1.0.10 h9b69904_0 conda-forge s3fs 2021.6.0 pyhd8ed1ab_0 conda-forge s3transfer 0.3.7 pyhd8ed1ab_0 conda-forge sacremoses 0.0.43 pyh9f0ad1d_0 conda-forge scikit-image 0.18.1 py38h51da96c_0 conda-forge scikit-learn 0.23.1 py38h3a94b23_0 conda-forge scipy 1.6.0 py38hb2138dd_0 conda-forge seaborn 0.11.1 hd8ed1ab_1 conda-forge seaborn-base 0.11.1 pyhd8ed1ab_1 conda-forge secretstorage 3.3.1 py38h578d9bd_0 conda-forge send2trash 1.7.1 pyhd8ed1ab_0 conda-forge setuptools 49.6.0 py38h578d9bd_3 conda-forge shapely 1.7.1 py38haeee4fe_5 conda-forge shellcheck 0.7.2 ha770c72_1 conda-forge simpervisor 0.4 pyhd8ed1ab_0 conda-forge six 1.16.0 pyh6c4a22f_0 conda-forge snappy 1.1.8 
he1b5a44_3 conda-forge sniffio 1.2.0 py38h578d9bd_1 conda-forge snowballstemmer 2.1.0 pyhd8ed1ab_0 conda-forge sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge soupsieve 2.0.1 py_1 conda-forge spdlog 1.8.5 h4bd325d_0 conda-forge sphinx 4.0.2 pyh6c4a22f_1 conda-forge sphinx-click 3.0.1 pyhd8ed1ab_0 conda-forge sphinx-copybutton 0.3.3 pyhd8ed1ab_0 conda-forge sphinx-markdown-tables 0.0.15 pyhd3deb0d_0 conda-forge sphinx_rtd_theme 0.5.2 pyhd8ed1ab_0 conda-forge sphinxcontrib-applehelp 1.0.2 py_0 conda-forge sphinxcontrib-devhelp 1.0.2 py_0 conda-forge sphinxcontrib-htmlhelp 2.0.0 pyhd8ed1ab_0 conda-forge sphinxcontrib-jsmath 1.0.1 py_0 conda-forge sphinxcontrib-qthelp 1.0.3 py_0 conda-forge sphinxcontrib-serializinghtml 1.1.5 pyhd8ed1ab_0 conda-forge sphinxcontrib-websupport 1.2.4 pyh9f0ad1d_0 conda-forge sqlite 3.36.0 h9cd32fc_0 conda-forge sshpubkeys 3.1.0 py_0 conda-forge statsmodels 0.12.2 py38h5c078b8_0 conda-forge streamz 0.6.2 pyh44b312d_0 conda-forge tbb 2020.2 h4bd325d_4 conda-forge tblib 1.7.0 pyhd8ed1ab_0 conda-forge terminado 0.10.1 py38h578d9bd_0 conda-forge testpath 0.5.0 pyhd8ed1ab_0 conda-forge threadpoolctl 2.1.0 pyh5ca1d4c_0 conda-forge tifffile 2021.4.8 pyhd8ed1ab_0 conda-forge tiledb 2.2.9 h91fcb0e_0 conda-forge tk 8.6.10 h21135ba_1 conda-forge tokenizers 0.10.1 py38hb63a372_0 conda-forge toml 0.10.2 pyhd8ed1ab_0 conda-forge toolz 0.11.1 py_0 conda-forge tornado 6.1 py38h497a2fe_1 conda-forge tqdm 4.61.1 pyhd8ed1ab_0 conda-forge traitlets 5.0.5 py_0 conda-forge transformers 4.6.1 pyhd8ed1ab_0 conda-forge treelite 1.3.0 py38hd08a91b_0 conda-forge treelite-runtime 1.3.0 pypi_0 pypi twine 3.4.1 pyhd8ed1ab_0 conda-forge typed-ast 1.4.3 py38h497a2fe_0 conda-forge typing-extensions 3.10.0.0 hd8ed1ab_0 conda-forge typing_extensions 3.10.0.0 pyha770c72_0 conda-forge tzcode 2021a h7f98852_1 conda-forge tzdata 2021a he74cb21_0 conda-forge ucx 1.9.0+gcd9efd3 cuda11.2_0 rapidsai ucx-proc 1.0.0 gpu rapidsai ucx-py 0.20.0 py38_gcd9efd3_0 rapidsai umap-learn 0.5.1 
py38h578d9bd_1 conda-forge urllib3 1.26.5 pyhd8ed1ab_0 conda-forge wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge webencodings 0.5.1 py_1 conda-forge websocket-client 0.57.0 py38h578d9bd_4 conda-forge websockets 8.1 py38h497a2fe_3 conda-forge werkzeug 2.0.1 pyhd8ed1ab_0 conda-forge wheel 0.36.2 pyhd3deb0d_0 conda-forge widgetsnbextension 3.5.1 py38h578d9bd_4 conda-forge wrapt 1.12.1 py38h497a2fe_3 conda-forge xarray 0.18.2 pyhd8ed1ab_0 conda-forge xerces-c 3.2.3 h9d8b166_2 conda-forge xmltodict 0.12.0 py_0 conda-forge xorg-kbproto 1.0.7 h7f98852_1002 conda-forge xorg-libice 1.0.10 h7f98852_0 conda-forge xorg-libsm 1.2.3 hd9c2040_1000 conda-forge xorg-libx11 1.7.2 h7f98852_0 conda-forge xorg-libxau 1.0.9 h7f98852_0 conda-forge xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge xorg-libxext 1.3.4 h7f98852_1 conda-forge xorg-libxrender 0.9.10 h7f98852_1003 conda-forge xorg-renderproto 0.11.1 h7f98852_1002 conda-forge xorg-xextproto 7.3.0 h7f98852_1002 conda-forge xorg-xproto 7.0.31 h7f98852_1007 conda-forge xz 5.2.5 h516909a_1 conda-forge yaml 0.2.5 h516909a_0 conda-forge yarl 1.6.3 py38h497a2fe_1 conda-forge zeromq 4.3.4 h9c3ff4c_0 conda-forge zfp 0.5.5 h9c3ff4c_5 conda-forge zict 2.0.0 py_0 conda-forge zipp 3.4.1 pyhd8ed1ab_0 conda-forge zlib 1.2.11 h516909a_1010 conda-forge zstd 1.4.9 ha95c52a_0 conda-forge
```
vinaydes commented 3 years ago

It looks like something in your conda environment setup is messed up on your machine. You seem to have a directory from the rapids environment in your include path, -I/home/oleg/anaconda3/envs/rapids/include, even though you are actually using the cuml_dev environment. Note that the header causing the problem comes from the rapids environment: /home/oleg/anaconda3/envs/rapids/include/thrust/system/cuda/detail/execution_policy.h:33. Does your PATH variable have some directories from the rapids environment hard-coded? Maybe you can try deleting the rapids environment and doing a fresh build.
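One way to check for this kind of leak is to scan the search-path variables for entries that point into a conda env other than the active one. The snippet below is only a rough sketch: the helper name `find_foreign_env_entries` and the list of variables in `SEARCH_PATH_VARS` are assumptions made for illustration, and it assumes the active env lives under an `envs/` directory (as in the paths above) and that `CONDA_PREFIX` is set by `conda activate`.

```python
import os
from pathlib import PurePosixPath

# Variables that commonly feed compiler/linker search paths.
# This list is an assumption; extend it to match your own build setup.
SEARCH_PATH_VARS = (
    "PATH", "CPATH", "C_INCLUDE_PATH", "CPLUS_INCLUDE_PATH",
    "LIBRARY_PATH", "LD_LIBRARY_PATH", "CMAKE_PREFIX_PATH",
)


def find_foreign_env_entries(environ, active_prefix, sep=":"):
    """Return {variable: [entries]} for entries located under a conda env
    other than the one rooted at active_prefix."""
    active = PurePosixPath(active_prefix)
    envs_root = active.parent  # e.g. /home/oleg/anaconda3/envs
    if envs_root.name != "envs":
        # The base env (or an unusual layout) is active; nothing to compare.
        return {}
    foreign = {}
    for var in SEARCH_PATH_VARS:
        hits = []
        for entry in environ.get(var, "").split(sep):
            if not entry:
                continue
            path = PurePosixPath(entry)
            inside_envs = envs_root in path.parents
            inside_active = path == active or active in path.parents
            if inside_envs and not inside_active:
                hits.append(entry)
        if hits:
            foreign[var] = hits
    return foreign


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        leaks = find_foreign_env_entries(os.environ, prefix)
        for var, entries in leaks.items():
            for entry in entries:
                print(f"{var}: {entry}")
```

Running it inside the activated cuml_dev environment should print any entry that still points at another env such as rapids. It only inspects environment variables, so a path baked into a stale CMake cache would not show up here.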

Oleg-dM commented 3 years ago

I managed to compile cuML 20.06 on a fresh Ubuntu install and tested the new backend (split_algo=1) on 2 GTX 1050 -> works like a charm! (After I deleted the "rapids" env, the compile ran but still failed later on during compilation.)

Thanks a lot @vinaydes and @hcho3 for following up on this, really appreciated.