rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[BUG] DBSCAN crashed instantly with data having a large #rows #5393

Open albert-cwkuo opened 1 year ago

albert-cwkuo commented 1 year ago

Describe the bug When DBSCAN.fit_predict is given data x with a large number of rows, it crashes instantly with the following error:

RuntimeError: CUDA error encountered at: file=/project/python/_skbuild/linux-x86_64-3.8/cmake-build/_deps/raft-src/cpp/include/raft/spatial/knn/detail/epsilon_neighborhood.cuh line=200: call='cudaGetLastError()', Reason=cudaErrorInvalidConfiguration:invalid configuration argument
Obtained 64 stack frames
#0 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN4raft9exception18collect_call_stackEv+0x38) [0x7f7dc633dd28]
#1 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN4raft10cuda_errorC1ERKSs+0x38) [0x7f7dc633e518]
#2 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN4raft7spatial3knn6detail21epsUnexpL2SqNeighImplIflLi4EEEvPbPT0_PKT_S9_S5_S5_S5_S7_P11CUstream_st+0x37f) [0x7f7dc640771f]
#3 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(+0x20067d2) [0x7f7dc63da7d2]
#4 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN2ML6Dbscan3runIflLb0EEEmRKN4raft8handle_tEPKT_T0_S9_S9_S9_S6_S9_PS9_SA_iiiPvmP11CUstream_stNS2_8distance12DistanceTypeE+0x662) [0x7f7dc6421412]
#5 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN2ML6Dbscan13dbscanFitImplIflLb0EEEvRKN4raft8handle_tEPT_T0_S8_S6_S8_NS2_8distance12DistanceTypeEPS8_SB_mP11CUstream_sti+0x1336) [0x7f7dc64255e6]
#6 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x2a1a8) [0x7f7ed16b61a8]
#7 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x272) [0x4ed992]
#8 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#9 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#10 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#11 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x25bad) [0x7f7ed16b1bad]
#12 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x27d04) [0x7f7ed16b3d04]
#13 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x272) [0x4ed992]
#14 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#15 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#16 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#17 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x25bad) [0x7f7ed16b1bad]
#18 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x2883f) [0x7f7ed16b483f]
#19 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x272) [0x4ed992]
#20 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#21 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#22 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e9674]
#23 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x4d48) [0x4cce88]
#24 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#25 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#26 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCode+0x1b) [0x56d93b]
#27 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x571390]
#28 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4dbb17]
#29 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x907) [0x4c8a47]
#30 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4f568b]
#31 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1d68) [0x4c9ea8]
#32 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4f568b]
#33 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1d68) [0x4c9ea8]
#34 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4f568b]
#35 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e747d]
#36 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#37 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#38 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x907) [0x4c8a47]
#39 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#40 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#41 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#42 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e9674]
#43 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1725) [0x4c9865]
#44 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#45 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#46 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#47 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#48 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#49 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#50 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#51 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x19c) [0x4db2ac]
#52 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e95e7]
#53 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x5e) [0x4ed77e]
#54 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#55 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#56 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x19c) [0x4db2ac]
#57 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x907) [0x4c8a47]
#58 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#59 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#60 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCode+0x1b) [0x56d93b]
#61 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x58cc71]
#62 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x586a2f]
#63 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x590072]
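
cudaErrorInvalidConfiguration is what CUDA reports when a kernel launch asks for resources the device can never provide, for example too many blocks in one grid dimension. The thread does not confirm the root cause, so the following is only a sketch of how such a limit can be reached at this row count. The grid limits (2^31 - 1 in x, 65535 in y and z) are the documented CUDA limits for current GPUs. The 64-row tile size and the helper names `ceildiv` and `grid_fits` are assumptions for illustration, not values taken from RAFT.

```python
# Documented CUDA grid-size limits (compute capability 3.0 and later).
MAX_GRID_X = 2**31 - 1
MAX_GRID_Y = 65535


def ceildiv(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def grid_fits(m: int, n: int, tile_m: int = 64, tile_n: int = 64) -> bool:
    """Return True if an (m x n) problem tiled into (tile_m x tile_n) blocks
    fits in a 2D grid. The tile sizes are hypothetical, chosen for illustration."""
    grid_x = ceildiv(m, tile_m)
    grid_y = ceildiv(n, tile_n)
    return grid_x <= MAX_GRID_X and grid_y <= MAX_GRID_Y


rows = 5_000_000
# Number of 64-row tiles needed to cover 5M rows.
print(ceildiv(rows, 64))
# All rows on both axes: the y dimension would exceed 65535 under these assumptions.
print(grid_fits(rows, rows))
# A smaller second dimension (for example, a batch of 1M rows) would fit.
print(grid_fits(rows, 1_000_000))
```

Under these assumed tile sizes, any second dimension above 65535 * 64 rows (about 4.19M) would be rejected at launch, which matches a crash that happens immediately instead of after some computation. Whether this is what happens inside epsUnexpL2SqNeighImpl is not established here.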

Steps/Code to reproduce bug Here's the code snippet to reproduce the bug with X having 5M rows

import numpy as np
from cuml import DBSCAN
x = np.random.random((5000000, 768)).astype(np.float32)
dbscan = DBSCAN(min_samples = 1, eps = 5.0)
labels = dbscan.fit_predict(x)
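
For scale, the size of the input in the reproducer can be worked out directly from its shape. This is plain arithmetic on the values in the snippet above, not a statement about why the crash occurs.

```python
# Shape and dtype taken from the reproducer: (5000000, 768) float32.
rows, cols = 5_000_000, 768
bytes_per_float32 = 4

input_bytes = rows * cols * bytes_per_float32
input_gb = input_bytes / 1e9      # decimal gigabytes
input_gib = input_bytes / 2**30   # binary gibibytes

print(f"input: {input_bytes} bytes = {input_gb:.2f} GB = {input_gib:.2f} GiB")
```

The input alone is therefore a little over 15 GB before DBSCAN allocates any working memory.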

Expected behavior fit_predict should return cluster labels. Instead, the last line, labels = dbscan.fit_predict(x), crashes immediately.

Environment details (please complete the following information):

Additional context This issue seems relevant to this solved issue: https://github.com/rapidsai/cuml/issues/1753

RijndertAriese commented 1 month ago

@albert-cwkuo, I'm running into the same error. Did you find a solution in the end?

dantegd commented 1 month ago

@RijndertAriese do you have details of how you ran into this issue?

xMHW commented 4 days ago

@dantegd I've tried with data with 5M rows, and it reproduces the same error. Any updates?

Tested on: CUDA 12.0; Linux: Ubuntu 20.04 amd64; GPU: RTX 3090, driver version 525.60.11; cuML installed with pip.

The traceback is below:

  File "/data/cuml-test/find_optimum_outlier_param.py", line 307, in <module>
    main_run(args.dataset_name, args.collection_text_filepath, args.collection_embedding_filepath, args.remove_outlier, args.normalize, args.target_ratios)
  File "/data/cuml-test/find_optimum_outlier_param.py", line 214, in main_run
    outlier_label = dbscan_outlier(normalized_target_docids_embs, eps=eps, min_samples=5)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/find_optimum_outlier_param.py", line 122, in dbscan_outlier
    outlier_label = dbscan.fit_predict(data, np.int64)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 393, in dispatch
    return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "base.pyx", line 687, in cuml.internals.base.UniversalBase.dispatch_func
  File "dbscan.pyx", line 466, in cuml.cluster.dbscan.DBSCAN.fit_predict
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 393, in dispatch
    return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "base.pyx", line 687, in cuml.internals.base.UniversalBase.dispatch_func
  File "dbscan.pyx", line 442, in cuml.cluster.dbscan.DBSCAN.fit
  File "/data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "dbscan.pyx", line 353, in cuml.cluster.dbscan.DBSCAN._fit
RuntimeError: CUDA error encountered at: file=/__w/cuml/cuml/python/cuml/build/cp312-cp312-linux_x86_64/_deps/raft-src/cpp/include/raft/spatial/knn/detail/epsilon_neighborhood.cuh line=197: call='cudaGetLastError()', Reason=cudaErrorInvalidConfiguration:invalid configuration argument
Obtained 30 stack frames
#1 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/../libcuml++.so: raft::cuda_error::cuda_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) +0x5a [0x7f5d582ef58a]
#2 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/../libcuml++.so: void raft::spatial::knn::detail::epsUnexpL2SqNeighImpl<float, long, 4>(bool*, long*, float const*, float const*, long, long, long, float, CUstream_st*) +0x3f1 [0x7f5d584e0ac1]
#3 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/../libcuml++.so: void ML::Dbscan::VertexDeg::Algo::launcher<float, long>(raft::handle_t const&, ML::Dbscan::VertexDeg::Pack<float, long>, long, long, CUstream_st*, raft::distance::DistanceType) +0x591 [0x7f5d585f3511]
#4 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/../libcuml++.so: void ML::Dbscan::VertexDeg::run<float, long>(raft::handle_t const&, raft::neighbors::ball_cover::BallCoverIndex<long, float, long, long>*, long*, rmm::device_uvector<long>*, long, bool*, long*, float*, float const*, float const*, float, long, long, int, long, long, CUstream_st*, raft::distance::DistanceType) +0x2bd [0x7f5d585f492d]
#5 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/../libcuml++.so(+0xa82b92) [0x7f5d58757b92]
#6 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/../libcuml++.so: void ML::Dbscan::dbscanFitImpl<float, long, false>(raft::handle_t const&, float*, long, long, float, long, raft::distance::DistanceType, long*, long*, float*, unsigned long, ML::Dbscan::EpsNnMethod, CUstream_st*, int) +0x1804 [0x7f5d5875bf24]
#7 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/cluster/dbscan.cpython-312-x86_64-linux-gnu.so(+0x2f5ae) [0x7f5cafcd15ae]
#8 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyEval_EvalFrameDefault +0x919 [0x7f5e54c13999]
#9 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/cluster/dbscan.cpython-312-x86_64-linux-gnu.so(+0x3806e) [0x7f5cafcda06e]
#10 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/base.cpython-312-x86_64-linux-gnu.so(+0x1043e) [0x7f5cc002043e]
#11 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/base.cpython-312-x86_64-linux-gnu.so(+0x217a2) [0x7f5cc00317a2]
#12 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyEval_EvalFrameDefault +0x919 [0x7f5e54c13999]
#13 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0(+0x176df3) [0x7f5e54c7ddf3]
#14 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyEval_EvalFrameDefault +0x919 [0x7f5e54c13999]
#15 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/cluster/dbscan.cpython-312-x86_64-linux-gnu.so(+0x37a32) [0x7f5cafcd9a32]
#16 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/base.cpython-312-x86_64-linux-gnu.so(+0x1043e) [0x7f5cc002043e]
#17 in /data/cuml-test/venv/lib/python3.12/site-packages/cuml/internals/base.cpython-312-x86_64-linux-gnu.so(+0x217a2) [0x7f5cc00317a2]
#18 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyEval_EvalFrameDefault +0x919 [0x7f5e54c13999]
#19 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0(+0x176df3) [0x7f5e54c7ddf3]
#20 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyEval_EvalFrameDefault +0x919 [0x7f5e54c13999]
#21 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: PyEval_EvalCode +0x217 [0x7f5e54d909f7]
#22 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0(+0x2e4bf6) [0x7f5e54debbf6]
#23 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0(+0x2e4d05) [0x7f5e54debd05]
#24 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyRun_SimpleFileObject +0x17b [0x7f5e54deebab]
#25 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: _PyRun_AnyFileObject +0x3f [0x7f5e54def12f]
#26 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0(+0x30ef39) [0x7f5e54e15f39]
#27 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: Py_RunMain +0x2a [0x7f5e54e162ea]
#28 in /root/.pyenv/versions/3.12.4/lib/libpython3.12.so.1.0: Py_BytesMain +0x5e [0x7f5e54e164be]
#29 in /lib/x86_64-linux-gnu/libc.so.6: __libc_start_main +0xf3 [0x7f5e5492a083]
#30 in python: _start +0x2e [0x56349198409e]

I also tested with a conda install on another machine and got the same error:

Traceback (most recent call last):
  File "/data/cuml-test/find_optimum_outlier_param.py", line 307, in <module>
    main_run(args.dataset_name, args.collection_text_filepath, args.collection_embedding_filepath, args.remove_outlier, args.normalize, args.target_ratios)
  File "/data/cuml-test/find_optimum_outlier_param.py", line 214, in main_run
    outlier_label = dbscan_outlier(normalized_target_docids_embs, eps=eps, min_samples=5)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/cuml-test/find_optimum_outlier_param.py", line 122, in dbscan_outlier
    outlier_label = dbscan.fit_predict(data, np.int64)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 393, in dispatch
    return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "base.pyx", line 687, in cuml.internals.base.UniversalBase.dispatch_func
  File "dbscan.pyx", line 466, in cuml.cluster.dbscan.DBSCAN.fit_predict
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 393, in dispatch
    return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "base.pyx", line 687, in cuml.internals.base.UniversalBase.dispatch_func
  File "dbscan.pyx", line 442, in cuml.cluster.dbscan.DBSCAN.fit
  File "/root/miniconda3/envs/rapids-24.10/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "dbscan.pyx", line 353, in cuml.cluster.dbscan.DBSCAN._fit
RuntimeError: CUDA error encountered at: file=/root/miniconda3/envs/rapids-24.10/include/raft/spatial/knn/detail/epsilon_neighborhood.cuh line=197: 

This machine: CUDA 12.2; Linux: Ubuntu 20.04 amd64; GPU: RTX TITAN, driver version 535.183.01; cuML installed with conda.

beckernick commented 4 days ago

Thanks for updating this thread.

As a temporary workaround, would you be able to try HDBSCAN instead of DBSCAN? This PyData conference talk on HDBSCAN makes a compelling case for using it vs. DBSCAN -- and it hopefully shouldn't run into this issue.
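
A minimal sketch of what trying that workaround might look like, assuming cuML's `cuml.cluster.HDBSCAN` estimator with its `min_cluster_size` and `min_samples` parameters. The wrapper name `cluster_with_hdbscan` and the parameter values are placeholders for illustration. HDBSCAN has no direct equivalent of DBSCAN's `eps`, so results will not match a DBSCAN run one-to-one, and nothing in this thread confirms that it avoids the error at 5M rows.

```python
def cluster_with_hdbscan(x, min_cluster_size=5, min_samples=None):
    """Cluster x with cuML's HDBSCAN and return one label per row.

    Rows that HDBSCAN treats as noise are labeled -1.
    """
    # Imported lazily so this module can be loaded on a machine without a GPU.
    from cuml.cluster import HDBSCAN

    model = HDBSCAN(min_cluster_size=min_cluster_size, min_samples=min_samples)
    return model.fit_predict(x)


# Example usage (requires a CUDA GPU with cuML installed):
#   import numpy as np
#   x = np.random.random((100_000, 768)).astype(np.float32)
#   labels = cluster_with_hdbscan(x)
```

Starting with a smaller sample, as in the commented example, makes it easier to check the output before scaling up to the full dataset.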