When running kneighbors (brute force algo, two_pass_precision=True) against a 2M record dataset (searching all 2M records in the training set), it blows up with:
CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
[2156337 rows x 384 columns]
Traceback (most recent call last):
File "/cuMLKNN.py", line 67, in <module>
distances, indices = nn.kneighbors(sample_df, two_pass_precision=True)
File "/venv/lib/python3.10/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
return func(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/cuml/internals/api_decorators.py", line 393, in dispatch
return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
File "venv/lib/python3.10/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
return func(*args, **kwargs)
File "base.pyx", line 665, in cuml.internals.base.UniversalBase.dispatch_func
File "nearest_neighbors.pyx", line 535, in cuml.neighbors.nearest_neighbors.NearestNeighbors.kneighbors
File "nearest_neighbors.pyx", line 651, in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors_internal
File "venv/lib/python3.10/site-packages/cupy/_sorting/sort.py", line 116, in argsort
return a.argsort(axis=axis)
File "cupy/_core/core.pyx", line 874, in cupy._core.core._ndarray_base.argsort
File "cupy/_core/core.pyx", line 891, in cupy._core.core._ndarray_base.argsort
File "cupy/_core/_routines_sorting.pyx", line 88, in cupy._core._routines_sorting._ndarray_argsort
File "cupy/_core/core.pyx", line 611, in cupy._core.core._ndarray_base.copy
File "cupy/_core/core.pyx", line 570, in cupy._core.core._ndarray_base.astype
File "cupy/_core/core.pyx", line 132, in cupy._core.core.ndarray.__new__
File "cupy/_core/core.pyx", line 220, in cupy._core.core._ndarray_base._init
File "cupy/cuda/memory.pyx", line 740, in cupy.cuda.memory.alloc
File "venv/lib/python3.10/site-packages/rmm/allocators/cupy.py", line 37, in rmm_cupy_allocator
buf = librmm.device_buffer.DeviceBuffer(size=nbytes, stream=stream)
File "device_buffer.pyx", line 85, in rmm._lib.device_buffer.DeviceBuffer.__cinit__
MemoryError: std::bad_alloc: CUDA error at: /__w/rmm/rmm/include/rmm/mr/device/managed_memory_resource.hpp:74: cudaErrorIllegalAddress an illegal memory access was encountered
CUDA call='cudaEventDestroy(event_)' at file=/__w/cuml/cuml/python/_skbuild/linux-x86_64-3.10/cmake-build/_deps/raft-src/cpp/include/raft/core/resource/cuda_event.hpp line=33 failed with an illegal memory access was encountered
Traceback (most recent call last):
File "cupy_backends/cuda/api/driver.pyx", line 217, in cupy_backends.cuda.api.driver.moduleUnload
File "cupy_backends/cuda/api/driver.pyx", line 60, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
Exception ignored in: 'cupy.cuda.function.Module.__dealloc__'
Traceback (most recent call last):
File "cupy_backends/cuda/api/driver.pyx", line 217, in cupy_backends.cuda.api.driver.moduleUnload
File "cupy_backends/cuda/api/driver.pyx", line 60, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
[the same moduleUnload traceback and "Exception ignored in: 'cupy.cuda.function.Module.__dealloc__'" message repeat several more times]
Error in sys.excepthook:
Environment details:
Environment location: Bare-metal
Linux Distro/Architecture: Ubuntu 22.04 amd64
GPU Model/Driver: NVIDIA RTX A5500 Laptop GPU
CUDA: 12.2
Method of cuDF & cuML install: cuda toolkit from repo
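For a sense of scale, the shape printed at the top of the trace (2156337 rows x 384 columns) can be turned into rough byte counts. The sketch below is plain arithmetic only. It assumes float32 data, since the report does not state the dtype, and the helper names are made up for illustration. It does not establish the cause of the failure.

```python
# Back-of-envelope sizes for the shape printed in the trace.
# Assumption: float32 (4-byte) elements; the report does not state the dtype.

N_ROWS = 2_156_337   # from "[2156337 rows x 384 columns]"
N_COLS = 384
ITEMSIZE = 4         # bytes per float32 element (assumed)
INT32_MAX = 2**31 - 1


def dataset_bytes(n_rows: int, n_cols: int, itemsize: int = ITEMSIZE) -> int:
    """Bytes needed to hold the dense n_rows x n_cols matrix."""
    return n_rows * n_cols * itemsize


def pairwise_bytes(n_rows: int, itemsize: int = ITEMSIZE) -> int:
    """Bytes a full n_rows x n_rows distance matrix would need if it were
    materialised all at once (a brute-force search has to tile it instead)."""
    return n_rows * n_rows * itemsize


def gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


data = dataset_bytes(N_ROWS, N_COLS)
full = pairwise_bytes(N_ROWS)

print(f"dataset:        {data:,} bytes ({gib(data):.2f} GiB)")
print(f"full distances: {full:,} bytes ({gib(full):,.0f} GiB)")
print(f"element count fits in int32: {N_ROWS * N_COLS <= INT32_MAX}")
print(f"byte count fits in int32:    {data <= INT32_MAX}")
```

Under these assumptions the input alone is roughly 3.1 GiB, and its size in bytes no longer fits in a signed 32-bit integer. That may be worth keeping in mind when triaging, though nothing in the report confirms an overflow.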
Thanks for opening an issue about this, @phact. Can you provide the whole script you are running, so we have a minimal reproducible example (MRE) to reproduce the issue on our side?
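A reproducer along the following lines might be a starting point. It is a sketch, not the reporter's script: random float32 data stands in for the real 384-column dataset, the managed-memory RMM setup is inferred from the managed_memory_resource frame in the traceback, and n_neighbors, the batch size, and the helper names (batch_ranges, run_repro) are placeholders. Querying in batches is only there to help see whether the failure depends on the number of query rows.

```python
def batch_ranges(n_rows, batch_size):
    """Yield (start, stop) row ranges that cover [0, n_rows) in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)


def run_repro(n_rows=2_156_337, n_cols=384, n_neighbors=10, batch_size=None):
    """Fit brute-force kNN on random data and query it with two_pass_precision.

    Needs a CUDA GPU with cuml, cupy and rmm installed, so the imports are
    kept local to this function. n_neighbors=10 is a placeholder; the report
    does not say which value was used.
    """
    import cupy as cp
    import rmm
    from rmm.allocators.cupy import rmm_cupy_allocator
    from cuml.neighbors import NearestNeighbors

    # Assumed setup: managed memory, with CuPy allocating through RMM.
    rmm.reinitialize(managed_memory=True)
    cp.cuda.set_allocator(rmm_cupy_allocator)

    # Synthetic stand-in for the real dataset (same shape as in the trace).
    X = cp.random.random((n_rows, n_cols), dtype=cp.float32)

    nn = NearestNeighbors(n_neighbors=n_neighbors, algorithm="brute")
    nn.fit(X)

    # batch_size=None queries every row in one call, as in the report.
    step = n_rows if batch_size is None else batch_size
    for start, stop in batch_ranges(n_rows, step):
        distances, indices = nn.kneighbors(X[start:stop], two_pass_precision=True)
        print(f"rows {start}:{stop} ok, indices shape {indices.shape}")
```

On a machine with a GPU, run_repro() mirrors the single-call query from the report, while run_repro(batch_size=100_000) queries the same fitted index in smaller batches.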
Describe the bug
When running kneighbors (brute force algo, two_pass_precision=True) against a 2M record dataset (searching all 2M records in the training set), it blows up with:
CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
I'm using rmm:
Full stack trace:
Environment details:
Environment location: Bare-metal
Linux Distro/Architecture: Ubuntu 22.04 amd64
GPU Model/Driver: NVIDIA RTX A5500 Laptop GPU
CUDA: 12.2
Method of cuDF & cuML install: cuda toolkit from repo