apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[Python] test_cuda_numba_interop fails locally #40677

Open pitrou opened 3 months ago

pitrou commented 3 months ago

Describe the bug, including details regarding any error messages, version, and platform.

I get some failures locally when running test_cuda_numba_interop. This is probably because I have some rather old CUDA hardware, but I wonder why some of those tests (and not all of them) require 8.4 while my device has compute capability 8.0. This used to work some (long?) time ago.

Example traceback:

_________________________________________________________________ test_numba_memalloc[uint8-pyarrow.cuda] _________________________________________________________________
Traceback (most recent call last):
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 2824, in add_ptx
    driver.cuLinkAddData(self.handle, enums.CU_JIT_INPUT_PTX,
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 327, in safe_cuda_api_call
    self._check_ctypes_error(fname, retcode)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 395, in _check_ctypes_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [222] Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  [...]
  File "/home/antoine/arrow/dev/python/pyarrow/tests/test_cuda_numba_interop.py", line 169, in test_numba_memalloc
    darr[:5] = 99
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/devices.py", line 232, in _require_cuda_context
    return fn(*args, **kws)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/devicearray.py", line 667, in __setitem__
    return self._do_setitem(key, value)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/devicearray.py", line 726, in _do_setitem
    _assign_kernel(lhs.ndim).forall(n_elements, stream=stream)(lhs, rhs)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/dispatcher.py", line 486, in __call__
    specialized = self.dispatcher.specialize(*args)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/dispatcher.py", line 715, in specialize
    specialization.compile(argtypes)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/dispatcher.py", line 926, in compile
    kernel.bind()
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/dispatcher.py", line 197, in bind
    self._codelibrary.get_cufunc()
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/codegen.py", line 195, in get_cufunc
    cubin = self.get_cubin(cc=device.compute_capability)
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/codegen.py", line 170, in get_cubin
    linker.add_ptx(ptx.encode())
  File "/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 2827, in add_ptx
    raise LinkerError("%s\n%s" % (e, self.error_log))
numba.cuda.cudadrv.driver.LinkerError: [222] Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION
ptxas application ptx input, line 9; fatal   : Unsupported .version 8.4; current version is '8.0'
---------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------
ERROR    numba.cuda.cudadrv.driver:driver.py:392 Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION

I have a Pascal GPU (GeForce GT 1030).

$ nvidia-smi 
Tue Mar 19 17:19:19 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0  On |                  N/A |
| 35%   36C    P0    N/A /  19W |    579MiB /  2048MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

Component(s)

GPU, Python

pitrou commented 3 months ago

@kkraus14 Any ideas here?

kkraus14 commented 3 months ago

The PTX ISA version is 8.4 but your driver only supports 8.0. I'm not sure why Numba would be generating a newer PTX version.

@gmarkall any chance you could help here?
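
For anyone wanting to reproduce the mismatch without a GPU, here is a minimal sketch of the check that ptxas is effectively making: read the `.version` directive from the PTX header and compare it with the highest PTX ISA version the driver accepts. The helper names (`ptx_isa_version`, `driver_can_load`) and the sample PTX header are hypothetical, written only for illustration; they are not part of Numba or PyArrow. The `(8, 0)` limit is taken from the "current version is '8.0'" message in the traceback.

```python
import re

# Matches the PTX module header directive, e.g. ".version 8.4".
_PTX_VERSION_RE = re.compile(r"^\s*\.version\s+(\d+)\.(\d+)", re.MULTILINE)


def ptx_isa_version(ptx_text):
    """Return the (major, minor) PTX ISA version declared in ptx_text."""
    match = _PTX_VERSION_RE.search(ptx_text)
    if match is None:
        raise ValueError("no .version directive found in PTX")
    return int(match.group(1)), int(match.group(2))


def driver_can_load(ptx_text, driver_max_isa):
    """True if the PTX declares an ISA version the driver can accept.

    driver_max_isa is a (major, minor) tuple supplied by the caller;
    this sketch does not query the driver itself.
    """
    return ptx_isa_version(ptx_text) <= driver_max_isa


# Hypothetical PTX header, made up to mirror the versions in the error above.
sample_ptx = """\
//
// Illustrative PTX header only
//
.version 8.4
.target sm_61
.address_size 64
"""

print(ptx_isa_version(sample_ptx))
print(driver_can_load(sample_ptx, (8, 0)))
```

With the values from this report, the declared version (8.4) is newer than what the driver accepts (8.0), which matches the CUDA_ERROR_UNSUPPORTED_PTX_VERSION raised by cuLinkAddData. PTX ISA 8.0 corresponds to CUDA 12.0, the version nvidia-smi reports for this driver, while PTX ISA 8.4 corresponds to CUDA 12.4, so a newer toolkit in the environment would be one plausible source of the 8.4 header.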