scikit-hep / awkward

Manipulate JSON-like data with NumPy-like idioms.
BSD 3-Clause "New" or "Revised" License

GPU: tests fail with `TypeError: can_cast()` #3205

Open ianna opened 1 month ago

ianna commented 1 month ago

Version of Awkward Array


Description and code to reproduce

25 tests in tests-cuda and 131 tests in tests-cuda-kernels-explicit fail, apparently in the min/max reducers, with `TypeError: can_cast()`.

awkward-cpp 37
cupy 13.2.0
cuda-version 12.6
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/awkward/ in __call__
    self._impl(grid, blocks, args)
        ak_cuda    = <module 'awkward._connect.cuda' from '/home/yana/Projects/PR3205/awkward/src/awkward/_connect/cuda/'>
        args       = (array([123, 123, 123, 123], dtype=uint8), array([1, 3, 5, 4, 2, 2, 3, 1, 5], dtype=uint8), array([0, 0, 0, 0, 0, 2, 2, 2, 3]), 9, 4, 4, ...)
        blocks     = (9, 1, 1)
        cupy       = <module 'cupy' from '/home/yana/miniconda3/envs/awkward-cuda/lib/python3.12/site-packages/cupy/'>
        cupy_stream_ptr = 0
        grid       = (1, 1, 1)
        maxlength  = 9
        self       = <CupyKernel awkward_reduce_min, uint8, uint8, int64>
src/awkward/_connect/cuda/ in f
    temp = cupy.full(lenparents, identity, dtype=toptr.dtype)
        args       = (array([123, 123, 123, 123], dtype=uint8), array([1, 3, 5, 4, 2, 2, 3, 1, 5], dtype=uint8), array([0, 0, 0, 0, 0, 2, 2, 2, 3]), 9, 4, 4, ...)
        block      = (9, 1, 1)
        cuda_kernel_templates = <cupy._core.raw.RawModule object at 0x7f233daaeda0>
        err_code   = array(18446744073709551615, dtype=uint64)
        fromptr    = array([1, 3, 5, 4, 2, 2, 3, 1, 5], dtype=uint8)
        grid       = (1, 1, 1)
        grid_size  = 1
        identity   = 4
        invocation_index = 411
        lenparents = 9
        outlength  = 4
        parents    = array([0, 0, 0, 0, 0, 2, 2, 2, 3])
        toptr      = array([123, 123, 123, 123], dtype=uint8)
../../../miniconda3/envs/awkward-cuda/lib/python3.12/site-packages/cupy/_creation/ in full
    cupy.copyto(a, fill_value, casting='unsafe')
        a          = array([123,   0,   0,   0,   0,   0,   0,   0, 123], dtype=uint8)
        dtype      = dtype('uint8')
        fill_value = 4
        order      = 'C'
        shape      = 9
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

dst = array([123,   0,   0,   0,   0,   0,   0,   0, 123], dtype=uint8), src = 4, casting = 'unsafe', where = None

    def copyto(dst, src, casting='same_kind', where=None):
        """Copies values from one array to another with broadcasting.

        This function can be called for arrays on different devices. In this case,
        casting, ``where``, and broadcasting is not supported, and an exception is
        raised if these are used.

        Args:
            dst (cupy.ndarray): Target array.
            src (cupy.ndarray): Source array.
            casting (str): Casting rule. See :func:`numpy.can_cast` for detail.
            where (cupy.ndarray of bool): If specified, this array acts as a mask,
                and an element is copied only if the corresponding element of
                ``where`` is True.

        .. seealso:: :func:`numpy.copyto`

        """
        src_is_numpy_scalar = False

        src_type = type(src)
        src_is_python_scalar = src_type in (
            int, bool, float, complex,
            fusion._FusionVarScalar, _fusion_interface._ScalarProxy)
        if src_is_python_scalar:
            src_dtype = numpy.dtype(type(src))
>           can_cast = numpy.can_cast(src, dst.dtype, casting)
E           TypeError: can_cast() does not support Python ints, floats, and complex because the result used to depend on the value.
E           This change was part of adopting NEP 50, we may explicitly allow them again in the future.

casting    = 'unsafe'
dst        = array([123,   0,   0,   0,   0,   0,   0,   0, 123], dtype=uint8)
src        = 4
src_dtype  = dtype('int64')
src_is_numpy_scalar = False
src_is_python_scalar = True
src_type   = <class 'int'>
where      = None

../../../miniconda3/envs/awkward-cuda/lib/python3.12/site-packages/cupy/_manipulation/ TypeError
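
The last frame of the traceback can be reduced to NumPy alone, without a GPU. The sketch below assumes that the installed NumPy major version decides whether `numpy.can_cast` accepts a Python `int`, as the NEP 50 note in the error message suggests. The helper name `try_can_cast` is hypothetical and is not part of NumPy, CuPy, or Awkward.

```python
import numpy as np


def try_can_cast(value, dtype, casting="unsafe"):
    """Call numpy.can_cast and return its result, or the TypeError text if NumPy refuses.

    Hypothetical helper for illustration only.
    """
    try:
        return np.can_cast(value, dtype, casting)
    except TypeError as err:
        return f"TypeError: {err}"


numpy_major = int(np.__version__.split(".")[0])

# Same arguments as in the traceback: src = 4 (Python int), dst.dtype = uint8.
python_int_result = try_can_cast(4, np.uint8)

# A NumPy scalar carries its own dtype, so the question no longer depends on the value.
numpy_scalar_result = try_can_cast(np.uint8(4), np.uint8)

print("numpy", np.__version__)
print("Python int  :", python_int_result)
print("NumPy scalar:", numpy_scalar_result)
```

With NumPy 2.x the Python `int` line reports the `TypeError`, while with NumPy 1.x it prints `True`; the NumPy scalar line prints `True` in both cases. The NumPy scalar call is shown only to illustrate the difference and is not the fix discussed in this thread.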

jpivarski commented 1 month ago

Oddly, I can't reproduce these failures with my GPU. My package versions are

# Name                    Version                   Build  Channel
awkward-cpp               37                       pypi_0    pypi
cupy                      13.2.0          py311he5a987b_1    conda-forge
cupy-core                 13.2.0          py311h3bdf873_1    conda-forge
cuda-version              12.4                 h3060b56_3    conda-forge

CUDA driver Version: 550.67 NVIDIA GeForce RTX 3060

I ran all of the CUDA-related tests and observed no errors.

Maybe it's a difference between our GPUs, but also be sure to do a clean installation of Awkward,

pip uninstall awkward awkward-cpp

followed by the usual sequence of nox, `pip install ./awkward-cpp`, and `pip install -e .`, in case the discrepancy comes from a stale file.
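
When two machines disagree like this, it can help to print the same short list of package versions on both. The following is a minimal sketch using only the standard library; the choice of packages and the function name `report_versions` are assumptions made for illustration, and the distribution name for CuPy may differ for pip wheels (for example `cupy-cuda12x`).

```python
from importlib import metadata

# Packages that plausibly matter for this failure (an assumption, not a list from the thread).
PACKAGES = ("awkward", "awkward-cpp", "cupy", "numpy")


def report_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if it is absent."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, version in report_versions().items():
    print(f"{name:<12} {version or 'not installed'}")
```

Comparing the output from both environments shows at a glance whether they differ in anything other than the GPU.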

ianna commented 1 month ago

awkward$ conda list
# packages in environment at /home/yana/miniconda3/envs/awkward-cuda:
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
argcomplete               3.4.0              pyhd8ed1ab_0    conda-forge
awkward                   2.6.7                    pypi_0    pypi
awkward-cpp               37                       pypi_0    pypi
bzip2                     1.0.8                h5eee18b_6
ca-certificates           2024.7.4             hbcca054_0    conda-forge
cachetools                5.4.0              pyhd8ed1ab_0    conda-forge
chardet                   5.2.0           py312h7900ff3_1    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
colorlog                  6.8.2           py312h7900ff3_0    conda-forge
cuda-nvrtc                12.6.20              he02047a_0    conda-forge
cuda-version              12.6                 h7480c83_3    conda-forge
cupy                      13.2.0          py312had87585_1    conda-forge
cupy-core                 13.2.0          py312hd074ebb_1    conda-forge
distlib                   0.3.8              pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h6a678d5_0
fastrlock                 0.8.2           py312h30efb56_2    conda-forge
filelock                  3.15.4             pyhd8ed1ab_0    conda-forge
fsspec                    2024.6.1                 pypi_0    pypi
iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
ld_impl_linux-64          2.38                 h1181459_1
libblas                   3.9.0           23_linux64_openblas    conda-forge
libcblas                  3.9.0           23_linux64_openblas    conda-forge
libcublas                   he02047a_0    conda-forge
libcufft                    he02047a_0    conda-forge
libcurand                   he02047a_0    conda-forge
libcusolver                 he02047a_0    conda-forge
libcusparse                 he02047a_0    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.4                h6a678d5_1
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libgomp                   14.1.0               h77fa898_0    conda-forge
liblapack                 3.9.0           23_linux64_openblas    conda-forge
libllvm14                 14.0.6               hcd5def8_4    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libnvjitlink              12.6.20              he02047a_0    conda-forge
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libsqlite                 3.45.2               h2797004_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
llvmlite                  0.43.0          py312h9c5d478_0    conda-forge
markupsafe                2.1.5           py312h98912ed_0    conda-forge
ncurses                   6.4                  h6a678d5_0
nox                       2024.4.15          pyhff2d567_0    conda-forge
numba                     0.60.0          py312h83e6fd3_0    conda-forge
numba-cuda                0.0.13                     py_0    nvidia
numpy                     2.0.1           py312h1103770_0    conda-forge
openssl                   3.3.1                h4bc722e_2    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pip                       24.0            py312h06a4308_0
platformdirs              4.2.2              pyhd8ed1ab_0    conda-forge
pluggy                    1.5.0              pyhd8ed1ab_0    conda-forge
pyproject-api             1.7.1              pyhd8ed1ab_0    conda-forge
pytest                    8.3.2              pyhd8ed1ab_0    conda-forge
python                    3.12.2          hab00c5b_0_cpython    conda-forge
python_abi                3.12                    4_cp312    conda-forge
readline                  8.2                  h5eee18b_0
setuptools                72.1.0          py312h06a4308_0
sqlite                    3.45.2               h2c6b66d_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tox                       4.17.0             pyhd8ed1ab_0    conda-forge
tzdata                    2024a                h04d1e81_0
virtualenv                20.26.3            pyhd8ed1ab_0    conda-forge
wheel                     0.43.0          py312h06a4308_0
xz                        5.4.6                h5eee18b_1
zlib                      1.3.1                h4ab18f5_1    conda-forge

ianna commented 1 month ago

Thu Aug  8 13:07:36 2024
| NVIDIA-SMI 560.27                 Driver Version: 560.70         CUDA Version: 12.6     |
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3080        On  |   00000000:01:00.0  On |                  N/A |
|  0%   36C    P8             23W /  320W |     406MiB /  10240MiB |      0%      Default |
|                                         |                        |                  N/A |

| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|    0   N/A  N/A        26      G   /Xwayland                                   N/A      |

ianna commented 1 month ago

/awkward$ python
Python 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:50:58) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

ianna commented 1 month ago

@jpivarski - it looks like we need a newer cupy version for that. The bug has been reported in

jpivarski commented 1 month ago

We can increase our lower bound on CuPy to whatever is needed. (CuPy is an optional dependency.)
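
A lower bound on an optional dependency can also be checked at import time. The sketch below is one possible shape for such a check, written with the standard library only. The function names are hypothetical, and the minimum version used in the example is a placeholder, since the thread does not say which CuPy release is required.

```python
def parse_release(version):
    """Turn a version string such as '13.2.0' into a tuple of ints, e.g. (13, 2, 0).

    Parsing stops at the first component that does not start with a digit,
    and trailing suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`, comparing numeric components."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '13.3' and '13.3.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Placeholder lower bound for illustration; NOT a value taken from this thread.
PLACEHOLDER_MINIMUM = "13.99.0"

print(meets_minimum("13.2.0", PLACEHOLDER_MINIMUM))
```

In a real package, a dedicated version parser (for example the one in the `packaging` library) would handle pre-releases and other edge cases more carefully than this hand-rolled comparison.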