conda-forge / cupy-feedstock

A conda-smithy repository for cupy.
BSD 3-Clause "New" or "Revised" License

CuPy v9.0.0rc1 #114

Closed leofang closed 3 years ago

leofang commented 3 years ago

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

leofang commented 3 years ago

@kmaehashi @jaimergp Let me know if you have difficulty downloading the artifacts or setting up a test env. The Linux tests fail during artifact upload because the filenames are too long and get truncated, but we don't care about them anyway.

leofang commented 3 years ago

(Sorry, tagged the wrong person 😅)

leofang commented 3 years ago

Step 1: Click the up-down arrows to expand the tab.
Step 2: Click the "published" button.
Step 3: Choose the right artifact (win64, Python version x CUDA version) and click the dots to download it.
Step 4: Conda-install all dependencies first, and then conda-install the downloaded file (a file-based install would not install dependencies).

kmaehashi commented 3 years ago

I'll start testing this now.

leofang commented 3 years ago

Oops didn't realize the screenshots are extended to full size on a desktop browser 😂

jaimergp commented 3 years ago

I am running conda build --test package.tar.bz2. Anything else?

leofang commented 3 years ago

Thanks, @jaimergp! If you have a GPU, just try these simple CuPy operations (which have NumPy equivalents):

import cupy as cp
cp.show_config()
a = cp.random.random(100)
print(a.sum())

If all dlls are correctly found (including nvcuda.dll), these should be trivial ops.
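
The DLL lookup can also be probed without importing CuPy at all. The following is a minimal sketch that assumes only the standard-library ctypes module; the helper name can_load is hypothetical, and this is not how CuPy itself locates its libraries:

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and load the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library (or one of its dependencies) is not found.
        return False
    return True


# nvcuda.dll is the Windows driver library named above; on other
# platforms this is expected to print False.
print("nvcuda.dll loadable:", can_load("nvcuda.dll"))
```

A False result here only says the loader could not resolve the name from this process; it does not say which directory was searched.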

leofang commented 3 years ago

Just to ensure I got it right on the optional dependencies (it's correct on Linux): If you just install CuPy without cudnn or cutensor, this should give you two False:

import cupy.cuda.cudnn
print(cupy.cuda.cudnn.available)
import cupy.cuda.cutensor
print(cupy.cuda.cutensor.available)

However, if the lib is installed (say, conda install -c conda-forge cudnn), it should return True.
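
The same check can be wrapped so that it also works in an environment where CuPy is not importable. This is a rough sketch under that assumption; the helper name optional_lib_available is made up for illustration, and it relies only on the `available` attribute shown in the snippet above:

```python
import importlib


def optional_lib_available(module_name):
    """Return the module's `available` flag, or None if it cannot be read.

    None means either the module could not be imported at all or it has
    no `available` attribute.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "available", None)


for name in ("cupy.cuda.cudnn", "cupy.cuda.cutensor"):
    print(name, "->", optional_lib_available(name))
```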

leofang commented 3 years ago

(btw I think you could also do conda install package.tar.bz2 and its channel would appear as "local". As mentioned, the caveat is that conda would not attempt to resolve its dependencies, so it's installed as if you ran conda install --no-deps package.)

leofang commented 3 years ago

The Linux tests fail during artifact upload because the filenames are too long and get truncated

Reported to upstream: https://github.com/conda-forge/conda-smithy/issues/1466

jaimergp commented 3 years ago

Installed with:

conda create -n cupy ./cupy-9.0.0rc1-py36h944a138_0.tar.bz2
conda activate cupy
conda update --all

conda list:

(cupy) PS C:\Users\Jaime> conda list
# packages in environment at C:\Users\Jaime\Miniconda3\envs\cupy:
#
# Name                    Version                   Build  Channel
certifi                   2020.12.5        py36ha15d459_1    conda-forge
cudatoolkit               11.0.3               h3f58a73_8    conda-forge
cupy                      9.0.0rc1         py36h944a138_0    <unknown>
fastrlock                 0.6              py36he2d232f_0    conda-forge
intel-openmp              2020.3             h57928b3_311    conda-forge
libblas                   3.9.0                     8_mkl    conda-forge
libcblas                  3.9.0                     8_mkl    conda-forge
liblapack                 3.9.0                     8_mkl    conda-forge
mkl                       2020.4             hb70f87d_311    conda-forge
numpy                     1.19.5           py36hd1b969e_1    conda-forge
python                    3.6.13          h39d44d4_0_cpython    conda-forge
python_abi                3.6                     1_cp36m    conda-forge
setuptools                49.6.0           py36ha15d459_3    conda-forge
vc                        14.2                 hb210afc_4    conda-forge
vs2015_runtime            14.28.29325          h5e1d092_4    conda-forge
wincertstore              0.2             py36ha15d459_1006    conda-forge

This is what I get:

Python 3.6.13 (default, Feb 19 2021, 05:17:09) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cupy as cp
>>> cp.show_config()
OS                           : Windows-10-10.0.21337-SP0
CuPy Version                 : 9.0.0rc1
NumPy Version                : 1.19.5
SciPy Version                : None
Cython Build Version         : 0.29.22
Cython Runtime Version       : None
CUDA Root                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0
CUDA Build Version           : 11000
CUDA Driver Version          : 11030
CUDA Runtime Version         : 11000
cuBLAS Version               : 11200
cuFFT Version                : 10201
cuRAND Version               : 10201
cuSOLVER Version             : (10, 6, 0)
cuSPARSE Version             : 11101
NVRTC Version                : (11, 0)
Thrust Version               : 100909
CUB Build Version            : 100909
Jitify Build Version         : <unknown>
cuDNN Build Version          : None
cuDNN Version                : None
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
Device 0 Name                : GeForce RTX 2070
Device 0 Compute Capability  : 75
>>> a = cp.random.random(100)
>>> print(a.sum())
51.6567999004485
>>> import cupy.cuda.cudnn
C:\Users\Jaime\Miniconda3\envs\cupy\lib\site-packages\cupy\_environment.py:376: UserWarning:
cudnn library could not be loaded.

Reason: ImportError (DLL load failed: The specified module could not be found.)

You can install the library by:

  $ conda install -c conda-forge cudnn

  warnings.warn(msg)
>>> print(cupy.cuda.cudnn.available)
False
>>> import cupy.cuda.cutensor
C:\Users\Jaime\Miniconda3\envs\cupy\lib\site-packages\cupy\_environment.py:376: UserWarning:
cutensor library could not be loaded.

Reason: ImportError (DLL load failed: The specified module could not be found.)

You can install the library by:

  $ conda install -c conda-forge cutensor

  warnings.warn(msg)
>>> print(cupy.cuda.cutensor.available)
False

jaimergp commented 3 years ago

After conda install -c conda-forge cudnn cutensor:

conda list:

(cupy) PS C:\Users\Jaime> conda list
# packages in environment at C:\Users\Jaime\Miniconda3\envs\cupy:
#
# Name                    Version                   Build  Channel
certifi                   2020.12.5        py36ha15d459_1    conda-forge
cudatoolkit               11.0.3               h3f58a73_8    conda-forge
cudnn                     8.1.0.77             h3e0f4f4_0    conda-forge
cupy                      9.0.0rc1         py36h944a138_0    <unknown>
cutensor                  1.2.2.5              h5987435_3    conda-forge
fastrlock                 0.6              py36he2d232f_0    conda-forge
intel-openmp              2020.3             h57928b3_311    conda-forge
libblas                   3.9.0                     8_mkl    conda-forge
libcblas                  3.9.0                     8_mkl    conda-forge
liblapack                 3.9.0                     8_mkl    conda-forge
mkl                       2020.4             hb70f87d_311    conda-forge
numpy                     1.19.5           py36hd1b969e_1    conda-forge
python                    3.6.13          h39d44d4_0_cpython    conda-forge
python_abi                3.6                     1_cp36m    conda-forge
setuptools                49.6.0           py36ha15d459_3    conda-forge
vc                        14.2                 hb210afc_4    conda-forge
vs2015_runtime            14.28.29325          h5e1d092_4    conda-forge
wincertstore              0.2             py36ha15d459_1006    conda-forge

Optional deps tests:

(cupy) PS C:\Users\Jaime> python
Python 3.6.13 (default, Feb 19 2021, 05:17:09) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cupy.cuda.cudnn
>>> print(cupy.cuda.cudnn.available)
True
>>> import cupy.cuda.cutensor
>>> print(cupy.cuda.cutensor.available)
True

leofang commented 3 years ago

Awesome Jaime!!! Many thanks! Things work as expected, right @kmaehashi?

leofang commented 3 years ago

I will clean up the recipe and apply the suggestions from the other PR shortly. The artifacts are still accessible at https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=295780&view=results.

leofang commented 3 years ago

@jaimergp @kmaehashi One more thing: It'd be nice if you could also test on Python 3.8+, since the dll loading behavior is different and we internally call os.add_dll_directory().
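
For context on why 3.8+ differs: since Python 3.8 on Windows, PATH is no longer consulted when resolving the dependencies of extension modules, so extra directories have to be registered with os.add_dll_directory(). The sketch below only illustrates that call; the helper name register_dll_directory and the choice of directory are assumptions for the example, not CuPy's actual implementation:

```python
import os
import sys


def register_dll_directory(path):
    """Add `path` to the DLL search path where that is supported.

    Returns the handle from os.add_dll_directory(), or None when the call
    is unavailable (non-Windows or Python < 3.8) or the directory is missing.
    """
    if sys.platform != "win32" or not hasattr(os, "add_dll_directory"):
        return None
    if not os.path.isdir(path):
        return None
    # os.add_dll_directory() requires an absolute path.
    return os.add_dll_directory(os.path.abspath(path))


# Hypothetical usage: the directory below is an assumption for illustration.
prefix = os.environ.get("CONDA_PREFIX", sys.prefix)
handle = register_dll_directory(os.path.join(prefix, "Library", "bin"))
print("registered:", handle is not None)
```

The returned handle has a close() method that removes the directory from the search path again, which is handy when testing.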

jaimergp commented 3 years ago

Sure thing.

Also, I am still running the test suite and I see some failures already at 25% of completion. I will paste the log once it is done.

jaimergp commented 3 years ago

With Python 3.8, looks good:

conda create -n cupy38 .\cupy-9.0.0rc1-py38h6ccc6bc_0.tar.bz2
conda activate cupy38
conda update --all
conda install -c conda-forge cutensor cudnn

conda list:

(cupy38) PS C:\Users\Jaime\Downloads> conda list
# packages in environment at C:\Users\Jaime\Miniconda3\envs\cupy38:
#
# Name                    Version                   Build  Channel
ca-certificates           2020.12.5            h5b45459_0    conda-forge
certifi                   2020.12.5        py38haa244fe_1    conda-forge
cudatoolkit               11.0.3               h3f58a73_8    conda-forge
cudnn                     8.1.0.77             h3e0f4f4_0    conda-forge
cupy                      9.0.0rc1         py38h6ccc6bc_0    <unknown>
cutensor                  1.2.2.5              h5987435_3    conda-forge
fastrlock                 0.6              py38h885f38d_0    conda-forge
intel-openmp              2020.3             h57928b3_311    conda-forge
libblas                   3.9.0                     8_mkl    conda-forge
libcblas                  3.9.0                     8_mkl    conda-forge
liblapack                 3.9.0                     8_mkl    conda-forge
mkl                       2020.4             hb70f87d_311    conda-forge
numpy                     1.20.1           py38h0cc643e_0    conda-forge
openssl                   1.1.1k               h8ffe710_0    conda-forge
python                    3.8.8           h7840368_0_cpython    conda-forge
python_abi                3.8                      1_cp38    conda-forge
setuptools                49.6.0           py38haa244fe_3    conda-forge
sqlite                    3.35.2               h8ffe710_0    conda-forge
vc                        14.2                 hb210afc_4    conda-forge
vs2015_runtime            14.28.29325          h5e1d092_4    conda-forge
wincertstore              0.2             py38haa244fe_1006    conda-forge

Tests:

(cupy38) PS C:\Users\Jaime\Downloads> python
Python 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 15:50:08) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cupy as cp
>>> cp.show_config()
OS                           : Windows-10-10.0.21337-SP0
CuPy Version                 : 9.0.0rc1
NumPy Version                : 1.20.1
SciPy Version                : None
Cython Build Version         : 0.29.22
Cython Runtime Version       : None
CUDA Root                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0
CUDA Build Version           : 11000
CUDA Driver Version          : 11030
CUDA Runtime Version         : 11000
cuBLAS Version               : 11200
cuFFT Version                : 10201
cuRAND Version               : 10201
cuSOLVER Version             : (10, 6, 0)
cuSPARSE Version             : 11101
NVRTC Version                : (11, 0)
Thrust Version               : 100909
CUB Build Version            : 100909
Jitify Build Version         : <unknown>
cuDNN Build Version          : 8100
cuDNN Version                : 8100
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : 10202
Device 0 Name                : GeForce RTX 2070
Device 0 Compute Capability  : 75
>>> a = cp.random.random(100)
>>> print(a.sum())
48.36837744664295
>>> import cupy.cuda.cudnn
>>> print(cupy.cuda.cudnn.available)
True
>>> import cupy.cuda.cutensor
>>> print(cupy.cuda.cutensor.available)
True

kmaehashi commented 3 years ago

Preloading warning is working as expected! Thank you @leofang!

(base) C:\Users\chainer>python
Python 3.8.5 (default, Sep  3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cupy
>>> import cupy.cuda.cutensor
C:\Users\chainer\miniconda3\lib\site-packages\cupy\_environment.py:376: UserWarning:
cutensor library could not be loaded.

Reason: ImportError (DLL load failed while importing cutensor: The specified module could not be found.)

You can install the library by:

  $ conda install -c conda-forge cutensor

  warnings.warn(msg)
>>> import cupy.cuda.cutensor
>>> import cupy.cuda.nccl
>>> import cupy.cuda.cudnn
C:\Users\chainer\miniconda3\lib\site-packages\cupy\_environment.py:376: UserWarning:
cudnn library could not be loaded.

Reason: ImportError (DLL load failed while importing cudnn: The specified module could not be found.)

You can install the library by:

  $ conda install -c conda-forge cudnn

  warnings.warn(msg)

After installing cutensor and cudnn:

>>> import cupy
>>> import cupy.cuda.cutensor
>>> cupy.cuda.cutensor.available
True
>>> import cupy.cuda.cudnn
>>> cupy.cuda.cudnn.available
True

kmaehashi commented 3 years ago

FYI:

All CUDA libraries seem installed under %CONDA_PREFIX%\Library\bin on Windows. Directory tree (under my miniconda3 directory): https://gist.github.com/kmaehashi/47906a8d7fcc2b0f11e97a82486d88ca
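
A quick way to confirm this on another machine is to list the matching files under that directory. This is a small sketch; the helper name find_dlls and the filename patterns are assumptions for illustration and may not match every library version:

```python
import fnmatch
import os
import sys


def find_dlls(prefix, patterns=("cudnn*.dll", "cutensor*.dll")):
    """List files under <prefix>/Library/bin matching any of the patterns."""
    bin_dir = os.path.join(prefix, "Library", "bin")
    if not os.path.isdir(bin_dir):
        return []
    names = sorted(os.listdir(bin_dir))
    # Compare in lowercase so the match is case-insensitive on every OS.
    return [n for n in names
            if any(fnmatch.fnmatch(n.lower(), p) for p in patterns)]


print(find_dlls(os.environ.get("CONDA_PREFIX", sys.prefix)))
```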

PATH and sys.path on Windows miniconda3:

(base) C:\Users\chainer>python
Python 3.8.5 (default, Sep  3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys
>>> print('\n'.join(sys.path))

C:\Users\chainer\miniconda3\python38.zip
C:\Users\chainer\miniconda3\DLLs
C:\Users\chainer\miniconda3\lib
C:\Users\chainer\miniconda3
C:\Users\chainer\miniconda3\lib\site-packages
C:\Users\chainer\miniconda3\lib\site-packages\win32
C:\Users\chainer\miniconda3\lib\site-packages\win32\lib
C:\Users\chainer\miniconda3\lib\site-packages\Pythonwin
>>> print(os.environ['PATH'].replace(';', '\n'))
C:\Users\chainer\miniconda3
C:\Users\chainer\miniconda3\Library\mingw-w64\bin
C:\Users\chainer\miniconda3\Library\usr\bin
C:\Users\chainer\miniconda3\Library\bin
C:\Users\chainer\miniconda3\Scripts
C:\Users\chainer\miniconda3\bin
C:\Users\chainer\miniconda3\condabin
C:\Development\Python\Python37
C:\Development\Python\Python37\Scripts
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0
C:\ProgramData\GooGet
C:\Program Files\Google\Compute Engine\metadata_scripts
C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\bin
C:\Program Files\Google\Compute Engine\sysprep
C:\Program Files\Git\cmd
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\7-Zip
C:\Program Files\curl-7.64.0-win64-mingw\bin
C:\Program Files\NVIDIA Corporation\NVIDIA NGX
C:\Program Files\dotnet
C:\Program Files\NVIDIA Corporation\Nsight Compute 2020.3.0
C:\Users\chainer\AppData\Local\Microsoft\WindowsApps

(note: C:\Development\Python\Python37 and below are settings in my environment, not related to conda)
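
With a PATH this long, it can help to see which entries actually contain a given DLL. The following is only a sketch; dirs_containing is a hypothetical helper, and it inspects PATH only, which is not the full search order the Windows loader uses:

```python
import os


def dirs_containing(filename, search_path=None):
    """Return the PATH entries, in order, that contain `filename`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


# nvcuda.dll is used here only because it is named earlier in the thread.
print(dirs_containing("nvcuda.dll"))
```

Splitting on os.pathsep instead of a literal ';' keeps the helper usable on Linux as well.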

jaimergp commented 3 years ago

Failing tests here. I'll provide the logs now because somebody forgot to pipe things to a file... :)

=========================== short test summary info ===========================
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_0_{order='C', shape=(2,)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_1_{order='C', shape=(2, 3)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_2_{order='C', shape=(2, 3, 4)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_3_{order='C', shape=(2, 3, 4, 5)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_4_{order='F', shape=(2,)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_5_{order='F', shape=(2, 3)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_6_{order='F', shape=(2, 3, 4)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelContiguity_param_7_{order='F', shape=(2, 3, 4, 5)}::test_can_use_cub_contiguous
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelMisc::test_can_use_accelerator_set_unset
FAILED tests/cupy_tests/core_tests/test_cub_reduction.py::TestSimpleCubReductionKernelMisc::test_can_use_cub_oversize_input2
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_8_{backend='block', order='C', shape=(10,)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_8_{backend='block', order='C', shape=(10,)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_9_{backend='block', order='C', shape=(10, 20)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_9_{backend='block', order='C', shape=(10, 20)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_10_{backend='block', order='C', shape=(10, 20, 30)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_10_{backend='block', order='C', shape=(10, 20, 30)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_11_{backend='block', order='C', shape=(10, 20, 30, 40)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_11_{backend='block', order='C', shape=(10, 20, 30, 40)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_12_{backend='block', order='F', shape=(10,)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_12_{backend='block', order='F', shape=(10,)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_13_{backend='block', order='F', shape=(10, 20)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_13_{backend='block', order='F', shape=(10, 20)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_14_{backend='block', order='F', shape=(10, 20, 30)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_14_{backend='block', order='F', shape=(10, 20, 30)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_15_{backend='block', order='F', shape=(10, 20, 30, 40)}::test_cub_max
FAILED tests/cupy_tests/core_tests/test_ndarray_reduction.py::TestCubReduction_param_15_{backend='block', order='F', shape=(10, 20, 30, 40)}::test_cub_min
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_0_{backend='nvrtc', in_memory=False}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_0_{backend='nvrtc', in_memory=False}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_0_{backend='nvrtc', in_memory=False}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_1_{backend='nvrtc', in_memory=True}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_1_{backend='nvrtc', in_memory=True}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_1_{backend='nvrtc', in_memory=True}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_2_{backend='nvrtc', clean_up=True, in_memory=True}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_2_{backend='nvrtc', clean_up=True, in_memory=True}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_2_{backend='nvrtc', clean_up=True, in_memory=True}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_3_{backend='nvrtc', in_memory=False, jitify=True}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_3_{backend='nvrtc', in_memory=False, jitify=True}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_3_{backend='nvrtc', in_memory=False, jitify=True}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_4_{backend='nvrtc', in_memory=True, jitify=True}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_4_{backend='nvrtc', in_memory=True, jitify=True}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_4_{backend='nvrtc', in_memory=True, jitify=True}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_5_{backend='nvrtc', clean_up=True, in_memory=True, jitify=True}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_5_{backend='nvrtc', clean_up=True, in_memory=True, jitify=True}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_5_{backend='nvrtc', clean_up=True, in_memory=True, jitify=True}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_basic
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_compile_kernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_compile_module
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_compiler_flag
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_const_memory
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_dynamical_parallelism
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_dynamical_parallelism_compile_failure
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_get_function_failure
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_invalid_compiler_flag
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_kernel_attributes
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_load_cubin
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_load_ptx
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_module
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRaw_param_6_{backend='nvcc', in_memory=False}::test_raw_pointer
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_0_{block=64, n=10}::test_grid_sync_rawkernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_0_{block=64, n=10}::test_grid_sync_rawmodule
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_1_{block=64, n=100}::test_grid_sync_rawkernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_1_{block=64, n=100}::test_grid_sync_rawmodule
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_2_{block=64, n=1000}::test_grid_sync_rawkernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_2_{block=64, n=1000}::test_grid_sync_rawmodule
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_3_{block=256, n=10}::test_grid_sync_rawkernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_3_{block=256, n=10}::test_grid_sync_rawmodule
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_4_{block=256, n=100}::test_grid_sync_rawkernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_4_{block=256, n=100}::test_grid_sync_rawmodule
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_5_{block=256, n=1000}::test_grid_sync_rawkernel
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawGridSync_param_5_{block=256, n=1000}::test_grid_sync_rawmodule
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawPicklable_param_0_{compile=False, raw='ker'}::test_raw_picklable
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawPicklable_param_1_{compile=False, raw='mod'}::test_raw_picklable
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawPicklable_param_2_{compile=False, raw='mod_ker'}::test_raw_picklable
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawPicklable_param_3_{compile=True, raw='ker'}::test_raw_picklable
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawPicklable_param_4_{compile=True, raw='mod'}::test_raw_picklable
FAILED tests/cupy_tests/core_tests/test_raw.py::TestRawPicklable_param_5_{compile=True, raw='mod_ker'}::test_raw_picklable
FAILED tests/cupy_tests/core_tests/fusion_tests/test_example.py::TestFusionExample::test_batchnorm
FAILED tests/cupy_tests/core_tests/fusion_tests/test_indexing.py::TestIndexingCombination::test_indexing_twice_3
FAILED tests/cupy_tests/core_tests/fusion_tests/test_misc.py::TestFusionTuple::test_various_shape
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionReductionAndElementwise_param_0_{axis=None}::test_postmap_inplace
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionReductionAndElementwise_param_0_{axis=None}::test_postmap_two_arrays
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionReductionAndElementwise_param_0_{axis=None}::test_premap_postmap
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionReductionAndElementwise_param_1_{axis=0}::test_postmap_inplace
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionReductionAndElementwise_param_1_{axis=0}::test_postmap_two_arrays
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionReductionAndElementwise_param_1_{axis=0}::test_premap_postmap
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_0_{axis1=None, axis2=None}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_0_{axis1=None, axis2=None}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_1_{axis1=None, axis2=0}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_1_{axis1=None, axis2=0}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_2_{axis1=None, axis2=1}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_2_{axis1=None, axis2=1}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_3_{axis1=0, axis2=None}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_3_{axis1=0, axis2=None}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_4_{axis1=0, axis2=0}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_4_{axis1=0, axis2=0}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_5_{axis1=0, axis2=1}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_6_{axis1=1, axis2=None}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_6_{axis1=1, axis2=None}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_7_{axis1=1, axis2=0}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_8_{axis1=1, axis2=1}::test_two_distinct_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultipleReductions_param_8_{axis1=1, axis2=1}::test_two_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultistageReductions::test_multistage_reductions
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultistageReductions::test_multistage_reductions_and_elementwise
FAILED tests/cupy_tests/core_tests/fusion_tests/test_reduction.py::TestFusionMultistageReductionsMultiAxis::test_multistage_reductions
FAILED tests/cupy_tests/functional_tests/test_vectorize.py::TestVectorizeExprs::test_vectorize_typecast
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_8_{backend='block', order='C', shape=(10,)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_8_{backend='block', order='C', shape=(10,)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_9_{backend='block', order='C', shape=(10, 20)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_9_{backend='block', order='C', shape=(10, 20)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_10_{backend='block', order='C', shape=(10, 20, 30)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_10_{backend='block', order='C', shape=(10, 20, 30)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_11_{backend='block', order='C', shape=(10, 20, 30, 40)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_11_{backend='block', order='C', shape=(10, 20, 30, 40)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_12_{backend='block', order='F', shape=(10,)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_12_{backend='block', order='F', shape=(10,)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_13_{backend='block', order='F', shape=(10, 20)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_13_{backend='block', order='F', shape=(10, 20)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_14_{backend='block', order='F', shape=(10, 20, 30)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_14_{backend='block', order='F', shape=(10, 20, 30)}::test_cub_sum
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_15_{backend='block', order='F', shape=(10, 20, 30, 40)}::test_cub_prod
FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_15_{backend='block', order='F', shape=(10, 20, 30, 40)}::test_cub_sum
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_16_{backend='block', order_and_axis=('C', -1), shape=(10,)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_16_{backend='block', order_and_axis=('C', -1), shape=(10,)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_17_{backend='block', order_and_axis=('C', -1), shape=(10, 20)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_17_{backend='block', order_and_axis=('C', -1), shape=(10, 20)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_18_{backend='block', order_and_axis=('C', -1), shape=(10, 20, 30)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_18_{backend='block', order_and_axis=('C', -1), shape=(10, 20, 30)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_19_{backend='block', order_and_axis=('C', -1), shape=(10, 20, 30, 40)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_19_{backend='block', order_and_axis=('C', -1), shape=(10, 20, 30, 40)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_20_{backend='block', order_and_axis=('C', None), shape=(10,)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_20_{backend='block', order_and_axis=('C', None), shape=(10,)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_21_{backend='block', order_and_axis=('C', None), shape=(10, 20)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_21_{backend='block', order_and_axis=('C', None), shape=(10, 20)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_22_{backend='block', order_and_axis=('C', None), shape=(10, 20, 30)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_22_{backend='block', order_and_axis=('C', None), shape=(10, 20, 30)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_23_{backend='block', order_and_axis=('C', None), shape=(10, 20, 30, 40)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_23_{backend='block', order_and_axis=('C', None), shape=(10, 20, 30, 40)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_24_{backend='block', order_and_axis=('F', 0), shape=(10,)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_24_{backend='block', order_and_axis=('F', 0), shape=(10,)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_25_{backend='block', order_and_axis=('F', 0), shape=(10, 20)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_25_{backend='block', order_and_axis=('F', 0), shape=(10, 20)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_26_{backend='block', order_and_axis=('F', 0), shape=(10, 20, 30)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_26_{backend='block', order_and_axis=('F', 0), shape=(10, 20, 30)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_27_{backend='block', order_and_axis=('F', 0), shape=(10, 20, 30, 40)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_27_{backend='block', order_and_axis=('F', 0), shape=(10, 20, 30, 40)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_28_{backend='block', order_and_axis=('F', None), shape=(10,)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_28_{backend='block', order_and_axis=('F', None), shape=(10,)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_29_{backend='block', order_and_axis=('F', None), shape=(10, 20)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_29_{backend='block', order_and_axis=('F', None), shape=(10, 20)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_30_{backend='block', order_and_axis=('F', None), shape=(10, 20, 30)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_30_{backend='block', order_and_axis=('F', None), shape=(10, 20, 30)}::test_cub_argmin
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_31_{backend='block', order_and_axis=('F', None), shape=(10, 20, 30, 40)}::test_cub_argmax
FAILED tests/cupy_tests/sorting_tests/test_search.py::TestCubReduction_param_31_{backend='block', order_and_axis=('F', None), shape=(10, 20, 30, 40)}::test_cub_argmin
= 153 failed, 23120 passed, 4860 skipped, 15 xfailed, 1 xpassed, 7732 warnings in 9683.80s (2:41:23) =
leofang commented 3 years ago

Thanks, @jaimergp. Just wanna check: you don't have a local CUDA installation on Windows (outside of Conda), right? Then these failures are expected because most of them rely on nvcc.

EDIT: some look suspicious...

jaimergp commented 3 years ago

I do have a CUDA 11.0 installation :/

leofang commented 3 years ago

> I do have a CUDA 11.0 installation :/

Oh! Does it match with the cudatoolkit version that you installed in this test env?

jaimergp commented 3 years ago

Yup! But maybe it's finding 10.2 first? (I have both)

leofang commented 3 years ago

Could be. Which nvcc is returned if you run where nvcc?
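If it's easier to check from Python, here is a rough equivalent of `where nvcc` that lists every match on `PATH` in lookup order, so a 10.2 install shadowing 11.0 would show up first. The helper name `find_all_on_path` is made up for illustration; it uses only the standard library.

```python
import os
import shutil


def find_all_on_path(name, path=None):
    """Return every match for `name` on PATH, in lookup order (like `where`)."""
    path = os.environ.get("PATH", "") if path is None else path
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        # Restrict the search to one directory at a time to keep the order.
        found = shutil.which(name, path=directory)
        if found and found not in hits:
            hits.append(found)
    return hits


if __name__ == "__main__":
    for hit in find_all_on_path("nvcc"):
        print(hit)
```

The first line printed is the one a plain `nvcc` call would resolve to.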

jaimergp commented 3 years ago

I think it has to do more with my VS compiler. I am rerunning now with a new config!

leofang commented 3 years ago

You can just test the failing files, say pytest tests/cupy_tests/core_tests/test_raw.py (this is the one I'm mostly concerned with).
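To get the list of files to rerun from the summary above, something like this sketch should do. It assumes the short summary lines have the form `FAILED path::Class::test`; `failing_files` is just an illustrative name.

```python
def failing_files(summary_lines):
    """Collect unique test files from pytest 'FAILED path::Class::test' lines."""
    files = []
    for line in summary_lines:
        line = line.strip()
        if not line.startswith("FAILED "):
            continue
        node_id = line[len("FAILED "):]
        # The file path is everything before the first '::' separator.
        path = node_id.split("::", 1)[0]
        if path not in files:
            files.append(path)
    return files


sample = [
    "FAILED tests/cupy_tests/functional_tests/test_vectorize.py::TestVectorizeExprs::test_vectorize_typecast",
    "FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_8_{backend='block', order='C', shape=(10,)}::test_cub_prod",
    "FAILED tests/cupy_tests/math_tests/test_sumprod.py::TestCubReduction_param_8_{backend='block', order='C', shape=(10,)}::test_cub_sum",
]

if __name__ == "__main__":
    # Paste the result after `python -m pytest`.
    print(" ".join(failing_files(sample)))
```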

jaimergp commented 3 years ago

Yep, I am only running those. It still takes some time 😬

jakirkham commented 3 years ago

Would one need to install some of the optional dependencies for those tests to work?

leofang commented 3 years ago

No, none of these tests involve optional deps.

jaimergp commented 3 years ago

Some fail if scipy is not installed, though.

jaimergp commented 3 years ago

Results here. With a properly initialized VS compiler, most pass. However some nvrtc stuff with multi-GPU is failing. I only have one GPU so I guess that's expected?

$>  python -m pytest ^
        tests/cupy_tests/core_tests/test_cub_reduction.py ^
        tests/cupy_tests/core_tests/test_ndarray_reduction.py ^
        tests/cupy_tests/core_tests/test_raw.py ^
        tests/cupy_tests/core_tests/fusion_tests/test_example.py ^
        tests/cupy_tests/core_tests/fusion_tests/test_indexing.py ^
        tests/cupy_tests/core_tests/fusion_tests/test_misc.py ^
        tests/cupy_tests/core_tests/fusion_tests/test_reduction.py ^
        tests/cupy_tests/functional_tests/test_vectorize.py ^
        tests/cupy_tests/sorting_tests/test_search.py ^
        >  log.txt

log.txt

leofang commented 3 years ago

> Some fail if scipy is not installed, though.

You're right... I only had optional "CUDA" deps in mind 😅

> With a properly initialized VS compiler, most pass. However some nvrtc stuff with multi-GPU is failing. I only have one GPU so I guess that's expected?

Wow tons of thanks for helping us test it out @jaimergp!!! I don't remember if we've ever received full test reports from users! Those multi-GPU tests are likely not properly skipped. I'll fix them in upstream.
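For reference, a minimal sketch of what "properly skipped" could look like. This is not the actual upstream code: `gpu_count` and `requires_gpus` are hypothetical helpers written for illustration, and the only CuPy call assumed is `cupy.cuda.runtime.getDeviceCount()`.

```python
import unittest


def gpu_count():
    """Number of visible CUDA devices, or 0 if CuPy or a driver is unavailable."""
    try:
        import cupy
        return cupy.cuda.runtime.getDeviceCount()
    except Exception:
        return 0


def requires_gpus(n, available=None):
    """Skip the decorated test unless at least `n` GPUs are available."""
    available = gpu_count() if available is None else available
    return unittest.skipIf(available < n, f"needs {n} GPUs, found {available}")


class TestMultiGPU(unittest.TestCase):

    @requires_gpus(2)
    def test_allocate_on_second_device(self):
        import cupy
        # Arrays are allocated on the current device.
        with cupy.cuda.Device(1):
            x = cupy.arange(4)
        self.assertEqual(x.device.id, 1)
```

On a single-GPU machine the test is then reported as skipped rather than failed.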

leofang commented 3 years ago

> Those multi-GPU tests are likely not properly skipped. I'll fix them in upstream.

No, wait, I already sent a patch for it long ago: https://github.com/cupy/cupy/pull/3342, but @kmaehashi convinced me it's not needed 😅

jaimergp commented 3 years ago

No problems! Let me know if there's anything else I should try. I am logging off for the weekend though so it'll have to wait until next week :)

leofang commented 3 years ago

Thanks so much @jaimergp, I think we're almost done here. Have a nice weekend!

jakirkham commented 3 years ago

So what's the verdict here Leo? Should we go ahead with this or do we want to work out the remaining test issues?

leofang commented 3 years ago

I think @jaimergp and @kmaehashi have verified the needed tests (Windows & preloading optional deps), so feel free to take a final round of review and merge @jakirkham.

leofang commented 3 years ago

CI is green!

leofang commented 3 years ago

Thanks everyone for your help!