desihub / redrock

Redshift fitting for spectroperfectionism
BSD 3-Clause "New" or "Revised" License

redrock GPU test should check for GPU first #227

Closed sbailey closed 1 year ago

sbailey commented 1 year ago

Redrock tests should first check whether a GPU is available and skip any tests that require one. The test_rebin and test_utils tests are failing on cori for this reason. The tests appear to check whether cupy is installed, but they do not sufficiently test whether a GPU is actually available (that check is inside the class setUp, but it is perhaps not propagating to the cp_available global variable). If an entire test is to be skipped, it would be better to use @unittest.skipIf(...) decorators so that it is reported as purposefully skipped instead of passing.
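A minimal sketch of the suggested pattern, not the actual change made in redrock: the helper gpu_available, the flag GPU_AVAILABLE, and the test class below are hypothetical names used only for illustration. The check is done once at import time and asks the CUDA runtime for a device count, so a machine with cupy installed but no GPU reports the test as skipped rather than erroring.

```python
import unittest


def gpu_available():
    """Return True only if cupy imports AND at least one CUDA device is visible.

    Hypothetical helper for illustration; not part of redrock.
    """
    try:
        import cupy
        # Raises a CUDA runtime error when no device or driver is present
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # ImportError (cupy not installed) or a CUDA runtime error (no GPU)
        return False


# Evaluated once at import time so decorators can use it
GPU_AVAILABLE = gpu_available()


class TestGPUExample(unittest.TestCase):

    @unittest.skipIf(not GPU_AVAILABLE, "no CUDA-capable GPU available")
    def test_gpu_sum(self):
        # Only runs when a device was detected above
        import cupy as cp
        x = cp.asarray([1.0, 2.0, 3.0], dtype=cp.float64)
        self.assertEqual(float(x.sum()), 6.0)

    def test_cpu_sum(self):
        # CPU-only test; always runs
        self.assertEqual(sum([1.0, 2.0, 3.0]), 6.0)
```

With this layout, a CPU-only node reports test_gpu_sum as skipped with the given reason, while test_cpu_sum still runs.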

======================================================================
ERROR: test_gpu_trapzrebin_uneven (redrock.test.test_rebin.TestRebin)
test rebinning unevenly spaced x for GPU vs CPU
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/code/redrock/main/py/redrock/test/test_rebin.py", line 122, in test_gpu_trapzrebin_uneven
    g = rebin.trapz_rebin(x, y, edges=edges, use_gpu=True)
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/code/redrock/main/py/redrock/rebin.py", line 321, in trapz_rebin
    return _trapz_rebin_batch_gpu(x, y, edges=edges, myz=myz, result_shape=result_shape)
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/code/redrock/main/py/redrock/rebin.py", line 376, in _trapz_rebin_batch_gpu
    edges = cp.asarray(edges, dtype=cp.float64)
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/conda/lib/python3.9/site-packages/cupy/_creation/from_data.py", line 76, in asarray
    return _core.array(a, dtype, False, order)
  File "cupy/_core/core.pyx", line 2249, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2270, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2402, in cupy._core.core._array_default
  File "cupy/_core/core.pyx", line 171, in cupy._core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 698, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1375, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1395, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/device.pyx", line 48, in cupy.cuda.device.get_device_id
  File "cupy_backends/cuda/api/runtime.pyx", line 159, in cupy_backends.cuda.api.runtime.getDevice
  File "cupy_backends/cuda/api/runtime.pyx", line 132, in cupy_backends.cuda.api.runtime.check_status
cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected

======================================================================
ERROR: test_Lyman_transmission_GPU (redrock.test.test_utils.TestUtils)
Test the GPU version of Lyman transmission in batch versus
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/code/redrock/main/py/redrock/test/test_utils.py", line 62, in test_Lyman_transmission_GPU
    g = utils.transmission_Lyman(myz,x,use_gpu=True)
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/code/redrock/main/py/redrock/utils.py", line 264, in transmission_Lyman
    lObs = tile(lObs, (zObj.size, 1))
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/conda/lib/python3.9/site-packages/cupy/_manipulation/tiling.py", line 29, in tile
    c = cupy.array(A, copy=False, ndmin=d)
  File "/global/common/software/desi/cori/desiconda/20211217-2.0.0/conda/lib/python3.9/site-packages/cupy/_creation/from_data.py", line 46, in array
    return _core.array(obj, dtype, copy, order, subok, ndmin)
  File "cupy/_core/core.pyx", line 2249, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2270, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2402, in cupy._core.core._array_default
  File "cupy/_core/core.pyx", line 171, in cupy._core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 698, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1375, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1395, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/device.pyx", line 48, in cupy.cuda.device.get_device_id
  File "cupy_backends/cuda/api/runtime.pyx", line 159, in cupy_backends.cuda.api.runtime.getDevice
  File "cupy_backends/cuda/api/runtime.pyx", line 132, in cupy_backends.cuda.api.runtime.check_status
cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected

sbailey commented 1 year ago

Fixed by PR #228. Closing.