cholla-hydro / cholla

A GPU-based hydro code
https://github.com/cholla-hydro/cholla/wiki
MIT License

Consolidate GPU Error Checking Function #350

Closed bcaddy closed 7 months ago

bcaddy commented 8 months ago

GPU Error Checking

I replaced the 3 macros and 7 functions used for GPU error checking with a single overloaded function: one overload for CUDA/HIP checking and one for CUFFT/HIPFFT checking. The function can either wrap a CUDA call or be called with no arguments to check the latest error.
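For readers who want a picture of the pattern, here is a minimal sketch of an overloaded checker, not the code in this PR. Everything in it is hypothetical: `check_gpu`, `MockGpuError`, `MockFftResult`, `last_error` and `mock_malloc` are stand-ins so the example compiles without a GPU toolchain, and it throws an exception on failure only to keep the behavior easy to observe. A defaulted argument that peeks at the last recorded error is what lets the same function be called with no arguments.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for cudaError_t/hipError_t and cufftResult/hipfftResult.
enum class MockGpuError { Success = 0, InvalidValue = 1 };
enum class MockFftResult { Success = 0, AllocFailed = 2 };

// Stand-in for the runtime's "last error" state.
inline MockGpuError &last_error()
{
  static MockGpuError err = MockGpuError::Success;
  return err;
}

// Stand-in for peeking at the last error without clearing it.
inline MockGpuError mock_peek_last_error() { return last_error(); }

// Stand-in for an allocation call: fails on a zero-byte request
// and records the result as the last error.
inline MockGpuError mock_malloc(std::size_t bytes)
{
  last_error() = (bytes == 0) ? MockGpuError::InvalidValue : MockGpuError::Success;
  return last_error();
}

// Overload 1: runtime errors. The default argument is evaluated at each
// call, so check_gpu() with no arguments checks the latest recorded error.
inline void check_gpu(MockGpuError code = mock_peek_last_error())
{
  if (code != MockGpuError::Success) {
    throw std::runtime_error("GPU error code " + std::to_string(static_cast<int>(code)));
  }
}

// Overload 2: FFT library errors, selected by the argument type.
inline void check_gpu(MockFftResult code)
{
  if (code != MockFftResult::Success) {
    throw std::runtime_error("FFT error code " + std::to_string(static_cast<int>(code)));
  }
}
```

In this sketch, `check_gpu(mock_malloc(1024));` wraps a call directly, while a bare `check_gpu();` checks whatever error was recorded last; passing a `MockFftResult` picks the FFT overload automatically.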

I also added error checking to some cudaMalloc calls that either lacked it or used it in a non-standard way.

The other major change is the deprecation of the CUDA_ERROR_CHECK macro. Now error checking is on by default and can be disabled with the new DISABLE_GPU_ERROR_CHECKING macro.
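As a rough illustration of "on by default, opt out with a macro", here is one way such a switch could be wired up. This is a sketch under assumptions, not the implementation in this PR: only the `DISABLE_GPU_ERROR_CHECKING` macro name comes from the description above, while `checked_call`, `MockStatus` and `kGpuErrorCheckingEnabled` are hypothetical names, and turning the check into a no-op is just one plausible way to disable it.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a GPU status code; 0 means success.
using MockStatus = int;

// Checking is enabled unless the build defines DISABLE_GPU_ERROR_CHECKING,
// e.g. by passing -DDISABLE_GPU_ERROR_CHECKING to the compiler.
#ifdef DISABLE_GPU_ERROR_CHECKING
constexpr bool kGpuErrorCheckingEnabled = false;
#else
constexpr bool kGpuErrorCheckingEnabled = true;
#endif

// The wrapped call is still evaluated as the argument either way;
// only the inspection of its status code is skipped when disabled.
inline void checked_call(MockStatus code)
{
  if (kGpuErrorCheckingEnabled && code != 0) {
    throw std::runtime_error("GPU call failed");
  }
}
```

The design point is that call sites look identical in both configurations, so no per-call `#ifdef` is needed.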

This should resolve Issue #286 and possibly #296 as well, subject to discussion in that issue.

bcaddy commented 7 months ago

Yep, every place where the implicit sync seemed like it might be relevant was a function that already contains an implicit sync (like moving or allocating memory). Since we only use 1 GPU stream, there's an implicit sync between all kernels: kernels launched into the same stream run in issue order.