JuliaGPU / CUDA.jl

CUDA programming in Julia.
https://juliagpu.org/cuda/

Improve exception output #2342

Closed maleadt closed 5 months ago

maleadt commented 5 months ago

This PR expands the simple exception flag to an info struct that can contain much more information. I then use it to:

Fixes https://github.com/JuliaGPU/CUDA.jl/issues/1780, https://github.com/JuliaGPU/CUDA.jl/issues/2341, significantly improving the output.
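
To make the idea concrete, here is a minimal host-only sketch of what such an info struct might carry and how a richer message could be assembled from it. The names `ExceptionInfoSketch` and `format_exception`, and the choice of fields, are hypothetical; they are not CUDA.jl's actual definitions.

```julia
# Hypothetical, host-only model of an exception info record.
# Names and fields are illustrative, not CUDA.jl's actual layout.
struct ExceptionInfoSketch
    subtype::Union{Nothing,String}   # e.g. "BoundsError"; nothing if unknown
    reason::Union{Nothing,String}    # e.g. "Out-of-bounds array access"
    thread::NTuple{3,Int}            # thread index within the block
    block::NTuple{3,Int}             # block index within the grid
end

# Assemble a message from whatever information the record holds.
function format_exception(info::ExceptionInfoSketch)
    kind = info.subtype === nothing ? "exception" : info.subtype
    msg = "a $kind was thrown during kernel execution on thread " *
          "$(info.thread) in block $(info.block)."
    return info.reason === nothing ? msg : msg * "\n" * info.reason
end

info = ExceptionInfoSketch("BoundsError", "Out-of-bounds array access",
                           (2, 1, 1), (1, 1, 1))
println("ERROR: ", format_exception(info))
```

A single flag can only say that something went wrong; a record like this leaves room for the exception type, a reason string, and the location of the faulting thread.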

Before:

```julia
julia> using CUDA

julia> a = cu([1])
1-element CuArray{Int64, 1, CUDA.DeviceMemory}:
 1

julia> kernel(a) = (a[threadIdx().x]; nothing)
kernel (generic function with 1 method)

julia> @cuda threads=3 kernel(a)
CUDA.HostKernel for kernel(CuDeviceVector{Int64, 1})

julia> ERROR: Out-of-bounds array access.
ERROR: Out-of-bounds array access.
ERROR: a exception was thrown during kernel execution.
Stacktrace:
ERROR: a exception was thrown during kernel execution.
Stacktrace:
 [1] throw_boundserror at /home/tim/Julia/pkg/CUDA/src/device/quirks.jl:4
 [1] throw_boundserror at /home/tim/Julia/pkg/CUDA/src/device/quirks.jl:4
 [2] #throw_boundserror at /home/tim/Julia/pkg/CUDA/src/device/quirks.jl:42
 [2] #throw_boundserror at /home/tim/Julia/pkg/CUDA/src/device/quirks.jl:42
 [3] checkbounds at ./abstractarray.jl:702
 [3] checkbounds at ./abstractarray.jl:702
 [4] #arrayref at /home/tim/Julia/pkg/CUDA/src/device/array.jl:81
 [4] #arrayref at /home/tim/Julia/pkg/CUDA/src/device/array.jl:81
 [5] getindex at /home/tim/Julia/pkg/CUDA/src/device/array.jl:164
 [5] getindex at /home/tim/Julia/pkg/CUDA/src/device/array.jl:164
 [6] kernel at ./REPL[3]:1
 [6] kernel at ./REPL[3]:1
```

After:

```julia
julia> @cuda threads=3 kernel(a)
CUDA.HostKernel for kernel(CuDeviceVector{Int64, 1})

julia> ERROR: a BoundsError was thrown during kernel execution on thread (2, 1, 1) in block (1, 1, 1).
Out-of-bounds array access
Stacktrace:
 [1] throw_boundserror at /home/tim/Julia/pkg/CUDA/src/device/quirks.jl:15
 [2] #throw_boundserror at /home/tim/Julia/pkg/CUDA/src/device/quirks.jl:53
 [3] checkbounds at ./abstractarray.jl:702
 [4] #arrayref at /home/tim/Julia/pkg/CUDA/src/device/array.jl:81
 [5] getindex at /home/tim/Julia/pkg/CUDA/src/device/array.jl:164
 [6] kernel at ./REPL[4]:1
```
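
In the Before output every line appears twice, once for each of the two out-of-range threads, while the After output contains a single report. One way to get that behaviour is to let only the first faulting thread record its details. The sketch below models this on the CPU with an atomic compare-and-swap. Whether the PR does exactly this is an assumption, and `Report` and `claim!` are hypothetical names.

```julia
# CPU-only model: several "threads" fault, only the first one records details.
mutable struct Report
    claimed::Threads.Atomic{Int}   # 0 = free, 1 = taken
    thread::Int                    # faulting thread that won the claim
end
Report() = Report(Threads.Atomic{Int}(0), 0)

# Returns true for exactly one caller: the first to flip the flag from 0 to 1.
function claim!(r::Report, thread::Int)
    old = Threads.atomic_cas!(r.claimed, 0, 1)   # returns the previous value
    old == 0 || return false
    r.thread = thread
    return true
end

a = [1]                # one element, as in the example above
r = Report()
for t in 1:3           # sequential stand-in for `threads=3`
    if !checkbounds(Bool, a, t)
        claim!(r, t)   # indices 2 and 3 are out of bounds; only one claim succeeds
    end
end
println("reported by thread ", r.thread)
```

Because this loop runs sequentially, index 2 always wins the claim here. On a real device, which faulting thread gets there first is not deterministic.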

Note that this still doesn't cover all exception-generating sites. For example, if code calls `throw(BoundsError())` directly and we don't have a quirk that provides additional exception information, we still report a simple Exception. An IR-level transformation to recover that information would be great, but GPUCompiler currently doesn't have the tooling for it.
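
The fallback described here can be pictured as a lookup: throw sites that have a quirk carry a type name and a reason, and everything else degrades to a generic exception. The following host-only sketch uses a hypothetical `QUIRKS` table and `describe` function. It does not reflect how CUDA.jl or GPUCompiler implement quirks.

```julia
# Hypothetical table of throw sites that have a quirk attached.
const QUIRKS = Dict{Symbol,Tuple{String,String}}(
    :throw_boundserror => ("BoundsError", "Out-of-bounds array access"),
)

# Sites without a quirk fall back to a generic description.
function describe(site::Symbol)
    subtype, reason = get(QUIRKS, site, ("Exception", ""))
    return isempty(reason) ? subtype : "$subtype: $reason"
end

println(describe(:throw_boundserror))  # site covered by a quirk
println(describe(:throw))              # direct throw(BoundsError()), no quirk
```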

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 71.85%. Comparing base (eb45b2c) to head (b658315). Report is 3 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #2342      +/-   ##
==========================================
- Coverage   71.86%   71.85%   -0.02%
==========================================
  Files         155      155
  Lines       15074    15072       -2
==========================================
- Hits        10833    10830       -3
- Misses       4241     4242       +1
```
