Closed: aryan-programmer closed this issue 11 months ago
Please verify that this works for you now on the development branch.
It catches the error in the release build correctly.
However, in the debug build, the launch parameter validation catches the error properly but also raises false alarms, as follows:
Specifying block_dimensions to be 32 results in the following error:
Executing the kernel:
specified block X-axis dimension 32 exceeds the maximum supported X dimension of 1024 for device 0
Specifying block_dimensions to be 1024 results in the following error:
Executing the kernel:
specified block Y-axis dimension 1 exceeds the maximum supported Y dimension of 1024 for device 0
Both of these should have worked and printed "Hello CUDA". Indeed, in the release build (with block_dimensions being 32 or 1024) it outputs:
Executing the kernel:
Hello CUDA
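Both messages above report a value that does not exceed the stated maximum (32 and 1 against 1024), so the check appears to fire on valid input. As a rough illustration of the intended behavior, here is a minimal host-only sketch that rejects a block shape only when an axis strictly exceeds the device limit. The names `dims3` and `validate_block_dimensions` are hypothetical and not taken from the library, and the limits are passed in by the caller rather than queried from a device.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical 3-D extent type, used here only for illustration.
struct dims3 {
    unsigned x, y, z;
};

// Throws only when an axis strictly exceeds the device maximum for that axis;
// a value equal to the maximum is accepted.
inline void validate_block_dimensions(dims3 block, dims3 max_block, int device_id)
{
    auto check_axis = [device_id](const char* axis, unsigned specified, unsigned maximum) {
        if (specified > maximum) {
            throw std::invalid_argument(
                std::string("specified block ") + axis + "-axis dimension " +
                std::to_string(specified) + " exceeds the maximum supported " + axis +
                " dimension of " + std::to_string(maximum) +
                " for device " + std::to_string(device_id));
        }
    };
    check_axis("X", block.x, max_block.x);
    check_axis("Y", block.y, max_block.y);
    check_axis("Z", block.z, max_block.z);
}
```

With assumed limits of 1024 x 1024 x 64, block shapes of (32, 1, 1) and (1024, 1, 1) pass, while (1500, 1, 1) throws with an X-axis message. A complete check would also compare the total thread count against the per-block limit, which this sketch leaves out.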
Can you try again?
It works correctly now.
The enqueue_raw_kernel_launch_in_current_context function does not check for an error with cudaGetLastError, thus kernel launch errors like cudaErrorInvalidConfiguration go uncaught, and the kernel launch silently fails.
Minimal example:
The following example demonstrates a kernel whose launch should fail, since the maximum number of threads per block is 1024 and we are trying to launch it with 1500 threads.
Current output:
The kernel launch fails silently.
Expected output:
The kernel launch error from cudaGetLastError is handled and converted into an exception that is caught here.
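The expected behavior can be sketched without a CUDA toolchain as a "launch, then query the last error, then throw" pattern. This is only an illustration of the control flow, not the library's implementation: `status_t`, `launch_error` and `checked_launch` are hypothetical names, and the two callables stand in for the real launch and for a call to cudaGetLastError made right after it (where cudaSuccess means no error).

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a runtime status code; zero means success,
// mirroring how cudaSuccess is the zero value of cudaError_t.
using status_t = int;
constexpr status_t status_success = 0;

// Hypothetical exception type that carries the failing status code.
class launch_error : public std::runtime_error {
public:
    launch_error(status_t code, const std::string& message)
        : std::runtime_error(message + " (status " + std::to_string(code) + ")"),
          code_(code) {}
    status_t code() const noexcept { return code_; }
private:
    status_t code_;
};

// Enqueues the launch, then immediately queries the last error and throws
// if it is not success, so a rejected launch cannot fail silently.
inline void checked_launch(const std::function<void()>& enqueue,
                           const std::function<status_t()>& get_last_error)
{
    enqueue();
    const status_t status = get_last_error();
    if (status != status_success) {
        throw launch_error(status, "kernel launch failed");
    }
}
```

A caller would wrap `checked_launch` in a try/catch block for `launch_error`. With a status query that reports a non-zero code, as an invalid launch configuration would, the exception is thrown instead of the failure passing unnoticed.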