cloudcores / CuAssembler

An unofficial cuda assembler, for all generations of SASS, hopefully :)
MIT License

An error occurred when analyzing kernel function calls transitioning from cuasm to cubin using Nsight Compute #23

Open Lukinon opened 5 months ago

Lukinon commented 5 months ago

After converting my kernel to SASS assembly, I use this tool to assemble the cuasm into a cubin, then load that cubin from a C++ program. Running the program directly gives the correct result, but profiling it with Nsight Compute fails:

    ==Error==LaunchFailed
    ==Error==LaunchFailed
    ==PROF==Trying to shutdown target application
    ==Error==The application returned an error code (9)
    ==Error==An error occurred while trying to profile

I get the same error with both the 2022 and 2024 versions of Nsight Compute. However, a cubin generated directly by NVCC can be profiled normally with Nsight Compute.

cloudcores commented 5 months ago

You mean the program returns error code 9?

cudaErrorInvalidConfiguration = 9
    This indicates that a kernel launch is requesting resources that can never be satisfied by the current device. Requesting more shared memory per block than the device supports will trigger this error, as will requesting too many threads or blocks. See [cudaDeviceProp](https://docs.nvidia.com/cuda/cuda-runtime-api/structcudaDeviceProp.html#structcudaDeviceProp) for more device limitations.

Did you check correctness when running it directly? A failed kernel launch does not report an error unless the program explicitly checks for it, but nsys/ncu will catch it in profiling mode.
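
For reference, here is a minimal sketch of the explicit check, assuming the runtime API and a stand-in kernel. `dummyKernel` and `launchAndCheck` are hypothetical names for illustration, not part of CuAssembler or of the original program. A cubin loaded through the driver API would instead check the `CUresult` returned by `cuLaunchKernel`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; the real one would be loaded from the cubin.
__global__ void dummyKernel(int *out) {
    if (out != nullptr && blockIdx.x == 0 && threadIdx.x == 0) {
        *out = 1;
    }
}

// Hypothetical helper: launches one block of `threadsPerBlock` threads and
// returns the first error seen, either at launch time or during execution.
cudaError_t launchAndCheck(unsigned int threadsPerBlock) {
    cudaGetLastError();  // clear any earlier non-sticky error

    int *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, sizeof(int));
    if (err != cudaSuccess) {
        return err;
    }

    dummyKernel<<<1, threadsPerBlock>>>(d_out);

    // Launch-time errors (such as an invalid configuration) are only
    // visible if the program asks for them.
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        // Errors raised while the kernel runs surface at the next sync.
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s (%d)\n",
                cudaGetErrorString(err), static_cast<int>(err));
    }

    cudaFree(d_out);
    return err;
}
```

On a machine with a working GPU, `launchAndCheck(32)` should return `cudaSuccess`. A call such as `launchAndCheck(2048)` exceeds the 1024 threads-per-block limit of current devices and should return `cudaErrorInvalidConfiguration`, the same code 9 that Nsight Compute reports. Without a check like this, the program would simply continue after the failed launch.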