markdewing closed this issue 1 year ago
Using 'nsys' from cuda 11.8 gives a similar error. Using 'nsys' from cuda 12.0 works.
@Kerilk is the expert on the generated tracepoints. Maybe we need to update the header that we use. But it's weird that 12.0 works but not 11.8. Isn't CUDA backward compatible?
It means I need to upgrade iprof to support the most recent CUDA APIs; I have only implemented up to 11.something. I'll look into it. There is a mismatch between the CUDA runtime version and the driver version in this case.
I updated the APIs to 12.1; we'll see if it yields better results.
When using iprof to trace a CUDA application, it stops with the error:
The application runs fine w/o iprof.
System is running Ubuntu 22.04.02. Default cuda is 11.8. Driver version is 525.85.12 (Cuda 12.0). I recompiled the app with cuda 12.0, but it still fails.
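For anyone wanting to confirm the runtime/driver combination on their own machine, here is a minimal sketch. It relies on the CUDA convention that version numbers are encoded as `1000 * major + 10 * minor` (so 11.8 is `11080` and 12.0 is `12000`). The helper names and the library name `libcudart.so` are assumptions for illustration; the installed soname may be versioned or located elsewhere. It only reports the versions and does not diagnose iprof itself.

```python
import ctypes


def decode_cuda_version(value):
    """Split CUDA's integer encoding (1000*major + 10*minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def versions_differ(runtime, driver):
    """True when the runtime and driver report different major.minor versions."""
    return decode_cuda_version(runtime) != decode_cuda_version(driver)


def query_versions(cudart_name="libcudart.so"):
    """Return raw (runtime, driver) version integers, or None if unavailable.

    The library name is an assumption; adjust it to match the local install.
    """
    try:
        cudart = ctypes.CDLL(cudart_name)
    except OSError:
        return None
    runtime = ctypes.c_int(0)
    driver = ctypes.c_int(0)
    # Both calls return 0 (cudaSuccess) when they succeed.
    if cudart.cudaRuntimeGetVersion(ctypes.byref(runtime)) != 0:
        return None
    if cudart.cudaDriverGetVersion(ctypes.byref(driver)) != 0:
        return None
    return runtime.value, driver.value


if __name__ == "__main__":
    found = query_versions()
    if found is None:
        print("Could not load the CUDA runtime library")
    else:
        runtime, driver = found
        print("runtime:", decode_cuda_version(runtime))
        print("driver: ", decode_cuda_version(driver))
        print("differ: ", versions_differ(runtime, driver))
```

With the setup described here (CUDA 11.8 toolkit, driver supporting CUDA 12.0), `versions_differ(11080, 12000)` evaluates to `True`.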
A version of this code built using OpenMP offload runs just fine through iprof (at least one CUDA application works with iprof on this machine).
Two ideas that I haven't looked at: