argonne-lcf / THAPI

A tracing infrastructure for heterogeneous computing applications.

Unsupported API error tracing CUDA application #71

Closed markdewing closed 1 year ago

markdewing commented 1 year ago

When using iprof to trace a CUDA application, it stops with the error:

/mnt/prev/home/mdewing/physics/hep/patatrack/pixeltrack-standalone/src/cuda/CUDACore/host_noncached_unique_ptr.h, line 52:
cudaCheck(cudaHostAlloc(&mem, sizeof(T), flags));
cudaErrorCallRequiresNewerDriver: API call is not supported in the installed CUDA driver

The application runs fine w/o iprof.

System is running Ubuntu 22.04.02. The default CUDA is 11.8. The driver version is 525.85.12 (CUDA 12.0). I recompiled the app with CUDA 12.0, but it still fails.

A version of this code built using OpenMP offload runs just fine through iprof (at least one CUDA application works with iprof on this machine).

Two ideas that I haven't looked at:

  1. Does iprof need to be compiled with cuda 12.0? (There doesn't seem to be any way to specify the CUDA location in the configuration)
  2. iprof was built using babeltrace 2.0.4 (unpatched)

markdewing commented 1 year ago

Using 'nsys' from cuda 11.8 gives a similar error. Using 'nsys' from cuda 12.0 works.

TApplencourt commented 1 year ago

@Kerilk is the expert on the generated tracepoints. Maybe we need to update the headers that we use. But it's weird that 12.0 works and 11.8 doesn't. Is CUDA not backward compatible?

Kerilk commented 1 year ago

It means I need to upgrade iprof to support the most recent CUDA APIs; I only implemented them up to 11.something. I'll look into it. There is a mismatch between the CUDA runtime version and the driver version in this case.
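
(Editor's sketch, not part of the original thread.) To make the runtime/driver version comparison concrete, here is a minimal C++ sketch of how the packed version integers can be decoded and compared. It assumes the values come from `cudaRuntimeGetVersion` and `cudaDriverGetVersion`, which report versions as `1000 * major + 10 * minor`. The helper names below are hypothetical and are not part of CUDA or THAPI; the sketch takes plain integers so it can be compiled without the CUDA toolkit.

```cpp
#include <cassert>
#include <string>

// CUDA packs versions as 1000 * major + 10 * minor (e.g. 11080 for 11.8).
// All names in this block are hypothetical helpers for illustration.
struct CudaVersion {
    int major_ver;
    int minor_ver;
};

// Decode a packed version integer into its major and minor parts.
inline CudaVersion decode_cuda_version(int packed) {
    assert(packed >= 0 && "packed version must be non-negative");
    return CudaVersion{packed / 1000, (packed % 1000) / 10};
}

// Format a decoded version as "major.minor".
inline std::string to_string(const CudaVersion& v) {
    return std::to_string(v.major_ver) + "." + std::to_string(v.minor_ver);
}

// True when the runtime version is newer than the version the driver reports.
// In a real program the two arguments would be filled in by
// cudaRuntimeGetVersion(&runtime) and cudaDriverGetVersion(&driver).
inline bool runtime_newer_than_driver(int runtime_packed, int driver_packed) {
    return runtime_packed > driver_packed;
}
```

With the versions reported in this thread (runtime 11.8, i.e. `11080`, and driver 12.0, i.e. `12000`), `runtime_newer_than_driver` returns false, which fits the application running fine without iprof. The failure under iprof appears to depend on which API versions the tracer itself supports, which a check like this cannot detect.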

Kerilk commented 1 year ago

I updated the APIs to 12.1; we'll see if it yields better results.