bsc-performance-tools / extrae

Instrumentation framework to generate execution traces of the most used parallel runtimes.
https://tools.bsc.es/extrae
GNU Lesser General Public License v2.1

CUDA builds crash since v4.0.3 #79

Closed: mofeing closed this issue 1 year ago

mofeing commented 1 year ago

I'm trying to build Extrae v4.0.3 with CUDA tracing through the CUPTI interface but I get the following error:

/../../src/common -I/workspace/destdir/include -I/workspace/destdir/cuda/include -I/workspace/destdir/cuda/extras/CUPTI/include  -g -O2 -fno-optimize-sibling-calls -Wall -W -c -o libwrap_cuda_la-cuda_wrapper_cupti.lo `test -f 'cuda_wrapper_cupti.c' || echo './'`cuda_wrapper_cupti.c
libtool: compile:  cc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../src/common/MPI -I../../../../src/tracer -I../../../../src/tracer/hwc -I../../../../src/tracer/clocks -I../../../../src/tracer/interfaces/API -I../../../../src/tracer/wrappers/API -I../../../.. -I../../../../include -I../../../../src/common -I/workspace/destdir/include -I/workspace/destdir/cuda/include -I/workspace/destdir/cuda/extras/CUPTI/include -g -O2 -fno-optimize-sibling-calls -Wall -W -c cuda_wrapper_cupti.c  -fPIC -DPIC -o .libs/libwrap_cuda_la-cuda_wrapper_cupti.o
cuda_wrapper_cupti.c: In function ‘Extrae_DriverAPI_callback’:
cuda_wrapper_cupti.c:146:29: warning: passing argument 1 of ‘Extrae_cudaLaunch_Enter’ from incompatible pointer type [-Wincompatible-pointer-types]
     Extrae_cudaLaunch_Enter(p->f, p->hStream);
                             ^
In file included from cuda_wrapper_cupti.c:44:0:
cuda_common.h:179:6: note: expected ‘const char *’ but argument is of type ‘CUfunction {aka struct CUfunc_st *}’
 void Extrae_cudaLaunch_Enter (const char*, cudaStream_t);
      ^
cuda_wrapper_cupti.c: In function ‘Extrae_RuntimeAPI_callback’:
cuda_wrapper_cupti.c:167:4: error: unknown type name ‘cudaConfigureCall_v3020_params’
    cudaConfigureCall_v3020_params *p =
    ^
cuda_wrapper_cupti.c:168:7: error: ‘cudaConfigureCall_v3020_params’ undeclared (first use in this function)
      (cudaConfigureCall_v3020_params*)cbinfo->functionParams;
       ^
cuda_wrapper_cupti.c:168:7: note: each undeclared identifier is reported only once for each function it appears in
cuda_wrapper_cupti.c:168:38: error: expected expression before ‘)’ token
      (cudaConfigureCall_v3020_params*)cbinfo->functionParams;
                                      ^
cuda_wrapper_cupti.c:172:8: error: request for member ‘gridDim’ in something not a structure or union
       p->gridDim, p->blockDim, p->sharedMem, p->stream
        ^
cuda_wrapper_cupti.c:172:20: error: request for member ‘blockDim’ in something not a structure or union
       p->gridDim, p->blockDim, p->sharedMem, p->stream
                    ^
cuda_wrapper_cupti.c:172:33: error: request for member ‘sharedMem’ in something not a structure or union
       p->gridDim, p->blockDim, p->sharedMem, p->stream
                                 ^
cuda_wrapper_cupti.c:172:47: error: request for member ‘stream’ in something not a structure or union
       p->gridDim, p->blockDim, p->sharedMem, p->stream
                                               ^
cuda_wrapper_cupti.c:184:4: error: unknown type name ‘cudaLaunch_v3020_params’
    cudaLaunch_v3020_params *p =
    ^
cuda_wrapper_cupti.c:185:7: error: ‘cudaLaunch_v3020_params’ undeclared (first use in this function)
      (cudaLaunch_v3020_params*)cbinfo->functionParams;
       ^
cuda_wrapper_cupti.c:185:31: error: expected expression before ‘)’ token
      (cudaLaunch_v3020_params*)cbinfo->functionParams;
                               ^
cuda_wrapper_cupti.c:189:30: error: request for member ‘func’ in something not a structure or union
     Extrae_cudaLaunch_Enter(p->func, NULL);
                              ^
cuda_wrapper_cupti.c:298:6: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
      (cudaMallocArray_v3020_params *)cbinfo->functionParams;
      ^
cuda_wrapper_cupti.c:439:4: error: unknown type name ‘cudaStreamDestroy_v3020_params’
    cudaStreamDestroy_v3020_params *p =
    ^
cuda_wrapper_cupti.c:440:7: error: ‘cudaStreamDestroy_v3020_params’ undeclared (first use in this function)
      (cudaStreamDestroy_v3020_params*)cbinfo->functionParams;
       ^
cuda_wrapper_cupti.c:440:38: error: expected expression before ‘)’ token
      (cudaStreamDestroy_v3020_params*)cbinfo->functionParams;
                                      ^
cuda_wrapper_cupti.c:443:38: error: request for member ‘stream’ in something not a structure or union
     Extrae_cudaStreamDestroy_Enter (p->stream);
                                      ^
make[5]: *** [Makefile:851: libwrap_cuda_la-cuda_wrapper_cupti.lo] Error 1
make[5]: Leaving directory '/workspace/srcdir/extrae-4.0.3/src/tracer/wrappers/CUDA'
make[4]: *** [Makefile:803: all-recursive] Error 1
make[4]: Leaving directory '/workspace/srcdir/extrae-4.0.3/src/tracer/wrappers'
make[3]: *** [Makefile:7073: all-recursive] Error 1
make[3]: Leaving directory '/workspace/srcdir/extrae-4.0.3/src/tracer'
make[2]: *** [Makefile:795: all-recursive] Error 1
make[2]: Leaving directory '/workspace/srcdir/extrae-4.0.3/src'
make[1]: *** [Makefile:1275: all-recursive] Error 1
make[1]: Leaving directory '/workspace/srcdir/extrae-4.0.3'
make: *** [Makefile:1207: all] Error 2

It looks like cudaConfigureCall_v3020_params, cudaLaunch_v3020_params and cudaStreamDestroy_v3020_params are not defined anywhere. I think the CUPTI headers have not provided these definitions since CUPTI v12 (CUDA 10?), and commit 388dd3b removed the hardcoded definitions that you had.

emercadal commented 1 year ago

Hi, thanks for reporting.

It's a known issue. As you say, these structures are deprecated and were removed in CUDA 10.2, but they are still available in CUDA 10 and CUDA 10.1. All of those releases use the same CUPTI version (12). Extrae already checks the CUPTI version to decide whether to define these structures itself, but because CUDA uses the same version number for different implementations, the check fails.

We are looking for a more robust solution to this problem. In the meantime, you could try modifying src/tracer/wrappers/CUDA/cuda_wrapper_cupti.h and changing the CUPTI_API_VERSION check to 11.
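For readers following along, here is a minimal sketch of what a version guard of this kind could look like. It is not copied from cuda_wrapper_cupti.h: the threshold macro LOCAL_PARAMS_MIN_CUPTI, the helper needs_local_params and the flag EXTRAE_DEFINE_V3020_PARAMS are hypothetical names, and the direction of the comparison is an assumption. In a real build CUPTI_API_VERSION comes from CUPTI's cupti_version.h; the fallback value is only there so the sketch compiles without the toolkit.

```c
#include <assert.h>

/* Assumption: a real build gets CUPTI_API_VERSION from <cupti_version.h>.
 * The fallback only lets this sketch compile without the CUDA toolkit. */
#ifndef CUPTI_API_VERSION
#define CUPTI_API_VERSION 12
#endif

/* Hypothetical threshold, matching the "change the check to 11" suggestion. */
#define LOCAL_PARAMS_MIN_CUPTI 11

static_assert(LOCAL_PARAMS_MIN_CUPTI > 0, "threshold must be positive");

/* Returns 1 when the wrapper should declare the *_v3020_params structs
 * itself, 0 when it should rely on the declarations in the CUPTI headers. */
static int needs_local_params(int cupti_api_version)
{
    return cupti_api_version >= LOCAL_PARAMS_MIN_CUPTI;
}

/* Compile-time form of the same decision, usable around the typedefs. */
#if CUPTI_API_VERSION >= LOCAL_PARAMS_MIN_CUPTI
#define EXTRAE_DEFINE_V3020_PARAMS 1
#else
#define EXTRAE_DEFINE_V3020_PARAMS 0
#endif
```

Note that a guard keyed only on CUPTI_API_VERSION cannot tell apart toolkits that report the same CUPTI version, which is the weakness described above.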

mofeing commented 1 year ago

Changing the check (or even removing the check) still gives the same error.

As far as I can see, no file includes cuda_wrapper_cupti.h, so the types are never defined. The file is listed in the resulting Makefile, but I think it is not automatically included.

sandbox:${WORKSPACE}/srcdir/extrae-4.0.3 # grep -R cuda_wrapper_cupti.h src/tracer/wrappers/CUDA/
src/tracer/wrappers/CUDA/Makefile:am__append_1 = cuda_wrapper_cupti.c cuda_wrapper_cupti.h
src/tracer/wrappers/CUDA/Makefile:      cuda_wrapper_cupti.h cuda_wrapper.c cuda_wrapper.h
src/tracer/wrappers/CUDA/Makefile.am:WRAPPERS_CUDA += cuda_wrapper_cupti.c cuda_wrapper_cupti.h
src/tracer/wrappers/CUDA/Makefile.in:@HAVE_CUDA_TRUE@@HAVE_CUPTI_TRUE@am__append_1 = cuda_wrapper_cupti.c cuda_wrapper_cupti.h
src/tracer/wrappers/CUDA/Makefile.in:   cuda_wrapper_cupti.h cuda_wrapper.c cuda_wrapper.h

mofeing commented 1 year ago

Update: If I add...

#include "cuda_wrapper_cupti.h"

...to the src/tracer/wrappers/CUDA/cuda_wrapper_cupti.c file, it almost compiles, even with the CUPTI API check unchanged. Now the error is:

/bin/sh ../../../../libtool  --tag=CC   --mode=compile cc -std=gnu11 -DHAVE_CONFIG_H -I. -I../../../..    -I../../../../src/common/MPI -I../../../../src/tracer -I../../../../src/tracer/hwc -I../../../../src/tracer/clocks -I../../../../src/tracer/interfaces/API -I../../../../src/tracer/wrappers/API -I../../../.. -I../../../../include -I../../../../src/common -I/workspace/destdir/include -I/workspace/destdir/cuda/include -I/workspace/destdir/cuda/extras/CUPTI/include  -g -O2 -fno-optimize-sibling-calls -Wall -W -c -o libwrap_cuda_la-cuda_wrapper_cupti.lo `test -f 'cuda_wrapper_cupti.c' || echo './'`cuda_wrapper_cupti.c
libtool: compile:  cc -std=gnu11 -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../src/common/MPI -I../../../../src/tracer -I../../../../src/tracer/hwc -I../../../../src/tracer/clocks -I../../../../src/tracer/interfaces/API -I../../../../src/tracer/wrappers/API -I../../../.. -I../../../../include -I../../../../src/common -I/workspace/destdir/include -I/workspace/destdir/cuda/include -I/workspace/destdir/cuda/extras/CUPTI/include -g -O2 -fno-optimize-sibling-calls -Wall -W -c cuda_wrapper_cupti.c  -fPIC -DPIC -o .libs/libwrap_cuda_la-cuda_wrapper_cupti.o
cuda_wrapper_cupti.c: In function ‘Extrae_DriverAPI_callback’:
cuda_wrapper_cupti.c:147:5: warning: passing argument 1 of ‘Extrae_cudaLaunch_Enter’ from incompatible pointer type [enabled by default]
     Extrae_cudaLaunch_Enter(p->f, p->hStream);
     ^
In file included from cuda_wrapper_cupti.c:44:0:
cuda_common.h:179:6: note: expected ‘const char *’ but argument is of type ‘CUfunction’
 void Extrae_cudaLaunch_Enter (const char*, cudaStream_t);
      ^
cuda_wrapper_cupti.c: In function ‘Extrae_RuntimeAPI_callback’:
cuda_wrapper_cupti.c:185:4: error: unknown type name ‘cudaLaunch_v3020_params’
    cudaLaunch_v3020_params *p =
    ^
cuda_wrapper_cupti.c:186:7: error: ‘cudaLaunch_v3020_params’ undeclared (first use in this function)
      (cudaLaunch_v3020_params*)cbinfo->functionParams;
       ^
cuda_wrapper_cupti.c:186:7: note: each undeclared identifier is reported only once for each function it appears in
cuda_wrapper_cupti.c:186:31: error: expected expression before ‘)’ token
      (cudaLaunch_v3020_params*)cbinfo->functionParams;
                               ^
cuda_wrapper_cupti.c:190:30: error: request for member ‘func’ in something not a structure or union
     Extrae_cudaLaunch_Enter(p->func, NULL);
                              ^
cuda_wrapper_cupti.c:299:6: warning: initialization from incompatible pointer type [enabled by default]
      (cudaMallocArray_v3020_params *)cbinfo->functionParams;
      ^
make[5]: *** [Makefile:851: libwrap_cuda_la-cuda_wrapper_cupti.lo] Error 1

I solved this by adding the following code to cuda_wrapper_cupti.h:

typedef struct cudaLaunch_v3020_params_st {
    const char *func;
} cudaLaunch_v3020_params;

And now it compiles.
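To illustrate why the callback needs this type, here is a self-contained sketch of the cast that cuda_wrapper_cupti.c performs on cbinfo->functionParams. The struct layout is the guessed one from the snippet above, not a verified CUPTI definition. mock_callback_data stands in for the real CUpti_CallbackData from cupti.h, and launch_func_from_callback is a hypothetical helper used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Guessed layout, taken from the workaround above; the real CUPTI
 * definition may differ. */
typedef struct cudaLaunch_v3020_params_st {
    const char *func;
} cudaLaunch_v3020_params;

/* Mock of the single callback field used here. The real type is
 * CUpti_CallbackData, declared in <cupti.h>. */
typedef struct mock_callback_data_st {
    const void *functionParams;
} mock_callback_data;

/* Hypothetical helper: recovers the kernel entry pointer that the
 * wrapper passes on to Extrae_cudaLaunch_Enter. Without a declaration
 * of cudaLaunch_v3020_params the cast below cannot compile. */
static const char *launch_func_from_callback(const mock_callback_data *cbinfo)
{
    assert(cbinfo != NULL && cbinfo->functionParams != NULL);
    const cudaLaunch_v3020_params *p =
        (const cudaLaunch_v3020_params *)cbinfo->functionParams;
    return p->func;
}
```

Because the parameters arrive as an untyped pointer, the compiler cannot catch a layout mismatch, so a guessed definition should be checked against the headers of a CUDA release that still ships it.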

0004-cuda-cupti-undefined-structs-since-v12.patch

emercadal commented 1 year ago

You are right, the problem was not only the version change: we were also missing the declaration for cudaLaunch_v3020_params_st. We'll add your patch to the commit with the new solution. Thanks!

mofeing commented 1 year ago

You're welcome! Just pointing out a couple of things:

  1. I inferred the definition of cudaLaunch_v3020_params_st from your old commented-out definition and the error messages. I don't know the real definition of cudaLaunch_v3020_params_st.
  2. CUDA v10.2 still fails, I guess because of the CUPTI_API_VERSION check.
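One possible direction, sketched here only as an assumption and not as the fix Extrae adopted, is to key the decision on the toolkit version instead of the CUPTI version. CUDART_VERSION is defined by cuda_runtime_api.h and encodes CUDA 10.2 as 10020. The names CUDA_10_2 and structs_removed_from_headers are hypothetical, and the 10.2 cutoff is taken from the maintainer's comment earlier in this thread.

```c
#include <assert.h>

/* Assumption: a real build gets CUDART_VERSION from <cuda_runtime_api.h>
 * (major * 1000 + minor * 10). The fallback only lets this sketch
 * compile without the CUDA toolkit. */
#ifndef CUDART_VERSION
#define CUDART_VERSION 10020
#endif

/* Hypothetical name for the first release said to drop the structs. */
#define CUDA_10_2 10020

static_assert(CUDA_10_2 == 10 * 1000 + 2 * 10, "unexpected version encoding");

/* Returns 1 when, per this thread, the toolkit headers no longer declare
 * the *_v3020_params structs and the wrapper must supply its own. */
static int structs_removed_from_headers(int cudart_version)
{
    return cudart_version >= CUDA_10_2;
}
```

Unlike the CUPTI version, this value differs between CUDA 10.0, 10.1 and 10.2, so the three releases can be told apart.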