cwpearson / tempi

Topology Experiments for MPI

Darshan gets MPI_Init before libtempi #1

Open cwpearson opened 4 years ago

cwpearson commented 4 years ago

On Summit, the dynamic linker finds MPI_Init in libdarshan.so before it reaches libtempi (observed with jsrun -E LD_DEBUG=symbols):

symbol=MPI_Init;  lookup in file=bin/bench-mpi-pack [0]
     68381:     symbol=MPI_Init;  lookup in file=/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/darshan-runtime-3.1.7-cnvxicgf5j4ap64qi6v5gxp67hmrjz43/lib/libdarshan.so [0]

Darshan is not explicitly included in the link step when building, so it must be injected at runtime somehow. In any case, we can work around this with module unload darshan; our MPI_Init is then found right after libpami_cudahook.so. Later, the lazy lookup resolves it in libmpiprofilesupport.so.3 and then libmpi_ibm.so.3.
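To compare the search order with and without the darshan module loaded, a small filter over the LD_DEBUG=symbols output can help. This is a minimal sketch, not part of tempi: the helper name lookup_order is hypothetical, and it assumes every relevant line has the "symbol=NAME;  lookup in file=PATH [0]" shape shown in the log above.

```shell
#!/bin/sh
# lookup_order (hypothetical helper): read LD_DEBUG=symbols output on stdin
# and print, in order, each file the dynamic linker searched for one symbol.
lookup_order() {
    sym="$1"
    # Assumed line shape, taken from the log above:
    #   "<pid>:  symbol=NAME;  lookup in file=PATH [0]"
    # The ';' after the name keeps MPI_Init from matching MPI_Initialized.
    sed -n "s/.*symbol=${sym};[[:space:]]*lookup in file=\([^[:space:]]*\).*/\1/p"
}

# Possible usage (ld.so writes LD_DEBUG output to stderr by default):
#   jsrun -E LD_DEBUG=symbols bin/bench-mpi-pack 2>&1 | lookup_order MPI_Init
```

The last file printed for a given lookup is where the search stopped, so it should be libdarshan.so with the module loaded and libtempi.so after module unload darshan.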

How can we prevent libdarshan from taking over MPI_Init?

cwpearson commented 3 years ago

Alternatively, it may be possible to prefer loading MPI_Init (and any other functions?) from libdarshan.so before libmpi.so.

cwpearson commented 2 years ago

jsrun sometimes uses OMPI_LD_PRELOAD_PREPEND to add libraries at runtime. We can probably do this for libtempi.so as well.
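A rough, untested sketch of what that could look like follows. The helper name prepend_preload, the TEMPI_LIB variable, and the /path/to/libtempi.so location are all hypothetical. It also assumes OMPI_LD_PRELOAD_PREPEND takes a colon-separated list like LD_PRELOAD does, and that jsrun honors a value set by the user. Whether this actually places libtempi.so ahead of libdarshan.so would still need checking with LD_DEBUG=symbols.

```shell
#!/bin/sh
# prepend_preload (hypothetical helper): put a library at the front of a
# colon-separated preload list, leaving the list unchanged if the library
# is already in it.
prepend_preload() {
    lib="$1"
    current="$2"
    case ":${current}:" in
        *":${lib}:"*) printf '%s\n' "$current" ;;
        *) printf '%s\n' "${lib}${current:+:${current}}" ;;
    esac
}

# Hypothetical install location; adjust to wherever libtempi.so was built.
TEMPI_LIB="${TEMPI_LIB:-/path/to/libtempi.so}"

# Assumption: the variable is colon-separated, like LD_PRELOAD.
OMPI_LD_PRELOAD_PREPEND="$(prepend_preload "$TEMPI_LIB" "${OMPI_LD_PRELOAD_PREPEND:-}")"
export OMPI_LD_PRELOAD_PREPEND

# Possible launch line (untested):
#   jsrun -E OMPI_LD_PRELOAD_PREPEND="$OMPI_LD_PRELOAD_PREPEND" bin/bench-mpi-pack
```

Keeping any existing entries in the list rather than overwriting the variable avoids dropping libraries that jsrun or the site configuration already preload.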