flux-framework / flux-core-v0.11

flux-core v0.11 stable branch

[spectrum mpi] undefined symbol: PAMI_CUDA_RegisterPAMIContexts #11

Open garlick opened 5 years ago

garlick commented 5 years ago

When running an MPI hello world program under Flux on lassen, I get the following FATAL ERROR (the horror!), but my program still runs just fine. Also note the Lua complaint. (Line breaks added for readability.)

$ flux wreckrun -ompi=spectrum  -n2 ./hello
2019-05-14T23:45:52.431872Z job.err[0]: job22: wrexecd says: spectrum.lua: rexecd_init:
    /g/g0/garlick/proj/flux-core-v0.11/src/modules/wreck/lua.d/spectrum.lua:17:
    attempt to concatenate local 'val' (a nil value)
FATAL ERROR: dlsym PAMI_CUDA_RegisterPAMIContexts: ./hello: undefined symbol:
    PAMI_CUDA_RegisterPAMIContexts
FATAL ERROR: dlsym PAMI_CUDA_RegisterPAMIContexts: ./hello: undefined symbol:
    PAMI_CUDA_RegisterPAMIContexts
0: completed MPI_Init in 0.150s.  There are 2 tasks
0: completed first barrier in 0.000s
0: completed MPI_Finalize in 0.030s

Flux was started locally on a login node (lassen708), I have a .notce environment, and this was run from a source build that git describes as v0.11.1.

dongahn commented 5 years ago

My guess is that this is because Spectrum MPI dlopens libpami_cudahook.so. I suspect you can avoid this error by setting LD_PRELOAD to the path of libpami_cudahook.so. With Flux's current Spectrum MPI support this shared object won't actually be used, so this should be safe in theory. You should be able to find the libpami_cudahook.so path by looking at the environment of an MPI program launched under jsrun.
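
For example, something along these lines might work (a hedged sketch; the grep pattern and the path are guesses, so check what jsrun actually reports on your system):

$ jsrun -n 1 env | grep -i preload
# if that reports, e.g., /opt/ibm/spectrum_mpi/lib/libpami_cudahook.so, then:
$ LD_PRELOAD=/opt/ibm/spectrum_mpi/lib/libpami_cudahook.so \
    flux wreckrun -ompi=spectrum -n2 ./hello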

Without getting into too much detail, this is an ugly optimization technique that IBM used to allow their MPI to send buffers allocated by CUDA memory allocation routines. The interception of CUDA driver calls is achieved by wrapping dlsym in libpami_cudahook.so, which is preloaded into each MPI process. But this has had lots and lots of issues, not the least of which is compatibility with both performance and debugging tools.

This will have to be revisited when @rountree finishes up his PMIx work, since PAMI will require this to be set correctly, and we will want support for tools at that point as well. I remember you could get good mileage by putting libpami_cudahook.so as the last path in LD_PRELOAD.
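
Concretely, that would look something like this (a sketch, not a verified recipe; the hook is appended so any tool preloads stay ahead of it):

$ export LD_PRELOAD="${LD_PRELOAD:+$LD_PRELOAD:}/opt/ibm/spectrum_mpi/lib/libpami_cudahook.so"
$ flux wreckrun -ompi=spectrum -n2 ./hello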

SteVwonder commented 5 years ago

Hmmm. This one baffles me. The spectrum.lua plugin does prepend /opt/ibm/spectrum_mpi/lib/libpami_cudahook.so to LD_PRELOAD [source code], and that file seems to exist:

→ stat /opt/ibm/spectrum_mpi/lib/libpami_cudahook.so  
  File: ‘/opt/ibm/spectrum_mpi/lib/libpami_cudahook.so’ -> ‘libpami_cudahook.so.1’
  Size: 21          Blocks: 0          IO Block: 65536  symbolic link
Device: 901h/2305d  Inode: 6357621     Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    1/     bin)   Gid: (    1/     bin)
Access: 2019-05-14 23:42:50.842844635 -0700
Modify: 2019-02-12 13:13:21.741949852 -0800
Change: 2019-02-12 13:13:21.741949852 -0800
 Birth: -
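
One way to double check what the tasks actually end up with (a sketch; assuming wreckrun will run an arbitrary command like printenv under the spectrum plugin) would be:

$ flux wreckrun -ompi=spectrum -n1 printenv LD_PRELOAD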

"I have a .notce environment"

I wonder if this has something to do with it. What happens if you run module use /usr/tcetmp/modulefiles/Core, then module load StdEnv, and then start your login-node Flux instance and wreckrun again? That should pull Spectrum MPI, the XL compiler, and most importantly CUDA into your environment:

→ module show StdEnv
<snip>
load("xl")
load("spectrum-mpi/rolling-release")
load("cuda")
dongahn commented 5 years ago

Hmmm. I think we need to find out who defines PAMI_CUDA_RegisterPAMIContexts. From the symbol name, it looks like the PAMI library itself or one of its dependencies. Perhaps running nm over the libraries in the Spectrum MPI directory would suggest something?
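
Something like the following might narrow it down (a sketch; the library directory is an assumption based on the path above):

$ for lib in /opt/ibm/spectrum_mpi/lib/*.so*; do
>   nm -D --defined-only "$lib" 2>/dev/null | grep -q PAMI_CUDA_RegisterPAMIContexts && echo "$lib"
> done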