NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

make deb undefined symbol: slurm_spank_log #35

Closed: microbioticajon closed this issue 3 years ago

microbioticajon commented 3 years ago

Hi Guys,

I am trying to build and install pyxis on my Ubuntu 20.04-based machine and am running into issues.

I have the following apt packages installed: slurm-wlm libslurm-dev devscripts debhelper

I built pyxis v0.9.1 as a deb package and installed it as per the installation instructions. However, once I do this, slurmd refuses to start:

sudo systemctl restart slurmd
Job for slurmd.service failed because the control process exited with error code.
See "systemctl status slurmd.service" and "journalctl -xe" for details.

/var/log/slurm-llnl/slurmd.log

...
[2021-02-18T00:15:28.112] error: plugin_load_from_file: dlopen(/usr/lib/x86_64-linux-gnu/slurm/spank_pyxis.so): /usr/lib/x86_64-linux-gnu/slurm/spank_pyxis.so: undefined symbol: slurm_spank_log
[2021-02-18T00:15:28.112] error: spank: /usr/lib/x86_64-linux-gnu/slurm/spank_pyxis.so: Dlopen of plugin file failed
[2021-02-18T00:15:28.112] error: spank: /etc/slurm-llnl/plugstack.conf.d/pyxis.conf:1: Failed to load plugin /usr/lib/x86_64-linux-gnu/slurm/spank_pyxis.so. Aborting.
[2021-02-18T00:15:28.112] error: slurmd initialization failed

I have also tried building from source, and I get the following warning:

sudo make  clean install
rm -rf common.o args.o pyxis_slurmstepd.o pyxis_slurmd.o pyxis_srun.o pyxis_dispatch.o config.o enroot.o common.d args.d pyxis_slurmstepd.d pyxis_slurmd.d pyxis_srun.d pyxis_dispatch.d config.d enroot.d spank_pyxis.so
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF common.d -c common.c
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF args.d -c args.c
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF pyxis_slurmstepd.d -c pyxis_slurmstepd.c
pyxis_slurmstepd.c: In function ‘enroot_container_create’:
pyxis_slurmstepd.c:528:3: warning: implicit declaration of function ‘slurm_spank_log’; did you mean ‘slurm_spank_exit’? [-Wimplicit-function-declaration]
  528 |   slurm_spank_log("pyxis: importing docker image ...");
      |   ^~~~~~~~~~~~~~~
      |   slurm_spank_exit
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF pyxis_slurmd.d -c pyxis_slurmd.c
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF pyxis_srun.d -c pyxis_srun.c
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF pyxis_dispatch.d -c pyxis_dispatch.c
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF config.d -c config.c
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic  -D_GNU_SOURCE -D_FORTIFY_SOURCE=2  -MMD -MF enroot.d -c enroot.c
cc -shared -Wl,-znoexecstack -Wl,-zrelro -Wl,-znow  -o spank_pyxis.so spank_pyxis.lds common.o args.o pyxis_slurmstepd.o pyxis_slurmd.o pyxis_srun.o pyxis_dispatch.o config.o enroot.o
strip --strip-unneeded -R .comment spank_pyxis.so
install -d -m 755 /usr/local/lib/slurm
install -m 644 spank_pyxis.so /usr/local/lib/slurm
install -d -m 755 /usr/local/share/pyxis
echo 'required /usr/local/lib/slurm/spank_pyxis.so' | install -m 644 /dev/stdin /usr/local/share/pyxis/pyxis.conf

I'm not sure where to go from here :-/

Best, Jon

lukeyeager commented 3 years ago

You need Slurm >= 20.02. From the wiki:

Requirements: Slurm version 20.02 or later. https://github.com/NVIDIA/pyxis/wiki/Installation
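A quick way to check this requirement before building is to compare the installed Slurm version against 20.02. The following is a minimal sketch, not part of pyxis: the `version_ge` helper is a hypothetical name introduced here, it relies on GNU `sort -V`, and it assumes `slurmd -V` prints a line of the form `slurm <version>`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="20.02"

# Assumption: `slurmd -V` prints "slurm <version>", so the second field is the version.
if command -v slurmd >/dev/null 2>&1; then
    installed="$(slurmd -V | awk '{print $2}')"
    if version_ge "$installed" "$required"; then
        echo "Slurm $installed meets the $required requirement"
    else
        echo "Slurm $installed is older than $required"
    fi
else
    echo "slurmd not found in PATH"
fi
```

If the script reports an older version, the `undefined symbol: slurm_spank_log` error above is the expected outcome of loading a recent pyxis build.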

flx42 commented 3 years ago

Indeed, the latest releases require Slurm 20.02, but if you want to test pyxis quickly you can compile the 0.7.0 tag, which should work with older versions of Slurm.
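To decide between a recent release and the older tag, one option is to check whether the installed SPANK header declares `slurm_spank_log` at all, since the compiler warning in the original report suggests it does not on older Slurm. This is only a sketch: `header_declares` and the `SPANK_HEADER` variable are hypothetical names, and the default header path is an assumption about where `libslurm-dev` installs `spank.h`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if file $2 contains symbol $1 as a whole word.
header_declares() {
    grep -qw -- "$1" "$2"
}

# Assumed header location; override with SPANK_HEADER if it lives elsewhere.
header="${SPANK_HEADER:-/usr/include/slurm/spank.h}"

if [ -r "$header" ]; then
    if header_declares slurm_spank_log "$header"; then
        echo "slurm_spank_log is declared in $header"
    else
        echo "slurm_spank_log is not declared in $header; try the older tag or upgrade Slurm"
    fi
else
    echo "header not found: $header"
fi
```

A text match on the header is a rough check only; the version requirement stated in the wiki remains the authoritative criterion.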

microbioticajon commented 3 years ago

... that was the one thing I didn't check ...

Many thanks, @lukeyeager and @flx42, for the fast reply. Apologies for generating noise.