NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

pyxis plug-in components do not take effect #95

Closed xinyx62 closed 1 year ago

xinyx62 commented 1 year ago

Hi, I installed Slurm on Ubuntu with these commands:
$ ./configure --prefix=/opt/slurm/22.05.5 --sysconfdir=/opt/slurm/22.05.5/etc
$ make -j && make install
$ ldconfig -n /opt/slurm/22.05.5/lib/

Installing pyxis from the deb package succeeded. I then ran:
$ mkdir /opt/slurm/22.05.5/etc/plugstack.conf.d
$ ln -s /usr/share/pyxis/pyxis.conf /opt/slurm/22.05.5/etc/plugstack.conf.d/pyxis.conf

After that, slurmd failed to restart. Running "strace -e openat srun --help >/dev/null" shows:

image

I checked the pyxis library; the version is 0.14. image

Can you help with this problem?

flx42 commented 1 year ago

The incompatible Slurm plugin message is because you need to recompile pyxis against the spank.h version of Slurm you are going to use. This is due to a breaking change in Slurm 21.08: https://github.com/SchedMD/slurm/blob/slurm-21-08-8-2/RELEASE_NOTES#L119-L121

rvencu commented 1 year ago

Would you care to update the installation doc with an example of compiling correctly in this case?

flx42 commented 1 year ago

It's now mentioned in the README: https://github.com/NVIDIA/pyxis/commit/7b96031079bd9e419f01fd0398493a0690750d98 It's hard to give "an example of compiling correctly", though, as it will depend on your build environment. Let me know if you have a suggestion for documenting that.

rvencu commented 1 year ago

I found the header's path: /opt/slurm/include/slurm/spank.h. How do I specify it in the make command?

flx42 commented 1 year ago

This should work:

$ CFLAGS="-I /opt/slurm/include" make
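
A slightly more defensive variant of that command is sketched below. It is an illustration only: the helper name `slurm_cflags` is hypothetical, and it assumes the Slurm headers sit under `<prefix>/include/slurm/` (as in the path quoted above). It only checks that spank.h exists under the prefix before printing the include flag to pass to make.

```shell
#!/bin/sh
# Hypothetical helper (not part of pyxis or Slurm): print the CFLAGS needed
# to build against the Slurm headers under a given install prefix, or fail
# if spank.h cannot be found there.
slurm_cflags() {
    prefix="$1"
    if [ ! -f "$prefix/include/slurm/spank.h" ]; then
        echo "spank.h not found under $prefix/include/slurm" >&2
        return 1
    fi
    # Same form as the command above: -I followed by the include directory.
    printf '%s\n' "-I $prefix/include"
}

# Usage, run from the pyxis source tree (the prefix here is an assumption):
#   CFLAGS="$(slurm_cflags /opt/slurm)" make
```

If the header is missing, the helper prints an error and returns non-zero, so the mistake shows up before the build instead of as an incompatible-plugin failure when slurmd loads the plugin.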