NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

sbatch + pyxis issue #58

Closed scorsair closed 3 years ago

scorsair commented 3 years ago

Hello everyone!

I ran into an issue when using sbatch with pyxis support. srun works perfectly, but when I try to reproduce an sbatch example with containers I get: unrecognized option '--container-image'. What did I do wrong?

slurm-20.11.8, nvslurm-plugin-pyxis-0.11.1-1, CentOS 7

Regards and thanks in advance!

scorsair commented 3 years ago

Btw, here is my pyxis conf:

required /usr/lib64/slurm/spank_pyxis.so remap_root=1 execute_entrypoint=0 container_scope=global sbatch_support=1

scorsair commented 3 years ago

And I don't see any sbatch-related options in the library:

strings /usr/lib64/slurm/spank_pyxis.so  | grep -c sbatch
0

flx42 commented 3 years ago

Hello @scorsair,

Sorry, it's a bit surprising, but to release 0.11.1 I created a branch starting from tag 0.11.0, cherry-picked 0ac2d7073fb347f5db42be38554a91f25d5d4cd4 and tagged it as 0.11.1. This commit is important, but I didn't want to release 0.12.0 yet.

So, if you want sbatch support, you need to use the current master branch of the repository, which has commit 68333337b8e198e9c720b626c890a27f62c60645.
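
For anyone who wants to check an installed build before rebuilding, here is a minimal sketch that mirrors the strings check earlier in this thread. The function name has_sbatch_strings is hypothetical and not part of pyxis. It only looks for the literal text "sbatch" inside the plugin binary, so treat the result as a rough heuristic rather than proof that sbatch support is or is not compiled in.

```shell
#!/bin/sh
# Heuristic sketch: does a pyxis SPANK plugin binary contain the text "sbatch"?
# has_sbatch_strings is a hypothetical helper name, not something pyxis ships.
# Returns 0 if the text is found, 1 if it is not, 2 if the file is unreadable.
has_sbatch_strings() {
    plugin="$1"
    if [ ! -r "$plugin" ]; then
        echo "cannot read: $plugin" >&2
        return 2
    fi
    # -a makes grep treat the binary file as text; -q suppresses output
    # so that only the exit status reports whether a match was found.
    grep -a -q sbatch "$plugin"
}
```

Calling has_sbatch_strings /usr/lib64/slurm/spank_pyxis.so on the 0.11.1 package from this thread would be expected to return non-zero, consistent with the count of 0 reported above.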

scorsair commented 3 years ago

Thanks for the explanation! It really works in 0.11.0.

flx42 commented 3 years ago

Yes, it becomes a bit confusing if you create a deb package right now, as it will show version 0.11.0. If you install from sources, it will show 0.12.0-dev in the slurmd log, thanks to this change I made after seeing your issue: https://github.com/NVIDIA/pyxis/commit/0ceaf8bbeb96d8813c906a5911a871576430be8f