NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

Is there a single container instance per allocated node? #141

Closed: ocaisa closed this issue 3 weeks ago

ocaisa commented 3 weeks ago

It's not clear to me from the documentation whether the plugin runs a single container instance per allocated node. I ask because we have a container that uses CernVM-FS and typically requires options like:

export SINGULARITY_BIND="${TMPDIR}/var-run-cvmfs:/var/run/cvmfs,${TMPDIR}/var-lib-cvmfs:/var/lib/cvmfs"

When running an MPI program with Singularity in this setup, we have had to do some gymnastics to give each MPI process its own bind directories, since that is what CVMFS expects (with containers, each MPI process ends up with its own CVMFS instance). If there is guaranteed to be only a single container instance per allocated node, MPI codes would not need this workaround.
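For readers unfamiliar with the workaround, here is a minimal sketch of what per-process binds could look like. It is not the exact script used here: the helper name `cvmfs_bind_for_rank` and the `rank-N` directory layout are hypothetical, and it assumes the job runs under Slurm, which sets `SLURM_LOCALID` to the node-local task ID.

```shell
# Hypothetical helper: create per-rank CVMFS state directories under a base
# directory and print a matching SINGULARITY_BIND value.
cvmfs_bind_for_rank() {
    _base="$1"   # node-local scratch directory, e.g. ${TMPDIR}
    _rank="$2"   # identifier that is unique per process on the node
    _run_dir="${_base}/rank-${_rank}/var-run-cvmfs"
    _lib_dir="${_base}/rank-${_rank}/var-lib-cvmfs"
    mkdir -p "${_run_dir}" "${_lib_dir}" || return 1
    printf '%s:/var/run/cvmfs,%s:/var/lib/cvmfs' "${_run_dir}" "${_lib_dir}"
}

# Assumption: TMPDIR is node-local, so the node-local task ID is enough.
# With a shared TMPDIR, SLURM_PROCID (the global rank) would be the safer key.
SINGULARITY_BIND="$(cvmfs_bind_for_rank "${TMPDIR:-/tmp}" "${SLURM_LOCALID:-0}")"
export SINGULARITY_BIND
```

Each MPI process would have to run this before starting its own Singularity container, which is the extra step a single container instance per node would make unnecessary.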

flx42 commented 3 weeks ago

Yes, by design all ranks on the same node run inside the same container: same root filesystem, same user and mount namespaces.
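One way to observe this is to have every rank print its namespace identifiers and compare the lines coming from the same node. The following is only a sketch: the function name `report_namespaces` is made up for illustration, and it assumes a Linux `/proc` filesystem is available inside the container.

```shell
# Hypothetical check: print the node name, the Slurm rank, and the mount and
# user namespace identifiers of the calling process.
report_namespaces() {
    _mnt="$(readlink /proc/self/ns/mnt 2>/dev/null || echo unavailable)"
    _usr="$(readlink /proc/self/ns/user 2>/dev/null || echo unavailable)"
    printf 'node=%s rank=%s mnt=%s user=%s\n' \
        "$(uname -n)" "${SLURM_PROCID:-none}" "${_mnt}" "${_usr}"
}

# Possible usage, assuming the function plus a final call to it are saved in
# a script that is visible inside the container (image name is a placeholder):
#   srun --ntasks-per-node=2 --container-image=<image> sh ./report_namespaces.sh
```

If ranks on the same node share one container, their `mnt=` and `user=` values should be identical.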

ocaisa commented 3 weeks ago

Great, thanks!