From another terminal, we can't exec into this existing container:
$ srun --overlap --jobid=${JOBID} --container-name=ubuntu --pty bash
slurmstepd: error: pyxis: failed to set 2222 [listener] 0 of 10-100 startups: Bad argument
slurmstepd: error: pyxis: couldn't read container environment
Root cause
pyxis notices that the container named ubuntu is already running, so it uses this PID for the namespaces and the environment.
However, sshd modifies the name of the running process. On BSD, setproctitle(3) is available, but on Linux sshd has to overwrite the memory holding argv and environ, so the procfs file /proc/<PID>/environ becomes invalid:
$ cat /proc/480286/environ
2222 [listener] 0 of 10-100 startupsp
So pyxis fails to import this file.
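To check whether a given process has clobbered its environment area, a rough sketch like the following can help. It assumes the format described in proc(5), where /proc/<PID>/environ holds NUL-separated NAME=value entries. The helper name invalid_environ_entries is hypothetical and not part of pyxis, and values containing newlines are not handled.

```shell
# Hypothetical helper, not part of pyxis: print every entry of a
# NUL-separated environ dump that is not of the form NAME=value.
# Assumption: no value contains an embedded newline.
invalid_environ_entries() {
    # Empty lines and NAME=value lines are filtered out; the rest is printed.
    tr '\0' '\n' < "$1" | grep -v -E '^([^=]+=.*)?$' || true
}

# Demo on two sample files instead of a live process.
tmpdir=$(mktemp -d)
printf 'PATH=/usr/bin\0HOME=/root\0' > "$tmpdir/ok"
printf '2222 [listener] 0 of 10-100 startupsp' > "$tmpdir/clobbered"

invalid_environ_entries "$tmpdir/ok"         # prints nothing
invalid_environ_entries "$tmpdir/clobbered"  # prints the sshd title string
rm -rf "$tmpdir"
```

On a real node, the argument would be /proc/<PID>/environ of the container process, read by a user who is allowed to access it.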
Workaround
Launch sshd below a sh process:
$ srun --container-name=ubuntu --no-container-remap-root sh -c '/usr/sbin/sshd -d -p 2222'
Description
@3XX0 reported the following issue:
From another terminal, we can't exec into this existing container:
Root cause
pyxis notices that the container named ubuntu is already running, so it uses this PID for the namespaces and the environment.
However, sshd modifies the name of the running process. On BSD setproctitle(3) is available, but on Linux it has to hack the content of argv and environ, so the procfs file /proc/<PID>/environ becomes invalid.
So pyxis fails to import this file.
Workaround
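Launch sshd below a sh process. It doesn't work with bash because of implicit execs: when given a single command with -c, bash replaces itself with that command, so sshd ends up running in place of the shell rather than below it.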
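The implicit exec can be observed outside of pyxis with a small experiment. This is only a sketch and assumes a Linux system with /proc mounted. The exact behavior depends on the bash version, and the experiment says nothing about which sh implementation is installed in the container.

```shell
# Print the name of the process that holds the PID bash was started with.
# With a single simple command, bash typically execs it without forking,
# so the PID that $$ expanded to now belongs to cat itself.
bash -c 'cat /proc/$$/comm'

# Appending a second command forces bash to fork for cat, so bash stays
# alive as the parent and keeps its own PID and name.
bash -c 'cat /proc/$$/comm; true'
```

If the first command reports cat rather than bash, the shell was replaced, which is the situation that defeats the workaround.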
Fix
Only use the existing PID for namespaces; always use a new enroot start to get the environment variables.