NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

sshd inside pyxis: chown(/dev/pts/0, 1000, 5) failed: Invalid argument #44

Closed by flx42 3 years ago

flx42 commented 3 years ago

Description

@lstuber and @3XX0 reported the following issue when trying to ssh to an openssh server running inside a pyxis container:

$ ssh -p 2222 localhost
Connection to localhost closed by remote host.
Connection to localhost closed.

From the terminal running sshd:

$ srun --container-image=ubuntu --container-name=ubuntu sh -c 'apt-get update && apt-get install -y openssh-server'
$ srun --container-name=ubuntu --no-container-remap-root sshd -d -p 2222
[...]
debug1: Allocating pty.
debug1: session_new: session 0
debug1: SELinux support disabled
chown(/dev/pts/0, 1000, 5) failed: Invalid argument
debug1: do_cleanup
[...]

sshd verifies it can drop privileges, so --no-container-remap-root is required.

Root cause

After allocating a new pty, sshd tries to chown it for the target user. This doesn't work under pyxis because we are inside a user namespace with only one mapped group, so the tty group (GID 5 in the chown call logged above) isn't available. enroot doesn't have this problem because it always applies the seccomp filter to fake those calls, whereas pyxis only applies this filter when remapping the user as root.
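
To check which groups are mapped, you can inspect /proc/self/gid_map from inside the container. The following is a minimal sketch, not part of pyxis or enroot: gid_is_mapped is a hypothetical helper name, and it assumes the standard gid_map line format (first ID inside the namespace, first ID outside, range length) described in user_namespaces(7).

```shell
# gid_is_mapped GID [MAP_FILE]   (hypothetical helper, for illustration only)
# Returns 0 if GID falls inside one of the ranges listed in MAP_FILE
# (default: /proc/self/gid_map), and 1 otherwise.
gid_is_mapped() {
    gid=$1
    map=${2:-/proc/self/gid_map}
    # Each line reads: <first ID inside ns> <first ID outside ns> <range length>
    awk -v gid="$gid" '
        gid >= $1 && gid < $1 + $3 { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$map"
}

# Example: report whether the tty group (GID 5 in the sshd log) is mapped.
if gid_is_mapped 5; then
    echo "gid 5 is mapped in this user namespace"
else
    echo "gid 5 is not mapped; chown to it is expected to fail with EINVAL"
fi
```

Run with srun --container-name=ubuntu --no-container-remap-root, this would be expected to take the second branch if the map contains a single entry for the user's own group, which matches the chown failure shown in the sshd debug output.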

Workaround

Use srun enroot start pyxis_ubuntu /usr/sbin/sshd -d -p 2222 instead.

Fix

Always apply the seccomp filter in pyxis, as enroot does.