NVIDIA / enroot

A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
Apache License 2.0

`enroot-nsenter: failed to create user namespace: Invalid argument` with a user that has a large uid number #195

Closed susumuota closed 5 months ago

susumuota commented 5 months ago

Hello! Thanks for making a great framework!

I've recently been trying to run Nemotron-4-340B-Instruct, and I found an issue with enroot-nsenter.

I originally hit the error in a Slurm script that runs nemo_inference.sh, but I simplified the setup to reproduce it. I get the same error, enroot-nsenter: failed to create user namespace: Invalid argument, when I run enroot start as a user with a large uid (e.g. 4147706236).

enroot import docker://ubuntu
enroot create ubuntu.sqsh
enroot start ubuntu
enroot-nsenter: failed to create user namespace: Invalid argument

I also ran it under strace to debug it.

strace -o tmp.log enroot start ubuntu
enroot-nsenter: failed to create user namespace: Invalid argument
cat tmp.log

I found that my uid 4147706236 had been converted to the negative number -147261060! As a result, the call write(3, "-147261060 -147261060 1", 23) on /proc/self/gid_map returned -1 EINVAL (Invalid argument), because the ID written to the map is negative.

execve("/usr/bin/enroot-nsenter", ["enroot-nsenter", "--user", "--mount", "/usr/bin/bash", "--norc", "-o", "braceexpand", "-o", "errexit", "-o", "hashall", "-o", "interactive-comments", "-o", "nounset", "-o", "pipefail", "-O", "checkwinsize", "-O", "cmdhist", "-O", "complete_fullquote", "-O", "extquote", "-O", "force_fignore", "-O", "globasciiranges", "-O", "hostcomplete", "-O", ...], 0x5590c45b8230 /* 203 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x145e10004b58) = 0
set_tid_address(0x145e10004d4c)         = 196522
prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, CAP_CHOWN, 0, 0) = 0
getegid()                               = 4147706236
getegid()                               = 4147706236
brk(NULL)                               = 0x5555555ee000
brk(0x5555555f0000)                     = 0x5555555f0000
mmap(0x5555555ee000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x5555555ee000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x145e0fff5000
geteuid()                               = 4147706236
geteuid()                               = 4147706236
unshare(CLONE_NEWUSER)                  = 0
open("/proc/self/setgroups", O_WRONLY|O_LARGEFILE) = 3
write(3, "deny", 4)                     = 4
close(3)                                = 0
open("/proc/self/gid_map", O_WRONLY|O_LARGEFILE) = 3
write(3, "-147261060 -147261060 1", 23) = -1 EINVAL (Invalid argument)
close(3)                                = 0
munmap(0x145e0fff5000, 4096)            = 0
writev(2, [{iov_base="enroot-nsenter: ", iov_len=16}, {iov_base=NULL, iov_len=0}], 2) = 16
writev(2, [{iov_base="failed to create user namespace", iov_len=31}, {iov_base=NULL, iov_len=0}], 2) = 31
writev(2, [{iov_base="", iov_len=0}, {iov_base=": ", iov_len=2}], 2) = 2
writev(2, [{iov_base="", iov_len=0}, {iov_base="Invalid argument", iov_len=16}], 2) = 16
writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2) = 1
exit_group(1)                           = ?
+++ exited with 1 +++

I realized that my uid was converted from uint32 to int32!

python
>>> import numpy as np
>>> np.uint32(4147706236).astype(np.int32)
-147261060
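
The same wraparound can be checked without NumPy. The following is a minimal sketch using only the standard library; it assumes the ID is formatted as a signed 32-bit integer before being written to the map file, which is what the strace output suggests. The variable names are illustrative and are not taken from the enroot source.

```python
import ctypes

uid = 4147706236  # the uid from this report

# Reinterpret the same 32 bits as a signed integer. This mirrors what
# happens when an unsigned 32-bit ID is printed with a signed format.
as_signed = ctypes.c_int32(uid).value

# Map lines have the form "inside-id outside-id count".
signed_line = f"{as_signed} {as_signed} 1"    # matches the failing write()
unsigned_line = f"{uid} {uid} 1"              # what an unsigned format would give

print(as_signed)
print(signed_line, len(signed_line))
print(unsigned_line)
```

The signed line is 23 bytes long, which matches the length argument of the failing write() in the trace, and adding 2**32 to the negative value gives back the original uid.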

Unfortunately, I can't change my uid to a smaller number in my environment for server administration reasons. Could you please fix the code around here so that it handles uint32 uids properly?

https://github.com/NVIDIA/enroot/blob/c7dc4a6b66e817af5ceb7d5315850696067b3f80/bin/common.h#L98
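
To check whether other accounts could hit the same problem, here is a rough sketch of a diagnostic. It assumes that any 32-bit ID above 2**31 - 1 becomes negative when treated as int32, as happened with my uid; the helper name wraps_negative is hypothetical and not part of enroot.

```python
import os

INT32_MAX = 2**31 - 1
UINT32_MAX = 2**32 - 1


def wraps_negative(host_id: int) -> bool:
    """Return True if a 32-bit ID would turn negative when read as int32."""
    return INT32_MAX < host_id <= UINT32_MAX


# Report the effective uid and gid of the current process (Unix only).
for name, value in (("euid", os.geteuid()), ("egid", os.getegid())):
    status = "would wrap to a negative int32" if wraps_negative(value) else "fits in int32"
    print(f"{name}={value}: {status}")
```

For my account, both the effective uid and gid are 4147706236, so both would be reported as wrapping.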

My environment is as follows:

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal
apt list --installed | grep enroot
enroot+caps/now 3.4.1-1 amd64 [installed,local]
enroot/now 3.4.1-1 amd64 [installed,local]

The output of enroot-check --verify looks OK:

./enroot-check_3.4.1_x86_64.run --verify
Kernel version:

Linux version 5.15.0-1008-gcp-tcpx (buildd@lcy02-amd64-012) (gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #8-Ubuntu SMP Mon Apr 1 20:09:56 UTC 2024

Kernel configuration:

CONFIG_NAMESPACES                 : OK
CONFIG_USER_NS                    : OK
CONFIG_SECCOMP_FILTER             : OK
CONFIG_OVERLAY_FS                 : OK (module)
CONFIG_X86_VSYSCALL_EMULATION     : OK
CONFIG_VSYSCALL_EMULATE           : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_NATIVE            : KO (required if glibc <= 2.13)

Kernel command line:

vsyscall=native                   : KO (required if glibc <= 2.13)
vsyscall=emulate                  : KO (required if glibc <= 2.13)

Kernel parameters:

kernel.unprivileged_userns_clone  : OK
user.max_user_namespaces          : OK
user.max_mnt_namespaces           : OK

Extra packages:

nvidia-container-cli              : OK

But enroot-check itself failed with the same error message.

./enroot-check_3.4.1_x86_64.run
Extracting [####################] 100%
enroot-nsenter: failed to create user namespace: Invalid argument

susumuota commented 5 months ago

PR #197 has been merged. Thank you, @3XX0!