Actually I got an error with a Slurm script to run nemo_inference.sh, but I simplified the situation to reproduce the same issue. I get the same error, "enroot-nsenter: failed to create user namespace: Invalid argument", when I run enroot start as a user with a large uid (e.g. 4147706236).
enroot import docker://ubuntu
enroot create ubuntu.sqsh
enroot start ubuntu
enroot-nsenter: failed to create user namespace: Invalid argument
I also tried strace to debug it.
strace -o tmp.log enroot start ubuntu
enroot-nsenter: failed to create user namespace: Invalid argument
cat tmp.log
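The strace log can be long, so a small filter helps to find the failing call. The following is only a sketch: the helper name einval_calls is made up for illustration, and it assumes the default strace line layout (optionally prefixed by a pid when -f is used).

```python
import re

# Matches strace lines for syscalls that returned -1 with EINVAL, for example:
#   write(3, "-147261060 -147261060 1", 23) = -1 EINVAL (Invalid argument)
# An optional leading pid column (as printed by `strace -f`) is tolerated.
EINVAL_RE = re.compile(r"^(?:\d+\s+)?(?P<call>\w+)\(.*\)\s+=\s+-1 EINVAL\b")


def einval_calls(log_text):
    """Return (syscall name, full line) for every EINVAL failure in a strace log."""
    hits = []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = EINVAL_RE.match(line)
        if match:
            hits.append((match.group("call"), line))
    return hits


# Usage with the log written by `strace -o tmp.log ...`:
#   with open("tmp.log") as f:
#       for name, line in einval_calls(f.read()):
#           print(line)
```

Running it over tmp.log should surface the failing write without reading the whole log by hand.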
I found that my uid 4147706236 had been converted to the negative number -147261060!
As a result, write(3, "-147261060 -147261060 1", 23) = -1 EINVAL (Invalid argument) failed because of the negative uid.
I realized that my uid was converted from uint32 to int32 (4147706236 - 2^32 = -147261060)!
python
>>> import numpy as np
>>> np.uint32(4147706236).astype(np.int32)
-147261060
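The same wraparound can be reproduced with only the standard library, together with the mapping line seen in the strace output. This is a sketch, not enroot's actual code: the function names are hypothetical, and it assumes the uid passes through a signed 32-bit integer before being formatted as "inside-id outside-id count" for a single id.

```python
import struct


def reinterpret_as_int32(value):
    """Reinterpret the bits of an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def id_map_line(uid, signed=False):
    """Build an "inside-id outside-id count" mapping line for one id.

    signed=True mimics the suspected bug (uid stored in an int32);
    signed=False keeps the uid as the unsigned value the kernel expects.
    """
    shown = reinterpret_as_int32(uid) if signed else uid
    return f"{shown} {shown} 1"


uid = 4147706236
print(reinterpret_as_int32(uid))      # -147261060
print(id_map_line(uid, signed=True))  # -147261060 -147261060 1  (23 bytes, as in the strace output)
print(id_map_line(uid))               # 4147706236 4147706236 1
```

The signed variant reproduces exactly the 23-byte string that the failing write received, which supports the uint32 to int32 explanation.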
./enroot-check_3.4.1_x86_64.run --verify
Kernel version:
Linux version 5.15.0-1008-gcp-tcpx (buildd@lcy02-amd64-012) (gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #8-Ubuntu SMP Mon Apr 1 20:09:56 UTC 2024
Kernel configuration:
CONFIG_NAMESPACES : OK
CONFIG_USER_NS : OK
CONFIG_SECCOMP_FILTER : OK
CONFIG_OVERLAY_FS : OK (module)
CONFIG_X86_VSYSCALL_EMULATION : OK
CONFIG_VSYSCALL_EMULATE : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_NATIVE : KO (required if glibc <= 2.13)
Kernel command line:
vsyscall=native : KO (required if glibc <= 2.13)
vsyscall=emulate : KO (required if glibc <= 2.13)
Kernel parameters:
kernel.unprivileged_userns_clone : OK
user.max_user_namespaces : OK
user.max_mnt_namespaces : OK
Extra packages:
nvidia-container-cli : OK
But enroot-check itself failed with the same error message.
./enroot-check_3.4.1_x86_64.run
Extracting [####################] 100%
enroot-nsenter: failed to create user namespace: Invalid argument
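To tell whether a given account would be affected, it should be enough to check if its uid fits in a signed 32-bit integer. A minimal sketch, assuming the int32 conversion described above is the cause (the helper name is made up):

```python
import os

INT32_MAX = 2**31 - 1   # largest value a signed 32-bit integer can hold
UINT32_MAX = 2**32 - 1  # largest valid 32-bit uid value


def wraps_negative_as_int32(uid):
    """Return True if the uid would turn negative when stored in an int32."""
    if not 0 <= uid <= UINT32_MAX:
        raise ValueError(f"uid out of uint32 range: {uid}")
    return uid > INT32_MAX


print(wraps_negative_as_int32(4147706236))  # True: this is the uid from this report
print(wraps_negative_as_int32(1000))        # False: a typical small uid
print(os.getuid(), wraps_negative_as_int32(os.getuid()))  # the current user
```

Any uid of 2147483648 or above falls into the affected range under this assumption.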
Hello! Thanks for making a great framework!
Recently I've been trying to run Nemotron-4-340B-Instruct and I found an issue with enroot-nsenter.
Unfortunately, I can't change my uid to a smaller number in my environment for server administration reasons. Could you please fix the code around here to handle uint32 uids properly?
https://github.com/NVIDIA/enroot/blob/c7dc4a6b66e817af5ceb7d5315850696067b3f80/bin/common.h#L98
My environment is
enroot-check --verify looks OK.