NVIDIA / enroot

A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
Apache License 2.0

Auto install sudo #144

Open matyro opened 1 year ago

matyro commented 1 year ago

Hi, in our environment, most if not all people need interactive access to the container. For this reason, remap_root needs to be disabled to allow ssh access, e.g. for IDEs. This is not a problem when images are built manually, but when an image is used directly from the registry, the following happens:

srun -c 1 -p GPU1 --gres=gpu:1 --container-image=nvcr.io/nvidia/cuda:12.0.0-devel-ubuntu22.04 --pty bash

I have no name!@ml2ran08:/$ apt install sudo
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  sudo
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 820 kB of archives.
After this operation, 2564 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 sudo amd64 1.9.9-1ubuntu2.1 [820 kB]
Fetched 820 kB in 0s (3820 kB/s)
Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/Debconf/Config.pm line 22.
debconf: delaying package configuration, since apt-utils is not installed
dpkg: error: requested operation requires superuser privilege
E: Sub-process /usr/bin/dpkg returned an error code (2)
I have no name!@ml2ran08:/$

It turns out that you can't install packages with normal user privileges, because most images don't have sudo installed. Is there an easier way, via a Slurm pre-script or an enroot hook, to automatically install sudo in all containers and/or give the user the appropriate privileges to do so?

Cheers, Dominik

3XX0 commented 1 year ago

sudo won't give you any extra privileges even if it were installed. Why do you have to run the container without remap_root? I didn't follow: are you running SSHD in the container?

matyro commented 1 year ago

My goal is that one or more users can log into (via ssh) an enroot container started by slurm.

I had read that remap root does not work with SSH: https://github.com/NVIDIA/pyxis/issues/85

In my own tests I got that error message as well. When installing ssh without remap root, I got the following error message:

Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/Debconf/Config.pm line 22.
debconf: delaying package configuration, since apt-utils is not installed
dpkg: error: requested operation requires superuser privilege
E: Sub-process /usr/bin/dpkg returned an error code (2)

And that's where I am now.

3XX0 commented 1 year ago

Yeah, that's not going to work. The whole point of Enroot (and Pyxis) is to have containers be a simple chroot for a given user, not a system container where multiple users can log in.

What you could do is start a separate job step in the same job, using the same container filesystem, with srun --overlap --jobid ... --container-name .... This way your SSHD job step is not remapped while the second one is, so you can install packages from the second step. That being said, the whole thing will still be limited to the user running the job.
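A minimal sketch of that two-step layout, under stated assumptions: the job ID 12345, the container name devbox, port 2222, and the openssh-server package are hypothetical placeholders, and the steps are ordered install-first here so that sshd exists before it is started. The script only prints the two srun command lines instead of running them. Whether sshd actually starts as an unprivileged user depends on its configuration (host keys, port), which is not covered here.

```shell
#!/usr/bin/env bash
# Sketch only: prints two srun job-step commands that share one named container.
set -eu

JOBID="${JOBID:-12345}"        # hypothetical: ID of an existing allocation
NAME="${NAME:-devbox}"         # hypothetical: container name shared by both steps
SSH_PORT="${SSH_PORT:-2222}"   # hypothetical: unprivileged port for sshd
IMAGE="nvcr.io/nvidia/cuda:12.0.0-devel-ubuntu22.04"

# Print one job-step command line.
# $1 is the Pyxis remap flag; the remaining arguments are appended as-is.
step() {
    remap_flag="$1"; shift
    printf '%s\n' "srun --overlap --jobid=$JOBID --container-name=$NAME $remap_flag $*"
}

# Step A: remapped to root, so dpkg is allowed to install packages.
install_step() {
    step --container-remap-root --container-image="$IMAGE" \
        apt-get install -y openssh-server
}

# Step B: same container filesystem, not remapped, runs sshd as the job's user.
sshd_step() {
    step --no-container-remap-root /usr/sbin/sshd -D -p "$SSH_PORT"
}

install_step
sshd_step
```

Both steps refer to the same --container-name, so packages installed in the remapped step should be visible to the unremapped one. Review the printed commands before running them by hand inside your own allocation.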

matyro commented 1 year ago

One user would be fine for now; this would allow them to connect their tools / IDE to the container. If job steps are the way to go, I will need to check whether they are scriptable, so that I can force dedicated ports for SSH.

matyro commented 1 year ago

Is it possible to switch, inside a container started with "--container-remap-root", from the root user to a normal user, or the other way around with "--no-container-remap-root"? That would allow the user to install packages but run ssh at user level without restarting the container.

3XX0 commented 1 year ago

No, it needs a different process. Using multiple srun invocations with --container-name is pretty much free, though.