mfhepp commented 8 months ago

If a script wants to write to the current working directory on the host system, an obvious way is to use a bind mount to map a directory on the host to a directory inside the container.

Unfortunately, this does not work with Docker on Linux systems; the non-root mambauser cannot write to directories from bind mounts, no matter if we set the UID/GID to that of the user on the host or not:

# Minimal example (works on Docker Desktop on OSX)
$ docker run --rm -it -v "$(pwd):/tmp" \
   mambaorg/micromamba:1.5.6 /bin/bash
$ id
uid=57439(mambauser) gid=57439(mambauser) groups=57439(mambauser)
$ touch test.txt
touch: cannot touch 'test.txt': Permission denied
$ touch /tmp/test.txt
touch: cannot touch '/tmp/test.txt': Permission denied

# With using the host user's UID and GID
$ docker run --rm -it --user $UID:$GID -v "$(pwd):/home/mambauser"  \
    mambaorg/micromamba:1.5.6 /bin/bash
$ cd /home/mambauser/
$ echo test > test.md
bash: test.md: Permission denied
$ ls -la
total 8
drwxr-xr-x 2 root root 4096 Jan 11 03:15 .
drwxrwxrwx 3 root root 4096 Dec 30 15:30 ..
$ pwd
/home/mambauser

Writing to Docker bind volumes on Linux systems as non-root users is a well-known and complicated topic, but I wonder if there is an elegant way of adding the mambauser to the group that has write access to a bind mount point.

Or is there any other way of writing to a directory on the host from the mambauser?

Note: The issue does not appear on Docker Desktop for OSX, as the built-in VM maps between the host system and the Docker environment.


Addendum: Tested with micromamba 1.5.6 and Docker version 24.0.7, build afdd53b, on Debian 11.8.

mfhepp commented 8 months ago

I also tried building the Docker image locally with the host user UID, GID and username included in the build process as mycromamba, to no avail:

# Build micromamba-docker:1.5.6 locally
docker build . -t mycromamba --build-arg="MAMBA_USER=$USER" \
  --build-arg="MAMBA_USER_ID=$(id -u)" \
  --build-arg="MAMBA_USER_GID=$(id -g)"

$ docker run --rm -it --user $UID:$GID -v "$(pwd):/home/foobar/data" mycromamba /bin/bash
$ cd /home/foobar/data/
$ ls -la
total 8
drwxr-xr-x 2 root  root  4096 Jan 11 03:15 .
drwxrwxrwx 3 foobar foobar 4096 Jan 11 03:38 ..
# Writing to bind mount /home/foobar/data fails:
$ touch test.txt
touch: cannot touch 'test.txt': Permission denied
# Writing to user directory /home/foobar *inside the container* works (but is not bound to host)
$ cd ..
$ touch test.txt
$ ls
data  test.txt

wholtz commented 8 months ago

Works for me:

$ docker run -it --rm -v $(pwd):/home/mambauser  -u $(id -u):$(id -g) mambaorg/micromamba:1.5.6 /bin/bash -c 'touch /home/mambauser/foobar'
$ docker version
Client: Docker Engine - Community
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:32:16 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:32:16 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.24
  GitCommit:        61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc:
  Version:          1.1.9
  GitCommit:        v1.1.9-0-gccaecfc
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
$ cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

On the host, do you have write permission to the working directory where you are executing the docker run ... command?

this:

drwxr-xr-x 2 root  root  4096 Jan 11 03:15 .

seems to indicate that you do not have write permissions in that directory on the host.
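A quick way to check this on the host before starting the container is sketched below. The helper name check_bind_source is made up for illustration and is not part of Docker or micromamba-docker:

```shell
#!/bin/sh
# check_bind_source is a hypothetical helper, not part of Docker or micromamba-docker.
# It reports whether the current user may create files in the directory
# that is about to be bind-mounted (default: the current directory).
check_bind_source() {
  dir=${1:-.}
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "writable"
  else
    echo "not writable"
    return 1
  fi
}

# Show owner and mode of the current directory, then test write access.
ls -ld .
check_bind_source . || true
```

If this prints "not writable", the container user will not be able to write there either, regardless of the --user setting.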

wholtz commented 8 months ago

Just upgraded docker to the same version you tried, and I still cannot reproduce it when in a host directory where I have write access.

$ docker version
Client: Docker Engine - Community
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:08:17 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.10
  Git commit:       311b9ff
  Built:            Thu Oct 26 09:08:17 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.26
  GitCommit:        3dd1e886e55dd695541fdcd67420c2888645a495
 runc:
  Version:          1.1.10
  GitCommit:        v1.1.10-0-g18a0cb0
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

maresb commented 8 months ago

Works fine for me too:

$ docker run --rm -it -v "$(pwd):/tmp" --user $UID:$GID mambaorg/micromamba:1.5.6 /bin/bash
(base) I have no name!@9b2eefcffcd4:/tmp$ touch test && ls
test

mfhepp commented 8 months ago

Thanks for looking into this! Please note that I am using a rootless Docker installation on the host machine, as is best practice; this may have an effect.

For my py4docker project, I was able to solve the issue as follows:

Creating a dedicated, non-root and non-sudo user on the host machine. The user in my earlier experiments was a member of the sudo group.
My old user has a UID and GID of 1000, the new one has 1002. I think I read somewhere that only UIDs from 1001 onwards work for bind volume mounting.
Making sure that, when running the Docker daemon in rootless mode, the proper CLI context is set: docker context use rootless

Without a lot of effort, I cannot tell exactly if the UID of 1000 or the sudo group membership of the user on the host cause this behavior, but the cause is most likely one of the two. The CLI context is a requirement to access the Docker daemon, but does not fix the issue (I can still reproduce it by using the UID 1000 user with sudo rights).

Again, thanks for your time and support!

PS: The problem seems to have a long history, hence I add a link to resources and findings on the topic of Docker bind mounts here for future visitors to this issue.

mfhepp commented 8 months ago

> Works fine for me too:
>
> $ docker run --rm -it -v "$(pwd):/tmp" --user $UID:$GID mambaorg/micromamba:1.5.6 /bin/bash
> (base) I have no name!@9b2eefcffcd4:/tmp$ touch test && ls
> test

In this test, it is IMO not 100% certain if the write operation takes place inside the mount point or elsewhere (it depends on the WORKDIR). mambauser can write to many places inside the container, as long as the OS is not set to read-only. In my project, I take quite some effort to minimize persistent, harmful actions by the application code or its dependencies.

Edit: I checked in interactive mode; this indeed works if my user has neither sudo group membership nor a UID of 1000.

mfhepp commented 8 months ago

One small addendum:

Bash sets $UID but does not seem to set a $GID environment variable, so $UID:$GID expands to something like 1000: with an empty group part, which leads to a GID of 0 and root group membership.
So it is better to use --user $(id -u):$(id -g) than --user $UID:$GID:

With --user $(id -u):$(id -g):

docker run --rm -it -v "$(pwd):/tmp" --user $(id -u):$(id -g) mambaorg/micromamba:1.5.6 /bin/bash
(base) I have no name!@8b30d9644121:/tmp$ id
uid=1000 gid=1000 groups=1000

With --user $UID:$GID:

docker run --rm -it -v "$(pwd):/tmp" --user $UID:$GID mambaorg/micromamba:1.5.6 /bin/bash
(base) I have no name!@1433c5a8d78d:/tmp$ id
uid=1000 gid=0(root) groups=0(root)
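Bash defines UID (and EUID) as shell variables but has no GID counterpart, so $GID expands to nothing unless something else has set it. A small sketch that builds the argument from id instead; user_arg is a name chosen here for illustration only:

```shell
#!/bin/sh
# user_arg is a hypothetical helper: it prints "<uid>:<gid>" for the calling user,
# taken from id(1) rather than from shell variables that may be unset.
user_arg() {
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

# Print the command instead of running it, so the expansion can be inspected first.
echo docker run --rm -it -v "$(pwd):/tmp" --user "$(user_arg)" mambaorg/micromamba:1.5.6 /bin/bash
```

Printing the command first makes an empty group part (a trailing colon) easy to spot before the container is started.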

batterseapower commented 8 months ago

I also find that the --user approach does not allow me to access mounts in our rootless podman setup. However, --userns=keep-id:uid=1000,gid=1000 is a working alternative.

(Note that micromamba 1.5.0 and later use UID/GID 57439 instead of 1000, so you need to modify that to --userns=keep-id:uid=57439,gid=57439)

mfhepp commented 5 months ago

For others running into similar problems, this tool might be useful:

https://www.joyfulbikeshedding.com/blog/2023-04-20-cure-docker-volume-permission-pains-with-matchhostfsowner.html

mfhepp commented 5 months ago

Still working on the issue... and I think there is an underlying problem that surfaces when you are using micromamba-docker

with the plain Docker daemon (not Docker Desktop),
on a Linux machine
in rootless mode.

As far as I understand, the non-root mambauser conflicts with the way Docker in rootless mode deals with bind mounts and namespace mapping in general.

Docker in rootless mode utilizes rootlesskit, which maps user namespaces between the host system and the system inside the container.

See how rootlesskit changes the username, UID, and GID of resources owned by a local user:

# Create a test folder and file
# Default permissions set by umask
$ mkdir foo
$ cd foo
$ mkdir bar
$ touch mytext.txt
# Show permissions from the host user's perspective
$ ls -la
total 12
drwxr-xr-x  3 myusername myusername 4096 Apr 30 19:23 .
drwxr-xr-x 15 myusername myusername 4096 Apr 30 19:23 ..
drwxr-xr-x  2 myusername myusername 4096 Apr 30 19:23 bar
-rw-r--r--  1 myusername myusername    0 Apr 30 19:23 mytext.txt
$ id
uid=1000(myusername) gid=1000(myusername) groups=1000(myusername),996(docker)
# Now turning on RootlessKit, as used by Docker
# RootlessKit creates user_namespaces and executes newuidmap/newgidmap along with subuid and subgid
# https://github.com/rootless-containers/rootlesskit
$ rootlesskit bash
root@110:~/foo# ls -la
total 12
drwxr-xr-x  3 root root 4096 Apr 30 19:23 .
drwxr-xr-x 15 root root 4096 Apr 30 19:23 ..
drwxr-xr-x  2 root root 4096 Apr 30 19:23 bar
-rw-r--r--  1 root root    0 Apr 30 19:23 mytext.txt
root@110:~/foo# id
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
root@110:~/foo# exit

You can see that rootlesskit changes the owning user and user group from myusername to root and maps the UID from 1000 to 0.

This means that the non-root mambauser cannot access such files and folders, even if the UID/GID of the local user is passed to the container when Docker is run in the rootless mode (for which there are good reasons).

The exact mapping is determined by /etc/subuid and /etc/subgid.
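To see which subordinate range applies to a given host user, the relevant line can be pulled out of the file. A minimal sketch, assuming the usual name:start:count line format; subid_range is a hypothetical helper name:

```shell
#!/bin/sh
# subid_range USER [FILE]: print "start count" for USER from a subuid/subgid file.
# Hypothetical helper for illustration; FILE defaults to /etc/subuid.
# Exits non-zero if USER has no entry in FILE.
subid_range() {
  awk -F: -v user="$1" '
    $1 == user { print $2, $3; found = 1; exit }
    END { if (!found) exit 1 }
  ' "${2:-/etc/subuid}"
}

# Example calls (not executed here):
#   subid_range "$(id -un)"
#   subid_range "$(id -un)" /etc/subgid
```

Entries may also be keyed by numeric UID instead of login name, in which case the numeric ID would have to be passed as USER.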

I am reopening the issue, because

I have been unable to fix this despite considerable effort and
the WWW is full of unsatisfying solutions for the underlying problem.

It may be possible to fix this by

configuring /etc/subuid and /etc/subgid, or modifying
the _entrypoint.sh shell script or
the Dockerfile.

Any ideas will be very much appreciated!

PS: The test from earlier on fails now, too, since I changed my system to a rootless Docker configuration:

$ docker run --rm -it -v "$(pwd):/tmp" --user $(id -u):$(id -g) mambaorg/micromamba:1.5.8 /bin/bash
(base) I have no name!@a84cfa7e0ace:/tmp$ touch test
touch: cannot touch 'test': Permission denied
(base) I have no name!@a84cfa7e0ace:/tmp$ id
uid=1000 gid=1000 groups=1000
(base) I have no name!@a84cfa7e0ace:/tmp$ ls -la
total 12
drwxr-xr-x  3 root root 4096 Apr 30 19:23 .
drwxr-xr-x 17 root root 4096 Apr 30 19:54 ..
drwxr-xr-x  2 root root 4096 Apr 30 19:23 bar
-rw-r--r--  1 root root    0 Apr 30 19:23 mytext.txt

As shown above with rootlesskit alone, despite the indicated GID and UID being identical, the user namespace mechanism changes the owner to root from inside the container:

# Permissions from the host user's perspective
$ ls -la
total 12
drwxr-xr-x  3 myusername myusername 4096 Apr 30 19:23 .
drwxr-xr-x 15 myusername myusername 4096 Apr 30 19:23 ..
drwxr-xr-x  2 myusername myusername 4096 Apr 30 19:23 bar
-rw-r--r--  1 myusername myusername    0 Apr 30 19:23 mytext.txt
$ id
uid=1000(myusername) gid=1000(myusername) groups=1000(myusername),996(docker)

maresb commented 5 months ago

Wow, thanks @mfhepp for your persistence in getting to the bottom of this issue! This looks like it was quite some effort.

I have not yet used rootless mode myself (although I probably should), so this is a bit beyond my current comfort level. Thus I don't have a preference among your suggested approaches for a fix. Perhaps @wholtz has some thoughts?

mfhepp commented 5 months ago

Have been doing additional experiments today - here is my current summary:

Essentially, there are at least four solutions:

Option 1: Make mambauser root with --user root

I think the most pragmatic way to deal with this is to use --user root if, and only if

Docker is running in rootless mode
the host user is not root
the host user has no sudo rights.

Then the process inside the container will run as root (rather than as mambauser), but thanks to rootlesskit etc., this will be mapped to the host user running the container (e.g. UID 1001).

This should work and is IMO quite safe. The user is not root on the host.

A similar approach has been added to syzkaller.

Downsides:

mambauser has root access inside the container and could hence do quite some stuff (e.g. accessing other users or creating new ones). One potential risk is that it could create additional users inside the container that are crafted to be mapped onto existing ones on the host. Example: There is a user with uid=3000 with sudo rights on the host, and we create a new user with uid=1500 inside the container; if /etc/subuid allows mapping to uid 3000, then we can do what that user can do on the host. I did not test this, but think it is possible.
We need to make sure that the three conditions from above are safely met. If we run the same container on a system with a regular "root" Docker daemon and/or the host user has root privileges, it can do much damage.
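The three conditions could be checked before launching the container. The following is only a sketch: safe_to_run_as_root is a hypothetical function name, the rootless check relies on docker info listing name=rootless among its security options, and "sudo rights" is approximated here by membership in a group called sudo or wheel, which is an assumption that will not hold on every system:

```shell
#!/bin/sh
# safe_to_run_as_root SECURITY_OPTIONS UID GROUPS
#   SECURITY_OPTIONS: output of  docker info --format '{{.SecurityOptions}}'
#   UID:              numeric UID of the host user
#   GROUPS:           space-separated group names of the host user (id -nG)
# Returns 0 only if Docker is rootless, the user is not root,
# and the user is not in the sudo or wheel group (assumed proxy for sudo rights).
safe_to_run_as_root() {
  case "$1" in
    *name=rootless*) ;;
    *) return 1 ;;
  esac
  [ "$2" -ne 0 ] || return 1
  case " $3 " in
    *" sudo "*|*" wheel "*) return 1 ;;
  esac
  return 0
}

# Intended use (not executed here):
#   if safe_to_run_as_root "$(docker info --format '{{.SecurityOptions}}')" \
#        "$(id -u)" "$(id -nG)"; then
#     docker run --rm -it -v "$(pwd):/tmp" --user root mambaorg/micromamba:1.5.8 /bin/bash
#   fi
```

Keeping the decision in a function that only takes strings makes it possible to test the logic without a Docker daemon.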

Credits:

https://sthbrx.github.io/blog/2023/04/05/detecting-rootless-docker/ - thanks, @sthrbx!

In my py4docker project, I will likely add this to run_script.sh as a quick and portable fix with checks for the aforementioned conditions. As for micromamba-docker, it is probably best to add a section to the documentation. Again, the WWW is full of desperate ;-) people facing similar problems with volume mounts, non-root users, and rootless Docker ;-). My fear is most of them will simply give up and run everything as root :-(.

See also my answer in https://stackoverflow.com/a/78412890/516699.

Option 2: Mimic --userns=keep-id on Docker by tweaking /etc/subuid and using a dedicated non-root user on the host (likely the best approach, not yet tested)

I think I do now understand what happens when rootlesskit uses /etc/subuid (and ...gid for the group accordingly):

root@container -> user@host: UID 0 inside the container is mapped to the uid of the user running the container on the host (Example: your UID is
UIDs 1 ...@container are mapped based on the settings in /etc/subuid for the host user.

Example: /etc/subuid contains

user:10000:65536

This says that for the user named user on the host, all container UIDs > 0 can / will be mapped to the 65536 host UIDs starting at 10000 (that is, 10000 to 75535), like so

UID on container --> UID on host
1 -> 10000
2 -> 10001
...

You can debug the mapping rules from container to host like so:

$ rootlesskit bash
# cat /proc/self/uid_map 
         0       1000          1
         1     100000      65536
# cat /proc/self/gid_map 
         0       1000          1
         1     100000      65536
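Each line of uid_map reads "first ID inside, first ID outside, count". Translating a single container UID by hand is error-prone, so here is a small sketch that does the lookup; map_uid is a hypothetical helper name, not a RootlessKit or Docker command:

```shell
#!/bin/sh
# map_uid CONTAINER_UID: read uid_map-style lines ("inside outside count") on stdin
# and print the host UID that CONTAINER_UID corresponds to.
# Hypothetical helper for illustration; exits non-zero if the UID is unmapped.
map_uid() {
  awk -v uid="$1" '
    { lo = $1 + 0; host = $2 + 0; n = $3 + 0 }
    uid + 0 >= lo && uid + 0 < lo + n { print host + (uid - lo); found = 1; exit }
    END { if (!found) exit 1 }
  '
}

# Inside "rootlesskit bash" one could run, for example:
#   map_uid 57439 < /proc/self/uid_map
#   map_uid 57439 < /proc/self/gid_map   # same format for groups
```

With the map shown above, container UID 0 resolves to the host user's own UID and every other UID resolves into the subordinate range.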

Possible Solution:

Create a non-root user mambauser_host on the host for running rootless Docker and the micromamba-docker image with a UID of 57439 (the internal UID of mambauser). It could be any other sufficiently large UID, but this way it's more transparent.
```
sudo groupadd -g 57439 mambauser_host
sudo useradd mambauser_host -u 57439 -g 57439 -m -s /bin/bash
sudo passwd mambauser_host
```
Install and start rootless Docker for this user.
Craft /etc/subuid and /etc/subgid so that the UID and GID of mambauser inside the container will map to those on the host.

This requires a bit of thinking ;-): We want the UID on the host to be 57439. The UID inside the container should be > 1000, e.g. 1002. Container UID 1 maps to the first subordinate ID, so container UID n maps to start + n - 1. In order to map 1002 in the container to 57439 on the host for the user mambauser_host, the range has to start at 57439 - 1001 = 56438, so we need to add

mambauser_host:56438:1002

to /etc/subuid and /etc/subgid on the host system (not your workstation if using ssh!):

sudo nano /etc/subuid
# Insert this line
mambauser_host:56438:1002
# Save and exit
sudo nano /etc/subgid
# Insert this line
mambauser_host:56438:1002
# Save and exit
# Restart system for the changes to take effect
# There might be a more elegant way

Notes:

The value 1002 is the number of consecutive IDs allowed for mapping. Typically, 65536 is used, but this would allow mapping more user IDs than needed. If you use a container UID/GID > 1002, you have to increase this value.
I fixed the format (using colons instead of commas).

If we then run the container with

docker run --rm -it -v "$(pwd):/tmp" --user 1002:1002 mambaorg/micromamba:1.5.8 /bin/bash

the UIDs starting from 1 will be mapped to the UID range starting at 56438, like so

UID 1 -> 56438
UID 2 -> 56439
...
UID 1002 -> 57439

Et voilà!
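Because container UID 1 lands on the first subordinate ID, the start of the range follows from the target host UID by simple arithmetic. A sketch for double-checking the numbers; subuid_start is a hypothetical helper name:

```shell
#!/bin/sh
# subuid_start CONTAINER_UID HOST_UID: print the first subordinate ID needed so that
# CONTAINER_UID (>= 1) lands on HOST_UID, given that container UID 1 maps to the
# first ID of the range. Hypothetical helper for checking the arithmetic only.
subuid_start() {
  echo $(( $2 - ($1 - 1) ))
}

# The range must contain at least CONTAINER_UID entries. Example (not executed here):
#   start=$(subuid_start 1002 57439)
#   echo "mambauser_host:${start}:1002"
```

This only checks the arithmetic; whether the resulting range is accepted on a given host is a separate question.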

I have not yet tested this, but wanted to save the intermediate status. This should basically do what --userns=keep-id:uid=57439,gid=57439 would do on Podman.

Details and Credits: https://rootlesscontaine.rs/how-it-works/userns/

Option 3: Use Podman with --userns=keep-id:uid=57439,gid=57439

# Note that we do not use the --user option, so mambauser remains 57439:57439!
podman run --rm -it -v "$(pwd):/tmp" --userns=keep-id:uid=57439,gid=57439 mambaorg/micromamba:1.5.8 /bin/bash

This is untested as I have no Podman installation at hand, but likely the cleanest approach.

Option 4: Disable user namespace with Docker in rootless mode

In theory, it might be possible to use --userns=host with Docker and disable the user namespace mechanism, like so:

docker run --rm -it -v "$(pwd):/tmp" --userns=host mambaorg/micromamba:1.5.8 /bin/bash

But it does not work, at least on my test server, and there are a bunch of Docker issues related to this, e.g. this one.

It might also introduce new security issues.

Hope you find this useful :-)! The problem has for long been a huge blocker in my workflow and I would have loved to avoid digging so deeply into this ;-). And of course, the underlying challenges are not specific to micromamba-docker; it's just that in here the non-root user makes the default recipes for mounting host volumes for write-access with a rootless Docker installation fail.

mfhepp commented 5 months ago

Addendum:

Instead of editing subuid and subgid with an editor, you can better use usermod like so (from here):

sudo usermod --add-subuids <start>-<end> <username>

The proposed solution requires changes on the host system (a new non-root user and setting subuid and subgid). This makes it a bit less portable, but then again, it is conceptually clean and IMO the most secure option apart from using Podman.

mfhepp commented 5 months ago

Update:

My preferred Option 2 (via subuid) does not seem to work, because at least in Debian, there is a conflict when mapping the container user to the UID of the host user.
It may be worthwhile to look into how Podman is implementing --userns=keep-id in its source-code.

mfhepp commented 5 months ago

Solution and Conclusion

1. The best solution is to install Podman, which is super-simple, and to use the --userns=keep-id option. As the additional uid/gid parameters described in the current version are not available in e.g. Podman 3.0.1, you best use this command:

podman run --rm -it -v "$(pwd):/tmp" --userns=keep-id --user $(id -u):$(id -g) docker.io/mambaorg/micromamba:1.5.8 /bin/bash

In there, touch testfile.txt works like a charm, and the external username and ID are also visible from inside.

2. The second-best solution is to use --user root when starting the container if and only if Docker is running in rootless mode and the host user has no sudo rights. This can be a quick fix with acceptable risks in most cases.

docker run --rm -it -v "$(pwd):/tmp" --user root mambaorg/micromamba:1.5.8 /bin/bash

touch testfile.txt works, too, and the container root user is mapped to the local user on the host system.

The other options do not work; tweaking the subuid and subgid mapping does not work without brittle and opaque modifications on the host system.

Hope this is useful for many of you! Frankly, it was a nightmare to get to this solution and insight and I learned many things that are intellectually interesting but were not on my bucket-list to master ;-).

Most important to say: micromamba-docker is not at all to blame that it was such a painful journey; it is Docker's insufficient documentation and the lack of --userns=host and/or --userns=keep-id support, which is deeply hidden in Docker issues and other sources. The hundreds of related posts and discussions indicate that many, many others are wasting their lifetime with this Docker problem. So please spread the word.

TODO: Add summary and pointer to this to the documentation.

wholtz commented 5 months ago

@mfhepp thanks for the detailed investigation and clear recommendations. I am certainly in support of increasing our documentation in this area. Do you have any interest in putting together a documentation PR?

mfhepp commented 5 months ago

> @mfhepp thanks for the detailed investigation and clear recommendations. I am certainly in support of increasing our documentation in this area. Do you have any interest in putting together a documentation PR?

Much appreciated! Will try to send a PR once I have done my homework on py4docker!

mamba-org / micromamba-docker

mambauser cannot write to bind mounts on Linux with Docker in rootless-mode #407
