containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

runsc fails volume bind mount w/ "permission denied" using rootful Podman >= 5.2.0 and userns=auto (git bisected) #24311

Open BinaryKhaos opened 2 days ago

BinaryKhaos commented 2 days ago

Issue Description

I already reported this issue on runsc's issue tracker, but since it is somewhat unclear to me who is actually in the best position to fix or work around this in the short term, I am also reporting it here. The following is copied and pasted from my runsc issue, so hopefully that is okay.

With Podman commit c81f075f436466092372dec7a19c35fe387fe8d3 ("libpod: do not chmod bind mounts"), which is included in release 5.2.0-rc1 and above, runsc fails to bind mount volumes in certain cases with permission denied errors.

In my case, I have a custom container with an unprivileged user that has several (partly nested) VOLUMEs defined in its BUILDFILE. I have the (local) volumes created with the appropriate sub(u|g)ids, run the container w/ userns=auto, and mount the volumes accordingly. Everything in the container is run as the unprivileged user.

This worked fine w/ runsc and Podman up to release 5.1.2. It fails w/ the 5.2 branch. It does work absolutely fine, though, with either runc or crun, no matter what Podman version.

Steps to reproduce the issue

This is the most compact reproducer I could come up with.

Everything as root:

  1. Add "containers:100000:131072" to /etc/subuid and /etc/subgid
  2. podman volume create --opt o=uid=100001,gid=100001 bugtest-volume
  3. podman run --userns=auto:size=65536 -v bugtest-volume:/home/bugtest --runtime=runsc --rm -it alpine sh -c "ls -ln /home"

Describe the results you received

This will cause a permission denied error on runsc's side with Podman >= 5.2.0-rc1.

Describe the results you expected

With crun/runc, you will see the correct directory listing:

total 4
drwxr-xr-x    2 1        1             4096 Oct 15 05:53 bugtest
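
The `1 1` owner in this listing follows from simple ID arithmetic. The following is a minimal sketch of that arithmetic, not Podman's implementation. It assumes the auto user namespace was allocated at the very start of the "containers" range from step 1 and maps container IDs contiguously from 0; the helper names are made up for illustration.

```python
from typing import NamedTuple

# Kernel default overflow ID: what `ls -ln` prints for an owner that has
# no mapping in the current user namespace.
OVERFLOW_ID = 65534


class SubIdRange(NamedTuple):
    name: str
    start: int
    count: int


def parse_subid_line(line: str) -> SubIdRange:
    """Parse one /etc/subuid or /etc/subgid entry of the form name:start:count."""
    name, start, count = line.strip().split(":")
    return SubIdRange(name, int(start), int(count))


def host_to_container(host_id: int, ns_host_start: int, ns_size: int) -> int:
    """Translate a host ID into the ID seen inside the namespace.

    Assumes container IDs 0..ns_size-1 map contiguously onto host IDs
    ns_host_start..ns_host_start+ns_size-1.
    """
    offset = host_id - ns_host_start
    if 0 <= offset < ns_size:
        return offset
    return OVERFLOW_ID


rng = parse_subid_line("containers:100000:131072")
# Assumption: --userns=auto:size=65536 received the first 65536 IDs of the range.
print(host_to_container(100001, rng.start, 65536))  # 1
print(host_to_container(0, rng.start, 65536))       # 65534 (unmapped)
```

If the namespace is allocated at a different offset, or the volume is owned by an ID outside the mapped window, the owner shows up as 65534 instead.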

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.4
  cgroupControllers:

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

eriksjolund commented 2 days ago

I tried your reproducer on Fedora CoreOS 41.20241006.1.1 (not with scrun, though; I only tried crun).

When running step 3

# podman --runtime crun run --userns=auto:size=65536  -v bugtest-volume:/home/bugtest --rm -it alpine sh -c "ls -ln /home"
total 0
drwxr-xr-x    2 65534    65534            6 Oct 17 07:31 bugtest

I also tried the idmap option

# podman --runtime crun run --userns=auto:size=65536 --mount 'type=volume,src=bugtest-volume,dst=/home/bugtest,idmap=uids=@100001-1-1;gids=@100001-1-1' --rm -it alpine sh -c "ls -ln /home"
total 0
drwxr-xr-x    2 1        1                6 Oct 17 07:31 bugtest

BinaryKhaos commented 1 day ago

If you followed the steps to the letter, you should see the described outcome. I just verified it again, to be sure. I was not able to find a package list for Fedora CoreOS, so I have no clue which crun and podman versions you are using, on which kernel, and so forth. But with podman >= 5.2.0-rc1 (and git main), you will see the failure with runsc (not scrun). Both runc (without the s) and crun will work fine, showing the correct uid/gid. Also, isn't Fedora using SELinux by default? I don't know how much this could influence the result.

Since I totally forgot I could use idmap with rootful podman (previously only used rootless), I will see if I can work around this problem that way and report back.

BinaryKhaos commented 1 day ago

Ok, tested with idmapped mounts. Even though they work perfectly fine with crun and runc and achieve the results I want, runsc still fails, as expected, with a permission denied error due to the changes in https://github.com/containers/podman/commit/c81f075f436466092372dec7a19c35fe387fe8d3. Since runsc does not use the new mount API, that change effectively limits what can be done with podman and runsc. I have no chance to test this with Kata Containers at the moment, but I bet those are affected as well.

baude commented 1 day ago

@giuseppe can you read this and offer an opinion ?

giuseppe commented 1 day ago

that is runsc not using the new mount API.

https://github.com/containers/podman/commit/c81f075f436466092372dec7a19c35fe387fe8d3 changes the requirement, so we don't loosen up the directory permissions, but expect the OCI runtime to handle that using the new mount API. I'd prefer that we don't revert the change because it improves security for crun/runc users.

One way to circumvent the issue is to change the directory permissions so they are usable from the user namespace. Alternatively, you could create a bind mount and use that for the volume source.
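
To find out which directory permissions would need changing, one option is to walk the volume's host path and flag every directory that the mapped host uid/gid cannot search. The following is a rough diagnostic sketch, not part of Podman or runsc: the function names are hypothetical, and it only looks at classic mode bits, ignoring ACLs, capabilities, and LSMs such as SELinux.

```python
import os
import stat
from pathlib import Path


def can_traverse(mode: int, owner_uid: int, owner_gid: int, uid: int, gid: int) -> bool:
    """Approximate the search (execute) permission check on one directory.

    Only the owner/group/other mode bits are considered.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IXUSR)
    if gid == owner_gid:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


def blocking_components(path: str, uid: int, gid: int) -> list[str]:
    """List every directory from / down to `path` (inclusive) that the
    given host uid/gid cannot search. `path` is expected to be a directory."""
    target = Path(path).resolve()
    blocked = []
    # Path.parents runs from the nearest parent up to the root, so reverse it.
    for directory in [*reversed(target.parents), target]:
        st = os.stat(directory)
        if not can_traverse(st.st_mode, st.st_uid, st.st_gid, uid, gid):
            blocked.append(str(directory))
    return blocked
```

For the reproducer above, one would pass the mountpoint reported by `podman volume inspect bugtest-volume` together with the host IDs the container user maps to, for example `blocking_components(mountpoint, 100001, 100001)`. Any directory it returns is a candidate for the permission change, or a reason to bind mount the data from a location that is already reachable.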