tianon / gosu

Simple Go-based setuid+setgid+setgroups+exec
Apache License 2.0

user switching failing with user namespace mapping #64

Closed thdesy closed 5 years ago

thdesy commented 5 years ago

Hi all,

gosu fails to switch the user in a Docker container when dockerd runs with user namespace mapping enabled. Without user namespaces, switching the user context works fine; with them, it fails with

[root@ffb6964e14f4 /]# gosu testuser id
error: failed switching to "testuser": invalid argument

I am not sure if it is actually a gosu issue or a more general limitation?

setup

Dockerfile snippet

ARG USERID=26551
ENV runUID=${USERID}
ARG GROUPID=26551
ENV runGID=${GROUPID}
ARG USERNAME='testuser'
ENV runUSER=${USERNAME}
ARG groupNAME='testgroup'
ENV runGROUP=${groupNAME}
RUN groupadd -g ${runGID}  ${runGROUP} && \
useradd -u ${runUID} -g ${runGID} -r ${runUSER} 

ENV GOSU_VERSION="1.11" 
ENV GOSU_DOWNLOAD_ROOT="https://github.com/tianon/gosu/releases/download/$GOSU_VERSION" 
ENV GOSU_DOWNLOAD_KEY="0x036A9C25BF357DD4"

ENV GOSU_ENTRYPOINT_VERSION="1.0.0"
ENV GOSU_ENTRYPOINT_DOWNLOAD="https://github.com/gisjedi/gosu-entrypoint/releases/download/$GOSU_ENTRYPOINT_VERSION/gosu-entrypoint.sh" 
ENV GOSU_USER ${runUID}:${runGID}

RUN set -x;  gosu ${runUSER} /usr/bin/whoami ; gosu ${runUSER} id

build

During the build, gosu can actually switch to the newly created user (see the last line of the Dockerfile snippet). I guess the build does not use user namespaces...

> DOCKER_BUILDKIT=1  docker build --progress=plain  . -t gosu_test:latest
...
#25 [16/16] RUN set -x;  gosu testuser /usr/bin/whoami ; gosu testuser id
#25 0.557 + gosu testuser /usr/bin/whoami
#25 0.570 testuser
#25 0.571 + gosu testuser id
#25 0.580 uid=26551(testuser) gid=26551(testgroup) groups=26551(testgroup)

run

with user namespaces active

> docker run -it gosu_test:latest /bin/bash
[root@9a57e45dc84f /]# gosu testuser id
error: failed switching to "testuser": invalid argument

disabling user namespaces

> docker run --userns=host -it gosu_test:latest /bin/bash
[root@1ea9006e9380 /]# gosu testuser id
uid=26551(testuser) gid=26551(testgroup) groups=26551(testgroup)

question

Can I actually switch the user context in a container with user namespaces in use?

According to the Docker documentation on User NS, binaries relying on setuid might not work.

My suspicion is that, since gosu (or su, for that matter) runs as owner '0', which is mapped to some UID from /etc/subuid, the capability is not properly usable anymore?

Cheers, Thomas

tianon commented 5 years ago

I recommend against installing with setuid bits, so I don't think that should affect us; are su or sudo able to switch users in the same environment? (Maybe there's something our implementation is missing)

thdesy commented 5 years ago

Hi, unfortunately I have not managed to switch the user with any of gosu, su, sudo or chroot, so I suspect a general issue (either on my side or a limitation of the kernel/user namespace mapping).

I am not aware that I have been installing gosu with setuid; I had implicitly assumed that it would use setuid by default?

An example Dockerfile to reproduce the issue might look like


# syntax=docker/dockerfile:1.0.0-experimental

FROM centos:7

RUN yum update -y && \
    yum install -y ksh telnet which

ARG USERID=26551
ENV runUID=${USERID}
ARG GROUPID=26551
ENV runGID=${GROUPID}
ARG USERNAME='testuser'
ENV runUSER=${USERNAME}
ARG groupNAME='testgroup'
ENV runGROUP=${groupNAME}

RUN groupadd -g ${runGID}  ${runGROUP} && \
    useradd -u ${runUID} -g ${runGID} -r ${runUSER}  

# How to best run a command under another user
#
ENV GOSU_VERSION="1.11" 
ENV GOSU_DOWNLOAD_ROOT="https://github.com/tianon/gosu/releases/download/$GOSU_VERSION" 
ENV GOSU_DOWNLOAD_KEY="0x036A9C25BF357DD4"

ENV GOSU_ENTRYPOINT_VERSION="1.0.0"
ENV GOSU_ENTRYPOINT_DOWNLOAD="https://github.com/gisjedi/gosu-entrypoint/releases/download/$GOSU_ENTRYPOINT_VERSION/gosu-entrypoint.sh" 

RUN set -x \
    && gpg-agent --daemon \
#     && gpg --keyserver pgp.mit.edu --recv-keys $GOSU_DOWNLOAD_KEY \
    && echo "trusted-key $GOSU_DOWNLOAD_KEY" >> /root/.gnupg/gpg.conf \
    && curl -sSL "$GOSU_DOWNLOAD_ROOT/gosu-amd64" > gosu \
    && curl -sSL "$GOSU_DOWNLOAD_ROOT/gosu-amd64.asc" > gosu.asc \
#     && gpg --verify gosu.asc \
    && rm -f gosu.asc \
    && mv gosu /usr/bin/gosu \
    && chmod +x /usr/bin/gosu \
    && gosu nobody true \
    && rm -rf /root/.gnupg \
    && curl -sSL "$GOSU_ENTRYPOINT_DOWNLOAD" > /gosu-entrypoint.sh \
    && chmod +x /gosu-entrypoint.sh

# Specify any standard chown format (uid, uid:gid), default to root:root
ENV GOSU_USER ${runUID}:${runGID}

RUN set -x;  gosu ${runUSER} /usr/bin/whoami ; gosu ${runUSER} id

To run dockerd with user namespaces, remapping has to be enabled for the daemon, and a matching user with sub-UID/GID ranges has to be defined as well (there is actually a --userns switch for docker run, but I have not managed to get it working...)

cat > /etc/docker/daemon.json
{
  "userns-remap": "dockeruser"
}

> cat /etc/subuid
dockeruser:120000:10000

> cat /etc/subgid
dockeruser:120000:10000

tianon commented 5 years ago

I think there's likely something like SELinux, AppArmor, etc at play here, because I'm not able to reproduce: (using Docker-in-Docker just so it's easy to set up a daemon with user namespaces enabled -- otherwise should be the same as running on my host)

$ docker pull docker:dind
dind: Pulling from library/docker
Digest: sha256:b1d2ce671095c7509cefba4523fd82c9c9faa330fb4052001233ae7072ea89e0
Status: Image is up to date for docker:dind

$ docker run -dit --privileged --name dind docker:dind --userns-remap default
bb5b28c610efe9232b10ee0cdc19f1adb7c3ca7db1f065d572dd775809bb0575

$ docker logs --tail=2 dind
INFO[2019-08-20T19:34:28.049381568Z] API listen on [::]:2376                      
INFO[2019-08-20T19:34:28.049395291Z] API listen on /var/run/docker.sock           

$ docker exec -it dind sh
/ # docker info | grep -i user
  userns
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

/ # docker run -it --rm alpine:3.10
Unable to find image 'alpine:3.10' locally
3.10: Pulling from library/alpine
050382585609: Pull complete 
Digest: sha256:6a92cd1fcdc8d8cdec60f33dda4db2cb1fcdcacf3410a8e05b3741f44a9b5998
Status: Downloaded newer image for alpine:3.10
/ # su -s /bin/sh nobody -c id
uid=65534(nobody) gid=65534(nobody) groups=65534(nobody)

thdesy commented 5 years ago

Hmmm, I have now explicitly disabled SELinux (it was permissive before), but I still do not get it working (I am running on Fedora 29/5.1.21-200). It might be something odd with the group handling in my container (dockerd??).

What currently puzzles me is that, in the container, gosu/su do not seem able to actually switch the group; su complains that none of the groups I tried exists (although they are in /etc/group)

[root@5e5137b95434 /]# su testuser
su: cannot set groups: Invalid argument

[root@5e5137b95434 /]# gosu nobody id
uid=99(nobody) gid=99(nobody) groups=99(nobody)

[root@5e5137b95434 /]# su --group=99 testuser id
su: group 99 does not exist
[root@5e5137b95434 /]# su nobody id
This account is currently not available.

[root@5e5137b95434 /]# cat /etc/group
root:x:0:
bin:x:1:
daemon:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mem:x:8:
kmem:x:9:
wheel:x:10:
cdrom:x:11:
mail:x:12:
man:x:15:
dialout:x:18:
floppy:x:19:
games:x:20:
tape:x:33:
video:x:39:
ftp:x:50:
lock:x:54:
audio:x:63:
nobody:x:99:
users:x:100:
utmp:x:22:
utempter:x:35:
input:x:999:
systemd-journal:x:190:
systemd-network:x:192:
dbus:x:81:
testgroup:x:26551:

[root@5e5137b95434 /]# su --group=120000 testuser id
su: group 120000 does not exist

[root@5e5137b95434 /]# gosu 26551:26551 id
error: failed switching to "26551:26551": invalid argument

thdesy commented 5 years ago

...and I just realized that I can actually switch to my host user's UID:GID

[root@5e5137b95434 /]# gosu 1000:1000 id
uid=1000 gid=1000 groups=1000
[root@5e5137b95434 /]# gosu 1000:1000 whomai
error: exec: "whomai": executable file not found in $PATH
[root@5e5137b95434 /]# gosu 1000:1000 echo "testfooout" > /home/testuser/1000_1000

> host > ls -all /tmp/foo/1000_1000 
-rw-r--r-- 1 120000 120000 11 Aug 21 10:02 /tmp/foo/1000_1000
> host > id
uid=1000(hartlocal) gid=1000(hartlocal) groups=1000(hartlocal),10(wheel),969(docker)

the containerized processes are running on the mapped UID:GID 120000

[root@5e5137b95434 /]# gosu 1000:1000 sleep 600

> host > cat /proc/5737/cmdline 
sleep600
> host > ls -all /proc/5737/cmdline 
-r--r--r-- 1 121000 121000 0 Aug 21 10:04 /proc/5737/cmdline

either there is something totally amiss or I have understood user namespaces totally wrong ;)

thdesy commented 5 years ago

sorry to spam again... ;)

Explicitly starting a container in the host's user namespace makes user switching work again. Container processes then also run on the host under the same UID:GID as created in the build (this UID:GID does not nominally exist on the host). Files written to bind-mounted paths are owned outside by root independent of the actual container user.

> host > docker run --userns=host --volume /tmp/foo:/home/testuser -it gosu_test:latest

[root@15fb8737a272 /]# gosu testuser id
uid=26551(testuser) gid=26551(testgroup) groups=26551(testgroup)
[root@15fb8737a272 /]# gosu testuser echo "nouserns" > /home/testuser/nouserns.txt
[root@15fb8737a272 /]# gosu 1000:1000  echo "nouserns 1000:1000" > /home/testuser/nouserns_1000_1000.txt
[root@15fb8737a272 /]# gosu testuser id       
uid=26551(testuser) gid=26551(testgroup) groups=26551(testgroup)
[root@15fb8737a272 /]# gosu 26551:26551  echo "nouserns 26551:26551" > /home/testuser/nouserns_26551_26551.txt

> host > ls -nall /tmp/foo/
-rw-r--r--  1 120000 120000  11 Aug 21 10:02 1000_1000
-rw-r--r--  1      0      0  19 Aug 21 11:17 nouserns_1000_1000.txt
-rw-r--r--  1      0      0  21 Aug 21 11:18 nouserns_26551_26551.txt
-rw-r--r--  1      0      0   9 Aug 21 11:10 nouserns.txt
-rw-r--r--  1 120000 120000   3 Aug 21 09:45 test

[root@15fb8737a272 /]# gosu testuser sleep 500
> host > ls -nall /proc/8921/cmdline 
-r--r--r-- 1 26551 26551 0 Aug 21 11:12 /proc/8921/cmdline
> host > cat /proc/8921/cmdline 
sleep500

yosifkit commented 5 years ago

Files written to bind-mounted paths are owned outside by root independent of the actual container user.

Nope, the only part of your process running as a user is the echo; the writing to file (> /home/testuser/nouserns_1000_1000.txt) is run by the bash shell that is running as root
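To have the file opened by the switched user rather than by the root shell, the redirection has to happen inside the command that gosu executes. A minimal sketch, assuming gosu is on PATH; `write_as` is a hypothetical helper name, not part of gosu:

```shell
#!/bin/sh
# Sketch: make the target user, not the calling root shell, open the file.
# write_as is a hypothetical helper; it assumes gosu is on PATH.
write_as() {
    user=$1   # e.g. testuser or 1000:1000
    path=$2   # file to create
    text=$3   # content to write

    # The inner sh starts only after gosu has switched IDs, so the ">"
    # redirection is performed with the target user's credentials.
    gosu "$user" sh -c 'printf "%s\n" "$1" > "$2"' sh "$text" "$path"
}
```

A call such as `write_as 1000:1000 /home/testuser/1000_1000 testfooout` should then produce a file owned by the switched user, provided that user is allowed to write to the target directory.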

[root@5e5137b95434 /]# gosu 1000:1000 sleep 600
host > cat /proc/5737/cmdline 
sleep600
host > ls -all /proc/5737/cmdline 
-r--r--r-- 1 121000 121000 0 Aug 21 10:04 /proc/5737/cmdline

You only gave the namespace 10000 user and 10000 group ids, so you can't go above user or group 9999 within the namespace (ids 0-9999).

cat /etc/subuid
dockeruser:120000:10000

cat /etc/subgid
dockeruser:120000:10000
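The arithmetic behind this limit can be sketched as follows. This assumes a single subordinate-ID entry of the form name:start:count, with container ID 0 mapped to the start of the range; `map_id` is a hypothetical helper, not something gosu or Docker provides:

```shell
#!/bin/sh
# Sketch: translate a container ID to its host ID for one subordinate-ID
# entry of the form name:start:count (as in /etc/subuid or /etc/subgid).
# map_id is a hypothetical helper, not part of gosu or Docker.
map_id() {
    entry=$1          # e.g. dockeruser:120000:10000
    container_id=$2   # ID as seen inside the container

    start=$(printf '%s\n' "$entry" | cut -d: -f2)
    count=$(printf '%s\n' "$entry" | cut -d: -f3)

    if [ "$container_id" -ge 0 ] && [ "$container_id" -lt "$count" ]; then
        # Inside the range: host ID is start + container ID.
        echo $((start + container_id))
    else
        # Outside the range: no mapping exists for this ID.
        echo "unmapped"
        return 1
    fi
}

map_id dockeruser:120000:10000 1000            # prints 121000
map_id dockeruser:120000:10000 26551 || true   # prints unmapped
map_id dockeruser:120000:100000 26551          # prints 146551
```

The first call matches the 121000 owner seen on the host for the `gosu 1000:1000 sleep 600` process, and the second corresponds to the "invalid argument" failure for 26551.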

So it looks like it is working fine, closing

thdesy commented 5 years ago

Ah, shoot - you are right of course! After giving the user namespace sufficient IDs (dockeruser:120000:100000), I can now also switch to 26551.
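For anyone hitting the same error, a quick check before switching users might look like the sketch below. It reads lines in the /proc/self/uid_map layout (first ID inside the namespace, first ID outside, length); `id_is_mapped` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: check whether an ID is mapped, given uid_map/gid_map-style lines
# on stdin (columns: first ID inside, first ID outside, length).
# id_is_mapped is a hypothetical helper name.
id_is_mapped() {
    awk -v id="$1" '
        id >= $1 && id < $1 + $3 { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Demonstration with a literal mapping line for a 10000-ID range:
printf '0 120000 10000\n' | id_is_mapped 26551 || echo "26551 is not mapped"

# Inside a container this would typically be fed from the kernel instead:
#   id_is_mapped 26551 < /proc/self/uid_map && gosu testuser id
```

The same check applies to groups when fed from /proc/self/gid_map.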

Sorry for the noise