Closed jinnko closed 3 years ago
Always open for PRs!
I think the fix is to try and chown + chmod s6-overlay-preinit
before running it. I'm guessing this would fail if somebody's running a container with a read-only root filesystem or using the --user
switch, it would need to be wrapped in foreground
blocks.
A hard requirement is we only use tools that ship in the tarball, so you'll need to use s6-chown
, s6-chmod
, execline, etc.
I have now spent exactly 25 minutes using execlineb
so I wouldn't want to submit this as a full PR because I suspect there is a much cleaner method, and I honestly didn't work through all the security implications. As a reference however, I just forced a ownership change by changing the last line of /init
from: /etc/s6/init/init-stage1 $@
to:
backtick -n euid { id -u } importas -u euid euid
backtick -n egid { id -g } importas -u egid egid
foreground { /bin/s6-chown -u ${euid} -g ${egid} /bin/s6-overlay-preinit } /etc/s6/init/init-stage1 $@
It would probably be good to run the whole thing in a foreground block, right now this would be introducing the euid
and egid
variables into the whole supervision tree. If it's in a foreground block, it runs as a forked process and doesn't change the environment.
s6-chown
has the interesting property of being able to use -U
to read UID
and GID
environment variables - importas
really just reads the rest of the script and substitutes variables into the CLI (this is a huge oversimplification), but if we use -U
in s6-chown
we can skip that. So this could be simplified a bit:
foreground {
backtick -n UID { id -u }
backtick -n GID { id -g }
foreground { /bin/s6-chown -U /bin/s6-overlay-preinit }
} /etc/s6/init/init-stage1 $@
Only issue here is id
is a binary on the system, we don't rely on any pre-installed binaries, just the ones we include in the tarball, I don't think there's anything in the s6-ecosystem that can grab the current UID/GID.
@skarnet would it make sense for s6-portable-utils to get an id
replacement, or for s6
to get a command like s6-getuidgid
that exports the current UID/GID into environment variables and exec's into a program? If those don't really have much use outside of this project I can just try writing my own.
@jprjr Well it wouldn't really make sense outside of suid binaries. If you want to s6-chown
a file, it means you're root, so you know your uid and gid...
If the user you want to chown to is listed in /etc/passwd
, then s6-envuidgid user
will put the correct value in the UID
and GID
environment variables. If it's not, I'm afraid you'll have to rely on id
or bundle your own.
At a minimum we want to check the ownership of /bin/s6-overlay-preinit
as that should always be root, but in the scenario that triggers this it will be a UID above 64k. If it isn't owned by UID 0, we'd want to chown root
.
I'm not familiar enough with the s6 tools like exceline, but based on the suggestion above, how about something like this..
foreground {
backtick -n REMAPPED_ROOT_UID { /usr/bin/stat -c '%u' /bin/s6-overlay-preinit }
if { s6-test $REMAPPED_ROOT_UID -ne 0 } /bin/s6-chown -u 0 /bin/s6-overlay-preinit
} /etc/s6/init/init-stage1 $@
Unfortunately this does end up depending on stat
.
I don't think using the id
binary is achieving the correct goal as the runtime context will always return UID 0, but the container may have been built on a dockerd running with userns-remap enabled..
UPDATE: changed the check from /
to the specific script as this is explicitly where the issue lies.
This might be similar to the permission issues I get when running docker 20.10 as rootless.
Are we overcomplicating this? The idea is s6-overlay-preinit should run as root no matter what the --user
flag was for the container.
Should this just be:
foreground {
/bin/s6-chown -u 0 /bin/s6-overlay-preinit
} /etc/s6/init/init-stage1 $@
If you're running with --user
, you'd get an error from s6-chown
, but it's not fatal since it's all in a foreground
block anyway.
Thanks for looking at this. Perhaps just doing a blind chown may simplify things. I tried all sorts of variations to the execlineb init script a couple of weeks ago without success, though possibly not a blind chown. It did seem that only changing ownership of s6-overlay-preinit
wasn't sufficient though, and I had run out of other ideas on how to make this work otherwise.
The issue isn't about the --user
flag, but rather about the docker daemon configured with userns-remap
, see user namespace isolation. Long story short, add these configs then restart the docker daemon (note - don't do this on an active docker daemon as the structures of /var/lib/docker
change - revert config to go back, but you won't see docker daemon state migrated in either direction).
$ cat /etc/docker/daemon.json
{
"userns-remap": "dockremap",
}
$ grep dockremap /etc/subuid /etc/subgid
/etc/subuid:dockremap:493216:65536
/etc/subgid:dockremap:493216:65536
When the docker daemon is running as such the pulled images and builds will inherit the remapped UIDs. The problem arises when you run a container on the daemon with the --userns=host
argument which results in the UIDs going back to the standard host UIDs. The reason you'd do this is if you want to use user namespace remapping primarily which is only possibly by configuring the docker daemon - i.e. it's not possible to opt in to userns remapping at runtime.
Unfortunately one of the reason it is necessary to run a container in the host user namespace like this is when that container must also be on the host network namespace - i.e. you can't have the host network namespace without the host user namespace. This is the case for the linuxserver/docker-unifi-controller container as it doesn't function correctly behind a NAT.
A few weeks ago I did try again to make this all work and after a few hours of tinkering unsuccessfully I fell back to patching the upstream container like this.
$ cat Dockerfile
FROM linuxserver/unifi-controller:latest
ADD init-wrapper /init-wrapper
ENTRYPOINT ["/init-wrapper"]
$ cat init-wrapper
#!/bin/sh
REMAPPED_ROOT_UID="$(stat -c '%u' /)"
if [ "$REMAPPED_ROOT_UID" -ne 0 ]; then
find / -xdev -perm /4000 -uid "$REMAPPED_ROOT_UID" -exec chown root '{}' \;
fi
echo "Executing /init $@"
exec /init $@
Right, sorry, I just brought up --user
in terms of how my proposed fix would affect that use-case - that it would still work (maybe just toss up an non-fatal error message).
So what I'm wondering is, it seems like the core issue is - SUID binaries need to have their UID set to 0.
I know I got hung up on trying to identify at runtime whether this is running in a userns-remap scenario, and only running chown
if it's needed. But the only SUID binary we ship is the s6-overlay-preinit, and since it should always be owned by root anyway - would it be sufficient to just run s6-chown
on it no matter what? That saves us having to write some kind of stat
-like utility to include.
Good question - I believe it wasn't sufficient, but I will have another go using your proposed approach over the weekend and let you know my findings.
Tbh it's not surprising that suid binaries don't play nice with user namespaces and uid squashing and whatever mutant Linux feature is at play here.
@jprjr : why does s6-overlay-preinit
need to be suid exactly? what does it do and why does it need root powers while being run as an unprivileged user?
@skarnet s6-overlay-preinit
really takes care of two things.
Thing 1: /var/run/s6
needs to exist. If the container is started with a read-only root filesystem with /var
set as a tmpfs, s6-overlay-preinit
takes care of making sure /var/run
and /var/run/s6
exist.
Thing 2: /var/run/s6
needs to be writable by the process tree user. Whether or not you're using a read-only root filesystem, if you start a container with the --user
switch, you'll run the whole shebang as Not Root. s6-overlay-preinit
will chown /var/run/s6
to whatever user the init system is running as, so operations like, copying service directories in, etc work.
Now that I've written that out, maybe s6-overlay-preinit
could try only calling chown
if the euid is root.
If the euid isn't 0, the chown
is likely to fail anyway, and I think in this issue's case, the rest of the process tree is actually running as root, it's just this particular process not running as root.
@jinnko If you could try this build of s6-overlay-preinit
, that would be a huge help: https://github.com/just-containers/s6-overlay-preinit/suites/1837529552/artifacts/35531088 - I don't have a Docker setup that I can easily switch to userns remapping.
Here's a Dockerfile that will download, unzip, etc automatically - this is a build of overlay-preinit that only tries to chown the /var/run/s6 folder if the EUID is 0.
FROM linuxserver/unifi-controller:latest
RUN apt-get update && apt-get install -y unzip
RUN curl -R -L -o /tmp/dist.zip \
https://nightly.link/just-containers/s6-overlay-preinit/workflows/all/issue-309/dist.zip
RUN cd /tmp && unzip -d dist dist.zip
RUN tar xzf /tmp/dist/s6-overlay-preinit-1.0.4-linux-amd64.tar.gz -C /
@jprjr The build worked perfectly on a docker daemon running with userns-remap
enabled and a container launched with --userns_mode=host --network_mode=host
. The resulting file modes for relevant files in the container are:
# ls -sl /bin/s6-overlay-preinit
12 -rwsr-xr-x 1 493216 493216 9120 Jan 15 16:44 /bin/s6-overlay-preinit
# ls -al /var/run/s6
total 0
drwxr-xr-x 1 493216 root 124 Jan 17 16:32 .
drwxr-xr-x 1 493216 493216 54 Jan 17 16:32 ..
drwxr-xr-x 1 root root 86 Jan 17 16:32 container_environment
drwxr-xr-x 1 493216 493216 8 Jan 17 16:32 env-stage1
drwxr-xr-x 1 root root 0 Jan 17 16:32 env-stage2
drwxr-xr-x 1 root root 0 Jan 17 16:32 env-stage3
drwxr-xr-x 1 root root 90 Jan 17 16:32 etc
drwxr-xr-x 1 493216 493216 54 Jan 17 16:32 services
Thinking through the various launch scenarios, these are the outcomes I would expect:
docker run --user ...
)userns_remap
), with container launched as root insideuserns_remap
), with container launched as another user (docker run --user ...
)I think the 4th scenario would fail currently, and I think it would still fail with the update provided. Having said that this is still a step forward and I'd suggest it's worth rolling with.
To give an idea of why it's important to support userns_remap
, consider all the people out there running docker on their workstations such that their regular user can launch containers. Without userns_remap
it's trivial for a launched container to have root access to the whole system, whereas with remapping enabled any launched container would appear to have root inside the container, but not on the whole filesystem externally. So really if we add our regular user to the docker
group we should generally also be enabling remapping. Obviously the benefits aren't limited to workstations.
Thanks for helping solve this - very much appreciated. The specific scenario I hit is resolved so I'm happy with the outcome. I'll leave the decision about closing this issue to you as there's still one more scenario that probably needs attention, but it does seem a bit of an edge case.
Latest version v2.2.0.0
includes the updated justc-envdir
: https://github.com/just-containers/s6-overlay/releases/tag/v2.2.0.0
Going to close this out, that fourth use-case is still an issue, I think the only really good workaround for that is to create /var/run/s6
at build-time and chown it to whoever the eventual USER
is. I suspect this case will be really, really niche.
Raising this issue here as I suspect a recent change may have changed the behaviour. Originally raised this in https://github.com/linuxserver/docker-unifi-controller/issues/62
When running dockerd with
userns-remap
, then starting up a container with--userns=host
, I get the following error message:Inspecting the contents of the container I found:
I found an old ticket for dockerd (moby) that seems to indicate this is to be expected, however this wasn't a problem until just this past week.
So the issue is that when the container is started the ownership of
/bin/s6-overlay-preinit
is the remapped UID, and given it's SUID, it doesn't run as the intended user. This is also confirmed in the docker documentation: https://docs.docker.com/engine/security/userns-remap/#disable-namespace-remapping-for-a-containerHas anything changed in this or a related repo that may be related?
Would you be open to a PR that ensures the ownership of
/bin/s6-overlay-preinit
is correct at runtime?