Open joanbm opened 3 years ago
Workaround SGTM, could you open PR?
I haven't done any serious testing with the workaround yet, and I'd like a more generic solution that isn't tied to RootlessKit. If I can free up some time, I'll take a look at making a PR.
It's great if we can have a more generic solution, but do we have a better approach?
Looks like the UID needs to be explicitly passed in or shared somehow in this case (though the naming of the env variable should be aligned). Even removing the dependency on busctl cannot bypass the problem, since procfs is not accessible.
A small update...
One of the two usages of DetectUID seems superfluous. It is used for authentication in DBus with the AUTH EXTERNAL method (https://dbus.freedesktop.org/doc/dbus-specification.html#auth-mechanisms), but as far as I can tell it's not mandatory to send the UID along with AUTH EXTERNAL. This is hinted at in the documentation and can be seen in the systemd source: https://github.com/systemd/systemd/blob/e8b08edcdf4e3f22be0a209cacb9e5404fee4b68/src/libsystemd/sd-bus/bus-socket.c#L312 . In fact, there's a recent (as yet unreleased) go-dbus commit which does exactly this: https://github.com/godbus/dbus/commit/31b5df72caaf5c68ec5ff414944e8ab8c24f8c52
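To make the difference concrete, here is a minimal Go sketch of the first line a client sends for AUTH EXTERNAL, with and without the UID. It is only an illustration of the wire format as I read the D-Bus specification (the optional initial response is the hex encoding of the UID written as a decimal ASCII string); the helper name authExternalLine is hypothetical and is not part of godbus or runc.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// authExternalLine is a hypothetical helper that builds the opening
// AUTH line for the EXTERNAL mechanism. When uid is empty, no initial
// response is sent and the server is left to work out the identity
// from the socket credentials.
func authExternalLine(uid string) string {
	if uid == "" {
		return "AUTH EXTERNAL\r\n"
	}
	// The initial response is the hex encoding of the ASCII UID string.
	return "AUTH EXTERNAL " + hex.EncodeToString([]byte(uid)) + "\r\n"
}

func main() {
	fmt.Printf("%q\n", authExternalLine("1000"))
	fmt.Printf("%q\n", authExternalLine(""))
}
```

The second form is the one that would make the UID detection unnecessary for authentication.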
The other usage is to decide whether to use the regular or rootless systemd cgroups manager, but I don't see any way to avoid the UID detection logic there.
As far as workarounds go, instead of looking for ROOTLESSKIT_PARENT_EUID, it seems cleaner to look at /proc/$$/cgroup for a fragment like /user-NNNN.slice/ to guess the UID; at least this way it's not tied to RootlessKit.
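A rough Go sketch of that cgroup-based guess follows. It is untested against real setups and makes some assumptions: it reads /proc/self/cgroup (the in-process equivalent of /proc/$$/cgroup), the function name uidFromCgroup is hypothetical, and it simply takes the first /user-NNNN.slice/ fragment it finds.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"regexp"
	"strconv"
)

// userSliceRe matches a path fragment such as /user-1000.slice/ and
// captures the numeric UID.
var userSliceRe = regexp.MustCompile(`/user-([0-9]+)\.slice/`)

// uidFromCgroup is a hypothetical helper that guesses the owning UID
// from the contents of a cgroup file.
func uidFromCgroup(content string) (int, error) {
	m := userSliceRe.FindStringSubmatch(content)
	if m == nil {
		return -1, errors.New("no /user-NNNN.slice/ fragment found")
	}
	return strconv.Atoi(m[1])
}

func main() {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("cannot read cgroup file:", err)
		return
	}
	uid, err := uidFromCgroup(string(data))
	if err != nil {
		fmt.Println("cannot guess UID:", err)
		return
	}
	fmt.Println("guessed UID:", uid)
}
```

Since the process reads its own cgroup file, this lookup should not be affected by hidepid=2, but it remains a guess: a process outside any user slice yields an error rather than a UID.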
Attempting to run a container on Rootless Docker will fail when both of the following system settings are active:
- cgroups v2 (cgroup_no_v1=all kernel command line parameter) backed by Systemd
- hidepid=2 on /proc (sudo mount -o remount,hidepid=2 /proc), i.e. each user can only see their own processes
Repro'd on Ubuntu 20.04 with Docker installed from Ubuntu PPAs (latest version, with Docker 20.10.7 + Containerd 1.4.9 + runc v1.0.1-0-g4144b63), and also on current Arch Linux.
Log of the problem:
I can also reproduce the error directly with runc, without Docker; I just need to run the rootful runc example with --systemd-cgroup and inside rootlesskit, just like Docker does (both conditions are necessary, otherwise the container runs fine):
I did some debugging, and it appears that the problem happens because runc runs busctl --user status here in order to get the OwnerUID value from the output: https://github.com/opencontainers/runc/blob/51beb5c436b159ae2d483b219c37ecfde13b006a/libcontainer/cgroups/systemd/user.go#L60
It appears that OwnerUID is not listed when /proc is mounted with hidepid=2:
With strace I can see that busctl --user status is trying to read /proc/1/cgroup, which it can't because of hidepid=2.
From what I can see, the systemd developers are not big fans of hidepid=2 (e.g. https://lists.freedesktop.org/archives/systemd-devel/2012-October/006860.html, https://github.com/systemd/systemd/issues/12955), so I guess this could be a NOTOURBUG on runc and a WONTFIX on systemd, but it would be nice if we could have some equivalent logic that does not depend on busctl and avoids this issue.
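For context, the busctl-based detection described above amounts to something like the following Go sketch. This is not runc's actual code: the function name ownerUIDFromStatus is hypothetical, and the only assumption about the output format is that the UID appears on a line starting with OwnerUID=.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// ownerUIDFromStatus is a hypothetical helper that scans command output
// for a line of the form OwnerUID=NNNN and returns the number.
func ownerUIDFromStatus(out string) (int, error) {
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "OwnerUID=") {
			return strconv.Atoi(strings.TrimPrefix(line, "OwnerUID="))
		}
	}
	// This is the failing case from the report: the line is absent.
	return -1, errors.New("OwnerUID not found in output")
}

func main() {
	out, err := exec.Command("busctl", "--user", "status").CombinedOutput()
	if err != nil {
		fmt.Println("busctl failed:", err)
		return
	}
	uid, err := ownerUIDFromStatus(string(out))
	if err != nil {
		fmt.Println("detection failed:", err)
		return
	}
	fmt.Println("OwnerUID:", uid)
}
```

With hidepid=2 the OwnerUID line is missing, so a parser like this has nothing to match and the detection fails.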
For now, as a workaround, I can create containers if I get the UID from the ROOTLESSKIT_PARENT_EUID environment variable instead:
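As a rough illustration of this idea, and not the actual change made to runc, the lookup could be sketched in Go as follows. The helper name parentEUID is hypothetical, and the environment lookup is passed in as a function only so that the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parentEUID is a hypothetical helper that reads the UID from the
// ROOTLESSKIT_PARENT_EUID environment variable via the given lookup
// function (normally os.Getenv).
func parentEUID(getenv func(string) string) (int, error) {
	v := getenv("ROOTLESSKIT_PARENT_EUID")
	if v == "" {
		return -1, fmt.Errorf("ROOTLESSKIT_PARENT_EUID is not set")
	}
	uid, err := strconv.Atoi(v)
	if err != nil || uid < 0 {
		return -1, fmt.Errorf("invalid ROOTLESSKIT_PARENT_EUID %q", v)
	}
	return uid, nil
}

func main() {
	uid, err := parentEUID(os.Getenv)
	if err != nil {
		fmt.Println("falling back to other detection:", err)
		return
	}
	fmt.Println("UID from RootlessKit:", uid)
}
```

This only helps when the process runs inside RootlessKit, which is why a more generic source for the UID would be preferable.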