@wking I think I fixed your issues.
nit: PR subject has "ssystemd" when it should be "systemd".
I really need someone to test this to see if it fixes the issue people are seeing in podman.
@rhatdan I hit the following error:
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13804]: systemdhook <error>: 6cab0d8c9dc8: pid not found in state: Success
Jun 21 08:42:07 podman.localdomain conmon[13831]: conmon 6cab0d8c9dc817e69fd0 <ninfo>: about to waitpid: 13832
Jun 21 08:42:07 podman.localdomain kernel: SELinux: mount invalid. Same superblock, different security settings for (dev mqueue, type mqueue)
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <debug>: 6cab0d8c9dc8: rootfs=/var/lib/containers/storage/overlay/3518f024987889575e3e80e887ff2a86daf4c9f927f8>
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <debug>: 6cab0d8c9dc8: gidMappings not found in config
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <debug>: 6cab0d8c9dc8: GID: 0
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <debug>: 6cab0d8c9dc8: uidMappings not found in config
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <debug>: 6cab0d8c9dc8: UID: 0
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <error>: 6cab0d8c9dc8: Failed to remove /run/oci-systemd-hook.cGqlEG: Device or resource busy
Jun 21 08:42:07 podman.localdomain conmon[13831]: conmon 6cab0d8c9dc817e69fd0 <error>: Failed to create container: exit status 1
Jun 21 08:42:07 podman.localdomain oci-systemd-hook[13860]: systemdhook <error>: 6cab0d8c9dc8: Failed to remove /run/oci-systemd-hook.cGqlEG: Device or resource busy
I don't know what's going on with that (maybe we need an umount to unwind our new mount of tmp_dir onto tmp_dir?), but while reading the code I turned up this additional question. Would switching to the simpler guard I propose there help with debugging this breakage?
@wking I reworked it the way you suggested. I am no longer creating the additional mount point. I believe we only need to make the tmpfs mounted on tmp_dir private, which this patch does, and then umount it only when the code fails.
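To make that concrete, here is a minimal sketch of the approach (not the actual patch; tmp_dir, the mount options, and the error handling are assumptions): mount a tmpfs on tmp_dir, mark that mount private so it does not propagate to shared peers, and unwind it only on the failure path.

```c
/* Minimal sketch, not the real hook code: mount a tmpfs on tmp_dir,
 * make that mount private so it does not propagate to shared peers,
 * and umount it only if a later step fails. */
#include <stdio.h>
#include <sys/mount.h>

static int setup_tmp_dir(const char *tmp_dir)
{
    if (mount("tmpfs", tmp_dir, "tmpfs", MS_NODEV | MS_NOSUID, "mode=755") < 0) {
        perror("mount tmpfs");
        return -1;
    }
    if (mount(NULL, tmp_dir, NULL, MS_PRIVATE, NULL) < 0) {
        perror("make tmp_dir private");
        umount(tmp_dir);   /* unwind only when the code fails */
        return -1;
    }
    return 0;
}
```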
@aalba6675 Could you try the latest to see if it works any better?
@rhatdan Got the same error as with /tmp/ocitmp.XXXX, since /run here (F28) has shared propagation:
oci-systemd-hook[9252]: systemdhook <error>: 6cab0d8c9dc8: Failed to move mount /run/oci-systemd-hook.9dlu3E to /var/lib/containers/storage/overlay/3518f024987889575e3e80e887ff2a86daf4c9f927f8d756e58d4b902457c37b/merged/run: Invalid argument
When I set /run to private, it works.
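That matches the mount(2) rule that MS_MOVE fails with EINVAL when the parent mount of the source is shared. A small standalone sketch of the failure and the make-private workaround (the paths /run/demo and /mnt/target are hypothetical and assumed to exist):

```c
/* Sketch of the failure mode: MS_MOVE of a mount whose parent mount is
 * shared fails with EINVAL ("Invalid argument"), which is the error the
 * hook logs above.  Paths are hypothetical. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

int main(void)
{
    if (mount("tmpfs", "/run/demo", "tmpfs", 0, NULL) < 0)
        perror("mount tmpfs on /run/demo");

    /* Fails with EINVAL while the parent mount (/run) is shared. */
    if (mount("/run/demo", "/mnt/target", NULL, MS_MOVE, NULL) < 0)
        fprintf(stderr, "MS_MOVE: %s\n", strerror(errno));

    /* With /run private, the shared-parent restriction no longer applies. */
    if (mount(NULL, "/run", NULL, MS_PRIVATE, NULL) < 0)
        perror("make /run private");
    if (mount("/run/demo", "/mnt/target", NULL, MS_MOVE, NULL) < 0)
        perror("MS_MOVE after make-private");
    return 0;
}
```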
Separate enquiry: I see
Jun 24 20:26:47 podman.com oci-systemd-hook[9494]: systemdhook <debug>: 6cab0d8c9dc8: Found cgroup
Jun 24 20:26:47 podman.com oci-systemd-hook[9494]: systemdhook <debug>: 6cab0d8c9dc8: PATH: /libpod_parent/libpod-6cab0d8c9dc817e69fd0c02a7657e9a83edab3903f25f50d758c1511283bbbf0/ctr
Jun 24 20:26:47 podman.com oci-systemd-hook[9494]: systemdhook <debug>: 6cab0d8c9dc8: SUBSYSTEM_PATH: /sys/fs/cgroup/systemd/libpod_parent/libpod-6cab0d8c9dc817e69fd0c02a7657e9a83edab3903f25f50d758c1511283bbbf0/ctr
Is this why the host sees a doubled path? IOW, is PATH concatenated onto the end of SUBSYSTEM_PATH?
Your example shows you making a dir, but not mounting on it.
Oops, right. But later on you mount tmp_dir onto mount_dir, so maybe that needs cleanup code?
We don't clean up mount_dir since it is on a tmpfs (/run).
Ah, (eventual) tmpfs cleanup makes sense. But mount_dir is under rootfs. Must rootfs always be under /run?
With the latest podman I am seeing no leaking. I still updated this package.
@wking @mrunalp @lsm5 PTAL. People are still reporting issues, even though I was not seeing them.
@aalba6675 @thoraxe PTAL
oci-systemd-hook[23858]: systemdhook <error>: 0cebf3cae8d7: Failed to move mount /tmp/oci-systemd-hook.rD2ygS to /var/lib/containers/storage/overlay/4e9473a4456abeba9bd112c8760a6bee48a0e83ab80be5fce188d263fec614d7/merged/run: Invalid argument
The cgroup
/sys/fs/cgroup/systemd/libpod_parent/libpod-0cebf3cae8d7e554b2647893c5032354b2d843e5052ff7b4a29c28c82ed167c1
remains mounted on the host. This can be umounted manually. The double-pathing libpod_parent/libpod-<container_uuid>/libpod_parent/libpod-<container_uuid> (containers with volume mounts) doesn't happen anymore!

@aalba6675 What is the exact Podman command you are seeing this with? And what is the Dockerfile you used to generate the image?
@rhatdan - reproducer, oci-systemd-hook has #98 applied
# rpm -q oci-systemd-hook podman buildah
oci-systemd-hook-0.1.17-3.gitbd86a79.fc28.x86_64
podman-0.7.4-4.git80612fb.fc28.x86_64
buildah-1.3-1.git4888163.fc28.x86_64
Image:
CONT=$(buildah from centos:7)
buildah run $CONT yum -y install systemd openssh-server tmux rsync sudo
buildah run $CONT systemctl enable sshd
buildah run $CONT bash -c 'chpasswd <<< root:root_secure_password'
buildah commit $CONT c7test:1
Containers (with and w/o volume):
podman create --name test_ruby --entrypoint /sbin/init --stop-signal RTMIN+3 --network none c7test:1
podman create --name test_gold --entrypoint /sbin/init --stop-signal RTMIN+3 --network none -v /srv/docker/volumes/vagrant/home:/home:z c7test:1
Test 1: result: PASS
# force defaults for /tmp, container does not have host volumes
mount --make-shared /tmp
podman start test_ruby
# check if systemd is running
# podman exec test_ruby ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 01:45 ? 00:00:00 /sbin/init
root 19 1 0 01:45 ? 00:00:00 /usr/lib/systemd/systemd-journald
root 24 1 0 01:45 ? 00:00:00 /usr/sbin/sshd -D
root 26 1 0 01:45 ? 00:00:00 /usr/lib/systemd/systemd-logind
dbus 27 1 0 01:45 ? 00:00:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 30 0 0 01:45 ? 00:00:00 ps -ef
Test 2a: result: FAIL
mount --make-shared /tmp
podman start test_gold
systemdhook <error>: 06f83779973c: Failed to move mount /tmp/oci-systemd-hook.l2Cd83 to /var/lib/containers/storage/overlay/1f3182c51b31f2909fb8e369a0cc6ecff09150687984575c0907b43f9530d1c8/merged/run: Invalid argument
Test 2b: result: PASS??
mount --make-private /tmp
podman start test_gold
## container starts! but one cgroup still leaking...
## on host
# mount | grep libpod
cgroup on /sys/fs/cgroup/systemd/libpod_parent/libpod-06f83779973c0c88537c20bb28e9215998eaab7f146750539b0e05e612c3132d type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,name=systemd)
## yay! systemd is running in test_gold with host volumes!
# podman exec test_gold ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 01:53 ? 00:00:00 /sbin/init
root 18 1 0 01:53 ? 00:00:00 /usr/lib/systemd/systemd-journald
dbus 24 1 0 01:53 ? 00:00:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 26 1 0 01:53 ? 00:00:00 /usr/lib/systemd/systemd-logind
root 27 1 0 01:53 ? 00:00:00 /usr/sbin/sshd -D
root 30 0 0 01:53 ? 00:00:00 ps -ef
##
[root@podhost187 ~]# podman stop test_gold
06f83779973c0c88537c20bb28e9215998eaab7f146750539b0e05e612c3132d
[root@podhost187 ~]# mount | grep libpod
cgroup on /sys/fs/cgroup/systemd/libpod_parent/libpod-06f83779973c0c88537c20bb28e9215998eaab7f146750539b0e05e612c3132d type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,name=systemd)
## this is the final cgroup leakage
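Since the tests above hinge on whether /tmp (or /run) is shared or private, here is a small hypothetical helper (not part of this PR) that reports a mount point's propagation by looking for the "shared:N" optional field in /proc/self/mountinfo (see proc(5)):

```c
/* Hypothetical helper: print whether a mount point is shared, based on
 * the "shared:N" optional field in /proc/self/mountinfo. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    const char *target = argc > 1 ? argv[1] : "/tmp";
    char line[4096];
    FILE *f = fopen("/proc/self/mountinfo", "r");

    if (!f)
        return 1;
    while (fgets(line, sizeof(line), f)) {
        /* Field 5 of each mountinfo record is the mount point. */
        char mnt[4096];
        if (sscanf(line, "%*d %*d %*d:%*d %*s %4095s", mnt) == 1 &&
            strcmp(mnt, target) == 0)
            printf("%s is %s\n", target,
                   strstr(line, "shared:") ? "shared" : "not shared");
    }
    fclose(f);
    return 0;
}
```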
@rhatdan new reproducer with different paths with 0.8.2 (systemd as the default cgroup-manager):
# rpm -q podman
podman-0.8.2.1-1.gitf38eb4f.fc28.x86_64
podman run -t --name=systemd --env=container=podman --entrypoint=/sbin/init --stop-signal=RTMIN+3 -v /volumes/vagrant/home:/home:z fedora:28
mount | grep libpod
# mount | grep libpod
cgroup on /sys/fs/cgroup/systemd/system.slice/libpod-82265d6d94512df4a1cfd244c4cdccdaad16356f1332f9a2ed6a13c0aae1f3c9.scope type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,name=systemd)
podman stop systemd
## the cgroup mount is leaked
# mount | grep libpod
cgroup on /sys/fs/cgroup/systemd/system.slice/libpod-82265d6d94512df4a1cfd244c4cdccdaad16356f1332f9a2ed6a13c0aae1f3c9.scope type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,name=systemd)
Since we have directly integrated systemd support into podman, I am going to close this PR.
We are leaking mount points into the shared mount space; by mounting the directory private, we are able to make changes without having the mount points leak.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>