Closed: giuseppe closed this 1 month ago
So is the problem here that podman moves itself out of the systemd unit cgroup, thus the cgroup tracking of systemd fails and it will stop the unit as there are no processes in it?
Yes, exactly. Podman moves itself out of the systemd scope, and then systemd doesn't find any process running in the cgroup it has created.
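For anyone following along, the situation described here can be observed directly: each process records its cgroup membership in /proc/&lt;pid&gt;/cgroup, and systemd expects the processes of a scope or service to stay inside the cgroup it created for that unit. A minimal Go sketch (illustrative only, not podman code) that prints where the current process sits:

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// /proc/self/cgroup lists one line per hierarchy the process belongs to,
	// in the form "hierarchy-ID:controller-list:cgroup-path".
	// On a pure cgroup v2 host there is a single "0::<path>" line; if podman
	// has moved itself out of the systemd-created scope, <path> no longer
	// points at that unit's cgroup, so systemd sees the unit as empty.
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading /proc/self/cgroup:", err)
		os.Exit(1)
	}
	fmt.Print(string(data))
}
```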
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: giuseppe, Luap99
The full list of commands accepted by this bot can be found here.
The pull request process is described here
/lgtm
@giuseppe I'm interested in the flow of the fix, since it's a bit hard to understand.
It looks like the `UserOwnsCurrentSystemdCgroup` code scans every cgroup on the system?
And when podman encountered a cgroup v1 on a cgroup v2 system, it checked whether it (podman, or podman's current user) owns that cgroup v1, and then incorrectly returned false from `UserOwnsCurrentSystemdCgroup` because it wasn't the owner of the cgroup v1, which it shouldn't even have been checking at all? So podman basically went "I am not the owner of my own cgroup, fail" whenever it encountered any cgroup v1 mount point on a cgroup v2 system?
And then, when it (incorrectly) determined "I am not the owner of my own cgroup" (`UserOwnsCurrentSystemdCgroup` returned false), podman moved itself out of its own cgroup (the systemd cgroup v2 delegated group owned by podman), which finally caused systemd to shut down podman since it was no longer managed inside systemd's unified cgroup v2?
Did I understand the flow of the bug correctly?
The fix was to totally ignore all cgroup v1 groups on a unified cgroup v2 system?
The thing I really don't understand is the fix's if-statement:
```go
for scanner.Scan() {
	line := scanner.Text()
	parts := strings.SplitN(line, ":", 3)
	if len(parts) < 3 {
		continue
	}
	// If we are on a cgroup v2 system and there are cgroup v1 controllers
	// mounted, ignore them when the current process is at the root cgroup.
	if cgroup2 && parts[1] != "" && parts[2] == "/" {
		continue
	}
```
The `cgroup2` check makes sense because that flag is set outside the loop and means "this is a unified cgroup v2 system". Perfect.
But what's going on with `parts[1] != "" && parts[2] == "/"`? Those two are set from a split of the current cgroup line that the scanner is analyzing. How does that implement the "ignore them when the current process is at the root cgroup" check?
I guess it's something like this:
- `cgroup2` = true if the system uses unified cgroup v2 and we are in the root cgroup
- `parts[1] != "" && parts[2] == "/"` = determines that the current group the scanner is analyzing is a cgroup v1? Is this checking "if it's the root of a cgroup v1 mount point" or something?

Is that it?
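To make the question concrete, here is roughly what /proc/self/cgroup can look like on a cgroup v2 host that also has a v1 controller mounted (a net_cls mount like the PIA VPN one, in this example), and what the three `SplitN` parts become for each line. This is an illustrative sketch with made-up sample data, not the actual fix:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Sample /proc/self/cgroup content on a unified (cgroup v2) host where a
	// v1 net_cls controller is also mounted and the process sits at its root.
	sample := []string{
		"1:net_cls:/", // v1 controller line: parts[1] == "net_cls", parts[2] == "/"
		"0::/user.slice/user-1000.slice/session-2.scope", // v2 line: parts[1] == "", parts[2] is the real cgroup path
	}
	cgroup2 := true // pretend the unified-hierarchy check outside the loop succeeded

	for _, line := range sample {
		parts := strings.SplitN(line, ":", 3)
		if len(parts) < 3 {
			continue
		}
		// parts[1] != "" means the line belongs to a named v1 controller;
		// parts[2] == "/" means the process is at the root of that v1 hierarchy.
		// On a v2 system such lines say nothing about who owns the process's
		// real (v2) cgroup, so the fix skips them.
		if cgroup2 && parts[1] != "" && parts[2] == "/" {
			fmt.Printf("skip   %-50s (v1 controller at root)\n", line)
			continue
		}
		fmt.Printf("check  %-50s path=%s\n", line, parts[2])
	}
}
```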
By the way, do you still need me to check the things you asked about before this patch? Here:
https://github.com/containers/podman/issues/23990#issuecomment-2357709698
I'm also curious if you were able to try this fix with a cgroup manually created at `/opt/piavpn/etc/cgroup/net_cls` to make sure it's working for that particular case? I guess you did, but checking just in case.
I'd do a test with PIA VPN in a VM to help out if I knew how to build podman from scratch, but it seems like a complicated multi-repo build. :x
@Arcitec https://github.com/containers/podman/pull/24006
To test, go to the packit rpm-build job for Fedora 40 there; once the builds are done, follow the dashboard link and then the steps there on how to install the rpms, if you want to test without building from source. Otherwise just build my PR from source.
Direct link: https://dashboard.packit.dev/jobs/copr/1881809 (the builds are done). It is enough if you only install the "podman" rpm; you do not need all the other extra packages there (debuginfo, etc.), I think.
@Luap99 Brilliant, thank you very much for that.
I followed the instructions at https://dashboard.packit.dev/jobs/copr/1881809 and upgraded Fedora 40 VM from Podman 5.2.2 to 5.3.0-dev.
Then I re-enabled PIA VPN and its `/opt/piavpn/etc/cgroup/net_cls` cgroup v1, and rebooted.
Verified that the cgroup v1 and v2 are co-existing again:
```console
$ mount | grep -i cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)
none on /opt/piavpn/etc/cgroup/net_cls type cgroup (rw,relatime,seclabel,net_cls)
```
And then the moment we've all been waiting for, checking if podman is now able to start:
And here's a service restart with detailed system journal:
So I can confirm that podman is working perfectly after this patch! :heart:
It really makes yesterday's hell worth it, knowing that it led to an improvement for podman.
I'll be keeping an eye on Fedora Updates to see when Podman 5.3.0 comes out. In the meantime, I actually neutered PIA VPN on my host OS by converting its cgroup mountpoint into a non-writable file, which prevents it from mounting a cgroup v1 on my system and means I can keep using the "broken" version of podman for now.
```sh
sudo umount /opt/piavpn/etc/cgroup/net_cls
sudo rmdir /opt/piavpn/etc/cgroup/net_cls
sudo touch /opt/piavpn/etc/cgroup/net_cls
sudo chmod 444 /opt/piavpn/etc/cgroup/net_cls
```
What a long, strange trip it has been. ;-)
if the system is running on cgroup v2, ignore the ownership of cgroup v1 controllers when the current process is at the root cgroup.
Closes: https://github.com/containers/podman/issues/23990
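For reference, the behavior described by that commit message could be sketched roughly like this (a simplified, Linux-only illustration, not podman's actual `UserOwnsCurrentSystemdCgroup` implementation; it also assumes the unified hierarchy is mounted at /sys/fs/cgroup): parse /proc/self/cgroup, skip v1 controller entries whose path is "/" when the host is on cgroup v2, and for the remaining entries check that the cgroup directory is owned by the current user.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
)

// userOwnsCurrentCgroup is a hypothetical helper mirroring the fixed logic:
// on a cgroup v2 host, v1 controller entries at the root are ignored instead
// of being treated as "not owned by us".
func userOwnsCurrentCgroup(cgroup2 bool) (bool, error) {
	f, err := os.Open("/proc/self/cgroup")
	if err != nil {
		return false, err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		parts := strings.SplitN(scanner.Text(), ":", 3)
		if len(parts) < 3 {
			continue
		}
		// Ignore cgroup v1 controllers when the process is at their root
		// on a unified (cgroup v2) system.
		if cgroup2 && parts[1] != "" && parts[2] == "/" {
			continue
		}
		// For the entries that matter, check who owns the cgroup directory.
		// (Assumes the unified hierarchy is mounted at /sys/fs/cgroup.)
		dir := filepath.Join("/sys/fs/cgroup", parts[2])
		var st syscall.Stat_t
		if err := syscall.Stat(dir, &st); err != nil {
			return false, err
		}
		if int(st.Uid) != os.Geteuid() {
			return false, nil
		}
	}
	return true, scanner.Err()
}

func main() {
	owned, err := userOwnsCurrentCgroup(true)
	fmt.Println(owned, err)
}
```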