ntkme / unifi-systemd-units

:package: Systemd Units for UniFi OS.
https://github.com/ntkme/unifi-systemd
MIT License

[Question/Bug] OCI Runtime Error w/ systemd-podman, "bpf_prog_query(BPF_CGROUP_DEVICE) failed: function not implemented" #3

Open jtcressy opened 2 years ago

jtcressy commented 2 years ago

Have you encountered this issue before? I am currently on UniFi OS v1.12.22, trying to run the wpa_supplicant container through unifi-systemd. I've never been able to get these containers to start because of the following error:

Aug 27 21:36:17 unifi podman[908]: 2022-08-27 21:36:17.289840695 +0000 UTC m=+0.158163761 container create 3c7e707cad0388627dd65b6b5fc9ed526864403719a7920d3487c5886d9bdf70 (image=ghcr.io/ntkme/wpa_supplicant:edge, name=wpa_supplicant-eth8, org.opencontainers.image.description=:whale: Containerized wpa_supplicant., org.opencontainers.image.licenses=MIT, org.opencontainers.image.url=https://github.com/ntkme/wpa_supplicant, org.opencontainers.image.created=2022-06-01T06:44:34.598Z, org.opencontainers.image.title=wpa_supplicant, PODMAN_SYSTEMD_UNIT=container-wpa_supplicant@eth8.service, org.opencontainers.image.version=edge, org.opencontainers.image.revision=989dccc310bd9db903670438040e59dd050a3e4c, org.opencontainers.image.source=https://github.com/ntkme/wpa_supplicant, io.containers.autoupdate=image)
Aug 27 21:36:17 unifi podman[908]: 2022-08-27 21:36:17.209294951 +0000 UTC m=+0.077617966 image pull  ghcr.io/ntkme/wpa_supplicant:edge
Aug 27 21:36:17 unifi podman[908]: Error: OCI runtime error: runc: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: bpf_prog_query(BPF_CGROUP_DEVICE) failed: function not implemented
Aug 27 21:36:17 unifi systemd[1]: container-wpa_supplicant@eth8.service: Main process exited, code=exited, status=126/n/a

Is there an undocumented dependency on a newer kernel than what ships by default? For reference, here is the output of uname -a on my UDM Pro:

bash-5.1# uname -a
Linux unifi 4.19.152-al-linux-v10.2.0-v1.12.22.4309-4105ace #1 SMP Thu May 19 09:34:11 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
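
The "function not implemented" error makes me suspect the stock kernel was built without CONFIG_CGROUP_BPF. If the kernel exposes its config, something like this should confirm it (I haven't verified that /proc/config.gz is even available on UniFi OS):

bash-5.1# zcat /proc/config.gz 2>/dev/null | grep -E 'CONFIG_(CGROUP_)?BPF'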

The systemd-podman container itself runs just fine under the original host-level podman, so something in the newer podman build inside it is not agreeing with my hardware.

host podman:

# podman version
Version:            1.6.1
RemoteAPI Version:  1
Go Version:         go1.12.10
OS/Arch:            linux/arm64
# podman ps -a
CONTAINER ID  IMAGE                                       COMMAND               CREATED         STATUS             PORTS  NAMES
edfec8278b4f  ghcr.io/ntkme/systemd-podman:edge           /sbin/init            16 minutes ago  Up 16 minutes ago         unifi-systemd

vs systemd-podman (via unifi-systemd shell):

# unifi-systemd shell
bash-5.1# podman version
Client:       Podman Engine
Version:      4.1.1
API Version:  4.1.1
Go Version:   go1.18.4
Built:        Fri Jul 22 19:06:49 2022
OS/Arch:      linux/arm64
bash-5.1# podman ps -a
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES
bash-5.1#
ntkme commented 2 years ago

I no longer own a UDM, having switched to a UDM-SE, so I cannot test this myself.

Can you please try running a container with --security-opt=seccomp=unconfined or --privileged inside systemd-podman and see if it works?
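
For example (alpine is just an arbitrary test image; any image would do):

bash-5.1# podman run --rm --security-opt=seccomp=unconfined docker.io/library/alpine true
bash-5.1# podman run --rm --privileged docker.io/library/alpine true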

jtcressy commented 2 years ago

Neither of those options changes the result. With both, it keeps failing to configure cgroups:

Error: OCI runtime error: runc: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: bpf_prog_query(BPF_CGROUP_DEVICE) failed: function not implemented

However I've been continuing to dig around and found this specific comment on an issue in unifios-utilities: https://github.com/unifi-utilities/unifios-utilities/issues/300#issuecomment-1127390170

They had enabled certain kernel options using fabianishere/udm-kernel-tools:

root@debian:~/udm-kernel# cat .github/config/config.local.udm 
CONFIG_FUSE_FS=y
CONFIG_TEST_BPF=y
CONFIG_BPF=y
CONFIG_BPFILTER=y
CONFIG_BPF_SYSCALL=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_BPF=y

However, this looks to require downloading and compiling a custom kernel.

The mystery, then, is how the host-level podman is able to create containers just fine.
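
One diagnostic that might narrow it down (a guess on my part, not something I've confirmed) is comparing which cgroup controllers are enabled and mounted on the host versus inside systemd-podman:

# cat /proc/cgroups
# grep cgroup /proc/mounts
# unifi-systemd shell
bash-5.1# grep cgroup /proc/mounts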

ntkme commented 2 years ago

If you try to trace the error from runc source code:

  1. https://github.com/opencontainers/runc/blob/5fd4c4d144137e991c4acebb2146ab1483a97925/libcontainer/cgroups/ebpf/ebpf_linux.go#L57
  2. https://github.com/opencontainers/runc/blob/5fd4c4d144137e991c4acebb2146ab1483a97925/libcontainer/cgroups/ebpf/ebpf_linux.go#L161
  3. https://github.com/opencontainers/runc/blob/5fd4c4d144137e991c4acebb2146ab1483a97925/libcontainer/cgroups/fs2/devices.go#L69

In link 3 of the stack above, you will see the function canSkipEBPFError. The eBPF load error can only be skipped when every device rule is an allow rule granting all of read, write, and mknod access; a typical container config includes deny rules, so here the error is fatal:

    // We cannot ignore an eBPF load error if any rule is a block rule or it
    // doesn't permit all access modes.
    //
    // NOTE: This will sometimes trigger in cases where access modes are split
    //       between different rules but to handle this correctly would require
    //       using ".../libcontainer/cgroup/devices".Emulator.
    for _, dev := range r.Devices {
        if !dev.Allow || !isRWM(dev.Permissions) {
            return false
        }
    }
jtcressy commented 2 years ago

Does this mean we need to mount /sys/fs/cgroup as rw in the systemd-podman container? Right now it's bind-mounted read-only: --volume /sys/fs/cgroup:/sys/fs/cgroup:ro

Edit: OK, mounting it rw also did nothing, so I'm not sure what else to try.

ntkme commented 2 years ago

Not sure. Feel free to try it and see if it makes any difference.

I'm not really familiar with eBPF in podman/runc. I think it is more important to figure out where this eBPF rule is coming from; it might be worth asking for help in the podman or runc issue trackers.

Ubiquiti's custom kernel is really a pain to work with: crun does not work at all due to the missing CONFIG_USER_NS, and now runc is having issues.
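
If strace is available inside systemd-podman (I haven't checked), tracing the bpf(2) syscall might at least show exactly which call fails:

bash-5.1# strace -f -e trace=bpf podman run --rm docker.io/library/alpine true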

jtcressy commented 2 years ago

~~At this point I feel like it would be more worth my time to get rid of the UDM Pro and switch to the SE, because this is ridiculous; it should not be this hard to run a damn container~~

Update: I may have found a magic bullet: https://raw.githubusercontent.com/tianon/cgroupfs-mount/master/cgroupfs-mount

Following these steps:

# unifi-systemd stop
# curl https://raw.githubusercontent.com/tianon/cgroupfs-mount/master/cgroupfs-mount | sh -
# unifi-systemd start
# unifi-systemd shell
bash-5.1# podman run --network=host --rm docker.io/library/alpine date
Your kernel does not support pids limit capabilities or the cgroup is not mounted. PIDs limit discarded.
Mon Aug 29 18:53:12 UTC 2022
bash-5.1#

This works!!

It mounts the missing cgroup controllers, which seems to satisfy podman enough to run containers again.
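
For reference, the core of that script is roughly the following (paraphrased from the linked source, not verbatim): mount a tmpfs at /sys/fs/cgroup if nothing is mounted there, then mount each controller that /proc/cgroups reports as enabled:

mount -t tmpfs -o uid=0,gid=0,mode=0755 cgroup /sys/fs/cgroup
cd /sys/fs/cgroup
for sys in $(awk '!/^#/ { if ($4 == 1) print $1 }' /proc/cgroups); do
    mkdir -p $sys
    mountpoint -q $sys || mount -n -t cgroup -o $sys cgroup $sys
done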

I think I can put this into a systemd unit (rough sketch below) and have it run on startup by unifi-systemd @ntkme
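
A minimal sketch of what that unit could look like (the script path and unit name here are my own choices, not anything shipped by unifi-systemd, and I haven't tested the ordering):

# cat >/etc/systemd/system/cgroupfs-mount.service <<'EOF'
[Unit]
Description=Mount missing cgroup controllers
DefaultDependencies=no
Before=sysinit.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/sbin/cgroupfs-mount

[Install]
WantedBy=sysinit.target
EOF
# systemctl enable cgroupfs-mount.service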

ntkme commented 2 years ago

@jtcressy Nice find. Would you be able to isolate which commands from that script are necessary? I'd be happy to add them.

mayankst commented 2 years ago

@ntkme UDM-SE seems to have the same issue now as well with the v3 firmware. Any idea how to resolve it? I tried running the script mentioned above, but that didn't help either.