
Error: OCI runtime error: crun: the requested cgroup controller `pids` is not available #21484

Closed: Fisiu closed this issue 9 months ago

Fisiu commented 9 months ago

Issue Description

I cannot run a container. As an example, I tried to run caddy.

Steps to reproduce the issue

Steps to reproduce the issue on OpenWrt:

  1. opkg install conmon crun catatonit netavark podman external-protocol luci-proto-external
  2. set up storage/network as required (see the storage.conf sketch after this list)
  3. podman run --name caddy -p 8080:80 caddy
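Step 2 is only summarized above; as a reference point, a minimal storage configuration matching the paths later reported by podman info could look like the sketch below. This is an illustrative assumption, it only covers the storage side (the netavark/LuCI network setup is omitted), and it would go into /etc/containers/storage.conf:

[storage]
# overlay driver on the ext4-backed NVMe partition mounted at /opt/storage
driver = "overlay"
runroot = "/run/containers/storage"
graphroot = "/opt/storage/podman"

[storage.options.overlay]
# same mount option reported as overlay.mountopt in the podman info output
mountopt = "nodev"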

Describe the results you received

Resolved "caddy" as an alias (/var/cache/containers/short-name-aliases.conf)
Trying to pull docker.io/library/caddy:latest...
Getting image source signatures
Copying blob 00acba933361 done   |
Copying blob 764cd8cdd6e6 done   |
Copying blob c6b39de5b339 done   |
Copying blob 4a8d53bc0303 done   |
Copying config a09733d8f0 done   |
Writing manifest to image destination
WARN[0005] Failed to add conmon to cgroupfs sandbox cgroup: creating cgroup path /libpod_parent/conmon: write /sys/fs/cgroup/cgroup.subtree_control: invalid argument
Error: OCI runtime error: crun: the requested cgroup controller `pids` is not available

Describe the results you expected

I would expect the caddy container to be running and listening on host port 8080.
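A quick way to check this once the container is up (assuming the stock caddy image, which serves its welcome page on port 80, so the published port should answer with an HTTP response):

# curl -I http://127.0.0.1:8080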

podman info output

host:
  arch: arm64
  buildahVersion: 1.33.2
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: '
  cpuUtilization:
    idlePercent: 98.93
    systemPercent: 0.53
    userPercent: 0.53
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: openwrt
    version: 23.05.2
  eventLogger: none
  freeLocks: 2047
  hostname: nanopi
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.1.25
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 3418849280
  memTotal: 4091543552
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: Unknown
      path: /usr/lib/podman/aardvark-dns
      version: aardvark-dns 1.9.0
    package: Unknown
    path: /usr/lib/podman/netavark
    version: netavark 1.9.0
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/bin/crun
    version: "crun version 1.12\ncommit: \nrundir: /run/crun\nspec: 1.0.0\n+SELINUX
      +APPARMOR +CAP +SECCOMP +EBPF +YAJL"
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 17h 21m 6.00s (Approximately 0.71 days)
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /opt/storage/podman
  graphRootAllocated: 172677963776
  graphRootUsed: 180658176
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /opt/storage/podman/volumes
version:
  APIVersion: 4.8.0
  Built: 1706521279
  BuiltTime: Mon Jan 29 09:41:19 2024
  GitCommit: ""
  GoVersion: go1.21.5
  Os: linux
  OsArch: linux/arm64
  Version: 4.8.0

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

No

Additional environment details

FriendlyWrt 23.05.2, based on OpenWrt 23.05.2.

Additional information

It looks like the same issue as reported in https://github.com/containers/podman/issues/16960, so let me answer the questions asked in that issue report.

cat /sys/fs/cgroup/cgroup.controllers returns an empty result, while echo +pids > /sys/fs/cgroup/cgroup.subtree_control returns: -bash: echo: write error: No such file or directory
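One extra check worth mentioning here (a generic kernel interface, not something asked for in the original report): /proc/cgroups lists every compiled-in controller and the hierarchy it is currently bound to, one row per controller under the header shown below. A non-zero value in the hierarchy column means the controller is claimed by a cgroup v1 mount and therefore cannot show up in the v2 cgroup.controllers file.

# cat /proc/cgroups
#subsys_name    hierarchy       num_cgroups     enabled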

rhatdan commented 9 months ago

@giuseppe PTAL

giuseppe commented 9 months ago

Is the kernel compiled without cgroups?

I'm not sure we can even support this configuration; please check whether podman run --cgroups=disabled makes any difference.

Fisiu commented 9 months ago

When I add --cgroups=disabled, it looks like the container starts without an error:

podman run --cgroups=disabled --rm --name caddy -p 8080:80 caddy
{"level":"info","ts":1707136299.9113674,"msg":"using provided configuration","config_file":"/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1707136299.917296,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["//localhost:2019","//[::1]:2019","//127.0.0.1:2019"]}
{"level":"warn","ts":1707136299.9178731,"logger":"http.auto_https","msg":"server is listening only on the HTTP port, so no automatic HTTPS will be applied to this server","server_name":"srv0","http_port":80}
{"level":"info","ts":1707136299.9187245,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0x40004b0500"}
{"level":"info","ts":1707136299.9201748,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
{"level":"info","ts":1707136299.9217377,"msg":"autosaved config (load with --resume flag)","file":"/config/caddy/autosave.json"}
{"level":"info","ts":1707136299.921806,"msg":"serving initial configuration"}
{"level":"info","ts":1707136299.9223545,"logger":"tls","msg":"cleaning storage unit","storage":"FileStorage:/data/caddy"}
{"level":"info","ts":1707136299.9234114,"logger":"tls","msg":"finished cleaning storage units"}

However, curl does not get anything from 127.0.0.1:8080. I have no idea whether this is related to the disabled cgroups.
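Two generic checks that could narrow this down (illustrative commands, not taken from the thread): whether podman recorded the 8080:80 mapping at all, and whether the corresponding forwarding rule exists on the host. With rootful netavark a published port is normally implemented as a DNAT firewall rule rather than a listening socket, so netstat alone will not show it (on an nftables-only OpenWrt image, nft list ruleset can be grepped instead of the iptables output).

# podman port caddy
# iptables -t nat -nL | grep 8080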

However, cgroups are enabled in the kernel. I am not sure if anything else is required in the kernel to make podman work.

# zcat /proc/config.gz | grep CGROUP
CONFIG_CGROUPS=y
# CONFIG_CGROUP_FAVOR_DYNMODS is not set
CONFIG_BLK_CGROUP=y
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
# CONFIG_CGROUP_MISC is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
CONFIG_BLK_CGROUP_RWSTAT=y
# CONFIG_BLK_CGROUP_IOLATENCY is not set
# CONFIG_BLK_CGROUP_IOCOST is not set
# CONFIG_BLK_CGROUP_IOPRIO is not set
# CONFIG_BFQ_CGROUP_DEBUG is not set
CONFIG_NETFILTER_XT_MATCH_CGROUP=m
CONFIG_NET_CLS_CGROUP=m
CONFIG_CGROUP_NET_PRIO=y
CONFIG_CGROUP_NET_CLASSID=y

giuseppe commented 9 months ago

Can it be that they are not mounted?

What do you get with cat /proc/self/mountinfo? Are cgroups mounted correctly under /sys/fs/cgroup?

Fisiu commented 9 months ago

mountinfo shows:

# cat /proc/self/mountinfo 
24 33 0:23 / /sys rw,nosuid,nodev,noexec,relatime - sysfs sysfs rw
25 33 0:24 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
26 33 0:5 / /dev rw,nosuid,relatime - devtmpfs udev rw,size=1986108k,nr_inodes=496527,mode=755
27 26 0:25 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=000
28 33 0:26 / /run rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw,size=399568k,mode=755
33 1 0:27 / / rw,noatime - overlay overlay rw,lowerdir=/root,upperdir=/data/root,workdir=/data/work
34 25 0:30 / /proc rw,nosuid,nodev,noexec,noatime - proc proc rw
35 24 0:31 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - cgroup2 cgroup2 rw,nsdelegate
38 33 0:34 / /tmp rw,nosuid,nodev,noatime - tmpfs tmpfs rw
37 26 0:33 / /dev rw,nosuid,noexec,noatime - tmpfs tmpfs rw,size=512k,mode=755
39 37 0:35 / /dev/pts rw,nosuid,noexec,noatime - devpts devpts rw,mode=600,ptmxmode=000
40 35 0:36 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
41 35 0:37 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
42 35 0:38 / /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
43 35 0:39 / /sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
44 35 0:40 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
45 35 0:41 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
46 35 0:42 / /sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
47 35 0:43 / /sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
48 35 0:44 / /sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup rw,perf_event
49 35 0:45 / /sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
50 35 0:46 / /sys/fs/cgroup/hugetlb rw,relatime - cgroup cgroup rw,hugetlb
51 35 0:47 / /sys/fs/cgroup/pids rw,relatime - cgroup cgroup rw,pids
52 24 0:7 / /sys/kernel/debug rw,noatime - debugfs debugfs rw
53 24 0:48 / /sys/fs/bpf rw,nosuid,nodev,noexec,noatime - bpf bpffs rw,mode=700
54 24 0:49 / /sys/fs/pstore rw,noatime - pstore pstore rw
55 33 259:1 / /opt/storage rw,relatime - ext4 /dev/nvme0n1p1 rw
56 33 259:2 / /opt/docker rw,relatime - ext4 /dev/nvme0n1p2 rw
60 38 0:54 / /tmp/run/blockd rw,relatime - autofs mountd(pid3655) rw,fd=7,pgrp=3655,timeout=30,minproto=5,maxproto=5,indirect
62 28 0:26 /netns /run/netns rw,nosuid,nodev,noexec,relatime shared:1 - tmpfs tmpfs rw,size=399568k,mode=755

giuseppe commented 9 months ago

The mount configuration is wrong. You have both a cgroup2 mount (/ /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - cgroup2 cgroup2 rw,nsdelegate) and, on top of that, the cgroup v1 controllers mounted.

A controller can only be part of cgroupv1 or cgroupv2. In your case, it appears you are using cgroupv1 to manage the controllers, but podman detects cgroupv2 since there is a cgroupv2 mount.

You either need to use cgroup v2, or make sure there is no cgroupv2 mount (use a tmpfs for /sys/fs/cgroup).
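For completeness, a rough sketch of the first option (a pure cgroup v2 setup) on the running system, assuming nothing has been placed into the v1 hierarchies yet and using the twelve v1 controller mounts from the mountinfo above; on OpenWrt these mounts are created at boot, so the same change would also have to be made persistent in whatever script sets them up:

# umount /sys/fs/cgroup/cpuset /sys/fs/cgroup/cpu /sys/fs/cgroup/cpuacct /sys/fs/cgroup/blkio
# umount /sys/fs/cgroup/memory /sys/fs/cgroup/devices /sys/fs/cgroup/freezer /sys/fs/cgroup/net_cls
# umount /sys/fs/cgroup/perf_event /sys/fs/cgroup/net_prio /sys/fs/cgroup/hugetlb /sys/fs/cgroup/pids
# cat /sys/fs/cgroup/cgroup.controllers
# echo +pids > /sys/fs/cgroup/cgroup.subtree_control

After the unmounts, cgroup.controllers should list the freed controllers and the subtree_control write should succeed, at which point podman should no longer need --cgroups=disabled.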

I am closing this issue because this is not a configuration we support, but feel free to comment further.