containers / conmon

An OCI container runtime monitor.
Apache License 2.0

Container creation fails as root but works as rootless user (podman 4.8.3, conmon 2.1.8) #493

Open nktrmb opened 8 months ago

nktrmb commented 8 months ago

Creating a container as root fails, while the same or a similar action as a rootless user creates the container without issues. This happens both with my custom container and with a plain hello-world container.

Error from root user: Error: container create failed (no logs from conmon): conmon bytes "": readObjectStart: expect { or n, but found , error found in #0 byte of ...||..., bigger context ...||...
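
For context on the message itself: `readObjectStart: expect { or n` is a JSON parser reporting that the first byte it received does not start a JSON object. The sketch below reproduces that class of failure in Python. It assumes (this is not confirmed in the thread) that podman expects a small JSON object from conmon and instead received an empty or NUL-filled buffer. The `parse_sync_message` helper and the `data` field are hypothetical names used only for illustration.

```python
import json


def parse_sync_message(raw: bytes) -> dict:
    """Decode a JSON object from a raw buffer, as a parent process might.

    Hypothetical helper: the real podman/conmon exchange is not shown in
    this thread, so the message shape here is an assumption.
    """
    text = raw.decode("utf-8", errors="replace")
    try:
        message = json.loads(text)
    except json.JSONDecodeError as exc:
        # Report the offending character, similar in spirit to "found \x00".
        found = repr(text[exc.pos]) if exc.pos < len(text) else "end of input"
        raise ValueError(
            f"expected a JSON object at byte {exc.pos}, found {found}"
        ) from exc
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    return message


if __name__ == "__main__":
    print(parse_sync_message(b'{"data": 1394}'))
    for bad in (b"", b"\x00"):
        try:
            parse_sync_message(bad)
        except ValueError as err:
            print(f"{bad!r}: {err}")
```

If that assumption holds, the JSON error is only a symptom: conmon exited or failed before writing anything useful, so the underlying cause has to be found in conmon's or the runtime's own logs.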

podman info:

host:
  arch: arm
  buildahVersion: 1.33.2
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: 6d88cb3672a3dceeb4b045a92dc4d4285c9f4efd'
  cpuUtilization:
    idlePercent: 49.84
    systemPercent: 22.96
    userPercent: 27.21
  cpus: 2
  databaseBackend: sqlite
  distribution:
    codename: nanbield
    distribution: trmb-judo
    version: 0.7.0.dev0-2024.1.4
  eventLogger: journald
  freeLocks: 2047
  hostname: mp1010
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.1.69-g-g
  linkmode: dynamic
  logDriver: journald
  memFree: 3126398976
  memTotal: 4098801664
  networkBackend: cni
  networkBackendInfo:
    backend: cni
    dns: {}
  ociRuntime:
    name: runc
    package: Unknown
    path: /usr/bin/runc
    version: |-
      runc version 1.1.10+dev
      commit: v1.1.10-2-gf3446b1e-dirty
      spec: 1.0.2-dev
      go: go1.20.13
      libseccomp: 2.5.5
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.2.0-beta.0+dev
      commit: unknown
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 0
  swapTotal: 0
  uptime: 0h 1m 20.00s
  variant: v7
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 0
    stopped: 5
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /root/.local/share/containers/storage
  graphRootAllocated: 28565897216
  graphRootUsed: 1130864640
  graphStatus:
    Backing Filesystem: overlayfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /root/.local/share/containers/storage/temp
  transientStore: false
  volumePath: /root/.local/share/containers/storage/volumes
version:
  APIVersion: 4.8.3-dev
  Built: 1702297875
  BuiltTime: Mon Dec 11 12:31:15 2023
  GitCommit: 0ec4c8b1d7d6fc273d50064f87a6c0b2d269fdcd
  GoVersion: go1.20.13
  Os: linux
  OsArch: linux/arm
  Version: 4.8.3-dev

I also updated conmon to 2.1.10 and tried different versions of podman (4.7.3 through latest), with the same result. I originally had the storage locations at the defaults, /var/lib/containers/storage and /run/containers/storage, but the error occurred there as well.

uname -a:

Linux device-name 6.1.69-g-g #1 SMP PREEMPT Wed Feb 7 15:26:29 UTC 2024 armv7l GNU/Linux

nktrmb commented 8 months ago

After further review, I managed to create containers as root, but only after downgrading to cgroup v1. I have similar firmware on another device (similar in that it is the base Yocto image for the device I am using), and that works with everything on cgroup v2. I am currently looking into kernel configuration options that might be needed on the main device.
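
One way to compare the two devices is to look at which cgroup v2 controllers each kernel exposes. The podman info output above lists only `memory` and `pids`. The sketch below reads `/sys/fs/cgroup/cgroup.controllers` and reports controllers that are absent, with the kernel option that usually enables each one. The set of "expected" controllers is an assumption for illustration, and nothing in this thread establishes that a missing controller is the cause of the failure.

```python
from pathlib import Path

# Controllers commonly expected for containers, mapped to the kernel config
# option that enables each one. This list is an assumption for illustration.
EXPECTED = {
    "cpu": "CONFIG_CGROUP_SCHED",
    "cpuset": "CONFIG_CPUSETS",
    "io": "CONFIG_BLK_CGROUP",
    "memory": "CONFIG_MEMCG",
    "pids": "CONFIG_CGROUP_PIDS",
}


def missing_controllers(controllers_text: str) -> dict:
    """Return expected controllers absent from a cgroup.controllers listing."""
    available = set(controllers_text.split())
    return {name: opt for name, opt in EXPECTED.items() if name not in available}


if __name__ == "__main__":
    path = Path("/sys/fs/cgroup/cgroup.controllers")
    # Fall back to the controllers reported by podman info in this issue.
    text = path.read_text() if path.exists() else "memory pids"
    for name, option in missing_controllers(text).items():
        print(f"controller {name!r} not available; check {option}")
```

Running this on both the working and the failing device gives a quick diff of controller availability before digging through the full kernel configuration.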

haircommander commented 7 months ago

Do you have output from conmon to share? If you're using podman, it should be in the journal.

nktrmb commented 6 months ago

Logs from journalctl around the conmon invocation (I used '/' to filter the logs and also tried grepping for 'conmon', but there were no additional logs):

Feb 27 17:28:06 mp1010 kernel: cni-podman0: port 1(veth89018c49) entered forwarding state
Feb 27 17:28:06 mp1010 systemd-networkd[551]: veth89018c49: Gained carrier
Feb 27 17:28:06 mp1010 NetworkManager[542]: <info>  [1709054886.9482] device (veth89018c49): carrier: link connected
Feb 27 17:28:06 mp1010 NetworkManager[542]: <info>  [1709054886.9493] device (cni-podman0): carrier: link connected
Feb 27 17:28:06 mp1010 systemd-networkd[551]: cni-podman0: Gained carrier
Feb 27 17:28:06 mp1010 avahi-daemon[448]: Joining mDNS multicast group on interface cni-podman0.IPv4 with address 10.88.0.1.
Feb 27 17:28:06 mp1010 avahi-daemon[448]: New relevant interface cni-podman0.IPv4 for mDNS.
Feb 27 17:28:06 mp1010 avahi-daemon[448]: Registering new address record for 10.88.0.1 on cni-podman0.IPv4.
Feb 27 17:28:07 mp1010 systemd[1]: Created slice Slice /machine.
Feb 27 17:28:07 mp1010 systemd[1]: Started libpod-conmon-ad3dabb2995145d288173985ae7c855a36cd54930afa9a840939342d678795a7.scope.
Feb 27 17:28:07 mp1010 systemd[1]: Started libcontainer container ad3dabb2995145d288173985ae7c855a36cd54930afa9a840939342d678795a7.
Feb 27 17:28:07 mp1010 systemd[1]: libpod-ad3dabb2995145d288173985ae7c855a36cd54930afa9a840939342d678795a7.scope: Deactivated successfully.
Feb 27 17:28:07 mp1010 conmon[1394]: conmon ad3dabb2995145d28817 <error>: Failed to receive console file descriptor Communication error on send
Feb 27 17:28:07 mp1010 systemd-networkd[551]: veth89018c49: Link DOWN
Feb 27 17:28:07 mp1010 systemd-networkd[551]: veth89018c49: Lost carrier
Feb 27 17:28:07 mp1010 kernel: cni-podman0: port 1(veth89018c49) entered disabled state
Feb 27 17:28:07 mp1010 kernel: device veth89018c49 left promiscuous mode
Feb 27 17:28:07 mp1010 kernel: cni-podman0: port 1(veth89018c49) entered disabled state
Feb 27 17:28:07 mp1010 systemd-networkd[551]: cni-podman0: Lost carrier
Feb 27 17:28:07 mp1010 systemd[1]: run-netns-netns\x2d08d81828\x2db6e2\x2d5b34\x2d653f\x2d907d6a0da988.mount: Deactivated successfully.
Feb 27 17:28:07 mp1010 systemd[1]: data-root-podman-containers-storage-overlay-ff40ac06a0334fef199ba31f3692b2bff4ab123fee4e8c2d0201631ca67e97c5-merged.mount: Deactivated successfully.
Feb 27 17:28:07 mp1010 systemd[1]: data-root-podman-containers-storage-overlay\x2dcontainers-ad3dabb2995145d288173985ae7c855a36cd54930afa9a840939342d678795a7-userdata-shm.mount: Deactivated successfully.
Feb 27 17:28:07 mp1010 systemd[1]: data-root-podman-containers-storage-overlay.mount: Deactivated successfully.
Feb 27 17:28:07 mp1010 systemd[1]: libpod-conmon-ad3dabb2995145d288173985ae7c855a36cd54930afa9a840939342d678795a7.scope: Deactivated successfully.
Feb 27 17:28:08 mp1010 systemd-networkd[551]: cni-podman0: Gained IPv6LL
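
Most of the lines above come from systemd, the kernel, and network daemons. As a minimal sketch, the filter below keeps only lines written by a conmon process, optionally narrowed to a container ID prefix. The `conmon_lines` function and the `SAMPLE` list are illustrative names, and the regular expression assumes the syslog-style `conmon[PID]:` prefix seen in this excerpt.

```python
import re

# Matches lines such as "Feb 27 17:28:07 mp1010 conmon[1394]: ..."
CONMON_LINE = re.compile(r"\sconmon\[\d+\]:\s")

# Three lines copied from the journal excerpt in this issue.
SAMPLE = [
    "Feb 27 17:28:07 mp1010 systemd[1]: Started libpod-conmon-"
    "ad3dabb2995145d288173985ae7c855a36cd54930afa9a840939342d678795a7.scope.",
    "Feb 27 17:28:07 mp1010 conmon[1394]: conmon ad3dabb2995145d28817 <error>: "
    "Failed to receive console file descriptor Communication error on send",
    "Feb 27 17:28:07 mp1010 systemd-networkd[551]: veth89018c49: Link DOWN",
]


def conmon_lines(journal_lines, id_prefix=""):
    """Return lines emitted by conmon that also contain id_prefix."""
    return [
        line
        for line in journal_lines
        if CONMON_LINE.search(line) and id_prefix in line
    ]


if __name__ == "__main__":
    for line in conmon_lines(SAMPLE, id_prefix="ad3dabb2995145d28817"):
        print(line)
```

Applied to the excerpt, this leaves the single conmon entry, "Failed to receive console file descriptor", which is the only message conmon itself logged for this container.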

Below is a snippet of the output with --log-level=debug when trying to run any container.

DEBU[0000] /usr/bin/conmon messages will be logged to syslog 
DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -c 1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349 -u 1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349 -r /usr/bin/runc -b /data/root/podman/containers/storage/overlay-containers/1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349/userdata -p /run/containers/storage/overlay-containers/1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349/userdata/pidfile -n magical_diffie --exit-dir /run/libpod/exits --persist-dir /run/libpod/persist/1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349 --full-attach -s -l journald --log-level debug --syslog -t --conmon-pidfile /run/containers/storage/overlay-containers/1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /data/root/podman/containers/storage --exit-command-arg --runroot --exit-command-arg /run/containers/storage --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/libpod --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg cni --exit-command-arg --volumepath --exit-command-arg /data/root/podman/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg sqlite --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349]"
INFO[0000] Running conmon under slice machine.slice and unitName libpod-conmon-1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349.scope 
DEBU[0000] Cleaning up container 1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349 
DEBU[0000] Tearing down network namespace at /run/netns/netns-404f2334-04f5-e68a-6139-f79101f3f101 for container 1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349 
DEBU[0001] Unmounted container "1de7586de255062a5230fc5a3f3c5a44663b1f4419e922433f8cf87978dc2349" 
DEBU[0001] ExitCode msg: "container create failed (no logs from conmon): conmon bytes \"\": readobjectstart: expect { or n, but found \x00, error found in #0 byte of ...||..., bigger context ...||..." 
Error: container create failed (no logs from conmon): conmon bytes "": readObjectStart: expect { or n, but found , error found in #0 byte of ...||..., bigger context ...||...

Version:

~# podman --version
podman version 5.0.2-dev
~# conmon --version
conmon version 2.1.10
commit: affab49967eb62f75d2a47398344ab053326289f