containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

podman with custom `static_dir` slow to tear-down containers #19938

Closed - srcshelton closed this issue 1 year ago

srcshelton commented 1 year ago

Issue Description

Whilst podman is able to start containers from even very large images fast enough that it feels interactively instant, shutting them down again takes what feels like a long time.

I wonder whether this is because I'm running the btrfs storage-driver, because I'm not running on a system managed by systemd (and so perhaps some call-back is failing?), or whether some default timeout is being adhered to even if the container processes have exited near the start of the timeout period?

Steps to reproduce the issue

$ sudo podman run -it --rm docker.io/alpine:latest ; date
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob 7264a8db6415 skipped: already exists
Copying config 7e01a0d0a1 done
Writing manifest to image destination
/ # date ; exit
Tue Sep 12 08:40:27 UTC 2023
Tue 12 Sep 09:40:47 BST 2023

Describe the results you received

There is a 20s pause between the container process exiting and the container itself having been fully torn-down.

N.B. This is not a new problem - this system has been upgraded from podman 2.x with the VFS storage-driver, but has always been slow to tear down containers.

Describe the results you expected

In other scenarios, if the process within the container exits then the container is torn down effectively instantly.

podman run has a --stop-timeout option which apparently defaults to 10 seconds, but the observed delay is actually twice that - and in this case the container process has already exited itself, so I'd imagine that any timeout dependent on processes within the container (as opposed to related host processes such as crun/runc) should not apply?
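
(One quick check, not taken from the thread: explicitly set a tiny stop timeout and see whether the tear-down delay changes at all - if it doesn't, the stop timeout is presumably not the mechanism involved. The 1-second value below is arbitrary.)

$ sudo podman run -it --rm --stop-timeout 1 docker.io/alpine:latest ; date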

podman info output

host:
  arch: amd64
  buildahVersion: 1.31.2
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: app-containers/conmon-2.1.8
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.1.8, commit: 00e08f4a9ca5420de733bf542b930ad58e1a7e7d'
  cpuUtilization:
    idlePercent: 95.44
    systemPercent: 2.05
    userPercent: 2.5
  cpus: 8
  databaseBackend: sqlite
  distribution:
    distribution: gentoo
    version: "2.14"
  eventLogger: file
  freeLocks: 2021
  hostname: dellr330
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.5.2-gentoo
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 33188667392
  memTotal: 67331043328
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: app-containers/aardvark-dns-1.7.0
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: app-containers/netavark-1.7.0
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: app-containers/crun-1.9
    path: /usr/bin/crun
    version: |-
      crun version 1.9
      commit: a538ac4ea1ff319bcfe2bf81cb5c6f687e2dc9d3
      rundir: /var/run/crun
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 42949652480
  swapTotal: 42949652480
  uptime: 8h 6m 42.00s (Approximately 0.33 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:5000
    PullFromMirror: ""
  search:
  - docker.io
  - docker.pkg.github.com
  - quay.io
  - public.ecr.aws
  - registry.fedoraproject.org
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 25
    paused: 0
    running: 24
    stopped: 1
  graphDriverName: btrfs
  graphOptions: {}
  graphRoot: /mnt/podman-storage
  graphRootAllocated: 2000397795328
  graphRootUsed: 73760190464
  graphStatus:
    Build Version: Btrfs v6.3.3
    Library Version: "102"
  imageCopyTmpDir: /var/tmp/.private/root
  imageStore:
    number: 120
  runRoot: /var/run/podman
  transientStore: false
  volumePath: /space/podman/volumes
version:
  APIVersion: 4.6.2
  Built: 1693253621
  BuiltTime: Mon Aug 28 21:13:41 2023
  GitCommit: 5db42e86862ef42c59304c38aa583732fd80f178
  GoVersion: go1.20.7
  Os: linux
  OsArch: linux/amd64
  Version: 4.6.2

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

Host is running openrc, not systemd.

Additional information

The size of the image (within reason) does not seem to have a significant effect on the tear-down time - for a 1.08GB image, the result is:

$ sudo podman run -it --rm --entrypoint /bin/sh localhost/gentoo-build:latest ; date
/ # date ; exit
Tue Sep 12 10:03:07 BST 2023
Tue 12 Sep 10:03:27 BST 2023

… so the delay appears to be a consistent 20s.

giuseppe commented 1 year ago

looks like it takes a while to delete the btrfs volume.

Can you try with the overlay driver?

srcshelton commented 1 year ago

It's been like this since before I switched to the btrfs driver - if anything, it was worse with the overlay driver (although I didn't gather any timing on this).

(Is there any way to switch drivers without having to podman system reset and save/restore all images?)

Or, is there any instrumentation I can add/debug options I can enable that will get the innards of podman/libpod to tell us how much time is taken in removing the btrfs subvolume, to rule this in or out?

I believe that subvolume removal should be instant, and whenever I've removed subvolumes manually on the same system, there's never been any noticeable delay.
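
(For reference, a rough way to time subvolume removal outside of podman - the subvolume name below is arbitrary, and /mnt/podman-storage is the graph root from the podman info output above:)

$ sudo btrfs subvolume create /mnt/podman-storage/timing-test
$ time sudo btrfs subvolume delete /mnt/podman-storage/timing-test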

giuseppe commented 1 year ago

you could experiment with overlay on a different store:

podman --storage-driver overlay --root /somewhere/else ...

and see how that performs.
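
For instance (the root path below is arbitrary - any scratch location works), the full experiment could look like:

$ sudo podman --storage-driver overlay --root /tmp/podman-overlay-test run -it --rm docker.io/library/alpine:latest ; date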

You cannot switch drivers without recreating the images, what you could try is to copy them:

$ podman --root /new/storage pull "containers-storage:[btrfs@/old/path+/run/dir]IMAGE"
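
For example, filled in with the btrfs graph root and run root from the podman info output above (the destination root is a placeholder - point it wherever the new overlay store should live):

$ sudo podman --root /new/storage --storage-driver overlay pull "containers-storage:[btrfs@/mnt/podman-storage+/var/run/podman]docker.io/library/alpine:latest"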

Luap99 commented 1 year ago

The 20 seconds is the delay used to wait for the exit code to appear in the db. You can see that with --log-level debug.

So the real problem is likely the podman container cleanup command not working correctly, or maybe something wrong in conmon.

srcshelton commented 1 year ago

The 20 seconds is the delay used to wait for the exit code to appear in the db. You can see that with --log-level debug.

So the real problem is likely the podman container cleanup command not working correctly, or maybe something wrong in conmon.

Setup:

$ sudo podman run -it --rm --log-level debug docker.io/alpine:latest ; date
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called run.PersistentPreRunE(podman run -it --rm --log-level debug docker.io/alpine:latest) 
DEBU[0000] Using conmon: "/usr/libexec/podman/conmon"   
DEBU[0000] Using graph driver btrfs                     
DEBU[0000] Using graph root /mnt/podman-storage         
DEBU[0000] Using run root /var/run/podman               
DEBU[0000] Using static dir /space/podman/static        
DEBU[0000] Using tmp dir /var/run/libpod                
DEBU[0000] Using volume path /space/podman/volumes      
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "btrfs" 
DEBU[0000] Initializing event backend file              
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 25             
DEBU[0000] Successfully loaded 1 networks               
DEBU[0000] Pulling image docker.io/alpine:latest (policy: missing) 
DEBU[0000] Looking up image "docker.io/alpine:latest" in local containers storage 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0000] Trying "docker.io/library/alpine:latest" ... 
DEBU[0000] parsed reference into "[btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Found image "docker.io/alpine:latest" as "docker.io/library/alpine:latest" in local containers storage 
DEBU[0000] Found image "docker.io/alpine:latest" as "docker.io/library/alpine:latest" in local containers storage ([btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb) 
DEBU[0000] exporting opaque data as blob "sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Looking up image "docker.io/library/alpine:latest" in local containers storage 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0000] Trying "docker.io/library/alpine:latest" ... 
DEBU[0000] parsed reference into "[btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Found image "docker.io/library/alpine:latest" as "docker.io/library/alpine:latest" in local containers storage 
DEBU[0000] Found image "docker.io/library/alpine:latest" as "docker.io/library/alpine:latest" in local containers storage ([btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb) 
DEBU[0000] exporting opaque data as blob "sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Looking up image "docker.io/alpine:latest" in local containers storage 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0000] Trying "docker.io/library/alpine:latest" ... 
DEBU[0000] parsed reference into "[btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Found image "docker.io/alpine:latest" as "docker.io/library/alpine:latest" in local containers storage 
DEBU[0000] Found image "docker.io/alpine:latest" as "docker.io/library/alpine:latest" in local containers storage ([btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb) 
DEBU[0000] exporting opaque data as blob "sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Inspecting image 7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb 
DEBU[0000] exporting opaque data as blob "sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] exporting opaque data as blob "sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Inspecting image 7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb 
DEBU[0000] Inspecting image 7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb 
DEBU[0000] Inspecting image 7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb 
DEBU[0000] Inspecting image 7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb 
DEBU[0000] using systemd mode: false                    
DEBU[0000] No hostname set; container's hostname will default to runtime default 
DEBU[0000] Loading seccomp profile from "/usr/share/containers/seccomp.json" 
DEBU[0000] Allocated lock 27 for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0000] parsed reference into "[btrfs@/mnt/podman-storage+/var/run/podman]@7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] exporting opaque data as blob "sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb" 
DEBU[0000] Created container "509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938" 
DEBU[0000] Container "509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938" has work directory "/mnt/podman-storage/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata" 
DEBU[0000] Container "509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938" has run directory "/var/run/podman/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata" 
DEBU[0000] Handling terminal attach                     
INFO[0000] Received shutdown.Stop(), terminating!        PID=18606
DEBU[0000] Enabling signal proxying                     
DEBU[0000] Mounted container "509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938" at "/mnt/podman-storage/btrfs/subvolumes/5e0f0219a6a9ea6b9515a5517ece74918ee2aa48078a1acd9625f0aca195251e" 
DEBU[0000] Made network namespace at /var/run/netns/netns-df26baeb-2b07-198b-6297-4e5edf5a92c1 for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0000] Created root filesystem for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 at /mnt/podman-storage/btrfs/subvolumes/5e0f0219a6a9ea6b9515a5517ece74918ee2aa48078a1acd9625f0aca195251e 
[DEBUG netavark::network::validation] "Validating network namespace..."
[DEBUG netavark::commands::setup] "Setting up..."
[INFO  netavark::firewall] Using iptables firewall driver
[DEBUG netavark::network::bridge] Setup network podman
[DEBUG netavark::network::bridge] Container interface name: eth0 with IP addresses [172.18.95.73/16]
[DEBUG netavark::network::bridge] Bridge name: podman0 with IP addresses [172.18.0.1/16]
[DEBUG netavark::network::core_utils] Setting sysctl value for net.ipv4.ip_forward to 1
[DEBUG netavark::network::core_utils] Setting sysctl value for /proc/sys/net/ipv6/conf/eth0/autoconf to 0
[INFO  netavark::network::netlink] Adding route (dest: 0.0.0.0/0 ,gw: 172.18.0.1, metric 100)
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-1D8721804F16F created on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_ISOLATION_2 created on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_ISOLATION_3 created on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_FORWARD exists on table filter
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_FORWARD exists on table filter
[DEBUG netavark::firewall::varktables::helpers] rule -d 172.18.0.0/16 -j ACCEPT created on table nat and chain NETAVARK-1D8721804F16F
[DEBUG netavark::firewall::varktables::helpers] rule ! -d 224.0.0.0/4 -j MASQUERADE created on table nat and chain NETAVARK-1D8721804F16F
[DEBUG netavark::firewall::varktables::helpers] rule -s 172.18.0.0/16 -j NETAVARK-1D8721804F16F created on table nat and chain POSTROUTING
[DEBUG netavark::firewall::varktables::helpers] rule -d 172.18.0.0/16 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT created on table filter and chain NETAVARK_FORWARD
[DEBUG netavark::firewall::varktables::helpers] rule -s 172.18.0.0/16 -j ACCEPT created on table filter and chain NETAVARK_FORWARD
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-SETMARK exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-SETMARK exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-MASQ exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-MASQ exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-DNAT exists on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-DNAT exists on table nat
[DEBUG netavark::firewall::varktables::helpers] rule -j MARK  --set-xmark 0x2000/0x2000 exists on table nat and chain NETAVARK-HOSTPORT-SETMARK
[DEBUG netavark::firewall::varktables::helpers] rule -j MASQUERADE -m comment --comment 'netavark portfw masq mark' -m mark --mark 0x2000/0x2000 exists on table nat and chain NETAVARK-HOSTPORT-MASQ
[DEBUG netavark::firewall::varktables::helpers] rule -j NETAVARK-HOSTPORT-DNAT -m addrtype --dst-type LOCAL exists on table nat and chain PREROUTING
[DEBUG netavark::firewall::varktables::helpers] rule -j NETAVARK-HOSTPORT-DNAT -m addrtype --dst-type LOCAL exists on table nat and chain OUTPUT
[DEBUG netavark::commands::setup] {
    "podman": StatusBlock {
        dns_search_domains: Some(
            [],
        ),
        dns_server_ips: Some(
            [],
        ),
        interfaces: Some(
            {
                "eth0": NetInterface {
                    mac_address: "fe:84:9e:e3:96:eb",
                    subnets: Some(
                        [
                            NetAddress {
                                gateway: Some(
                                    172.18.0.1,
                                ),
                                ipnet: 172.18.95.73/16,
                            },
                        ],
                    ),
                },
            },
        ),
    },
}
[DEBUG netavark::commands::setup] "Setup complete"
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription 
DEBU[0000] Setting Cgroup path for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 to /libpod_parent/libpod-509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d 
DEBU[0000] Workdir "/" resolved to host path "/mnt/podman-storage/btrfs/subvolumes/5e0f0219a6a9ea6b9515a5517ece74918ee2aa48078a1acd9625f0aca195251e" 
DEBU[0000] Created OCI spec for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 at /mnt/podman-storage/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata/config.json 
DEBU[0000] /usr/libexec/podman/conmon messages will be logged to syslog 
DEBU[0000] running conmon: /usr/libexec/podman/conmon    args="[--api-version 1 -c 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 -u 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 -r /usr/bin/crun -b /mnt/podman-storage/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata -p /var/run/podman/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata/pidfile -n zen_hertz --exit-dir /var/run/libpod/exits --full-attach -l k8s-file:/mnt/podman-storage/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata/ctr.log --log-level debug --syslog -t --conmon-pidfile /var/run/podman/btrfs-containers/509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /mnt/podman-storage --exit-command-arg --runroot --exit-command-arg /var/run/podman --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /space/podman/volumes --exit-command-arg --db-backend --exit-command-arg sqlite --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg btrfs --exit-command-arg --events-backend --exit-command-arg file --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938]"
DEBU[0000] Received: 19209                              
INFO[0000] Got Conmon PID as 19207                      
DEBU[0000] Created container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 in OCI runtime 
DEBU[0000] Adding nameserver(s) from network status of '[]' 
DEBU[0000] Adding search domain(s) from network status of '[]' 
DEBU[0000] Attaching to container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0000] Starting container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 with command [/bin/sh] 
DEBU[0000] Received a resize event: {Width:165 Height:46} 
DEBU[0000] Started container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0000] Notify sent successfully

Tear-down:

/ # date ; exit
Wed Sep 13 09:45:46 UTC 2023
DEBU[0028] Exceeded conmon timeout waiting for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 to exit 
DEBU[0028] Checking if container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 should restart 
DEBU[0028] Removing container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0028] Cleaning up container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0028] Tearing down network namespace at /var/run/netns/netns-df26baeb-2b07-198b-6297-4e5edf5a92c1 for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
[DEBUG netavark::commands::teardown] "Tearing down.."
[INFO  netavark::firewall] Using iptables firewall driver
[INFO  netavark::network::bridge] removing bridge podman0
[DEBUG netavark::commands::teardown] "Teardown complete"
DEBU[0029] Successfully cleaned up container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0029] Unmounted container "509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938" 
DEBU[0029] Removing all exec sessions for container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 
DEBU[0029] Container 509e41f647ed96e58846135706613adbc0d01e0a784b78d912276faa98bdc938 storage is already unmounted, skipping... 
DEBU[0029] Called run.PersistentPostRunE(podman run -it --rm --log-level debug docker.io/alpine:latest) 
DEBU[0029] Shutting down engines                        
Wed 13 Sep 10:46:07 BST 2023

Unfortunately there's no per-debug-statement timing information, but Exceeded conmon timeout waiting for container looks suspicious.

Luap99 commented 1 year ago

Exceeded conmon timeout waiting for container - yes, this is the line printed after 20s of waiting: https://github.com/containers/podman/blob/e8c4d7966b8bf3b2f592e7dbe1bc7d5e556f826d/libpod/container_api.go#L600-L621

Luap99 commented 1 year ago

Can you run podman ps -a in another terminal while it is waiting, then we see what state the container is in.
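
(For instance, polling once a second from a second terminal while the first one hangs:)

$ watch -n 1 sudo podman ps -a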

srcshelton commented 1 year ago

you could experiment with overlay on a different store:

podman --storage-driver overlay --root /somewhere/else ...

and see how that performs.

Ah - you might be on to something here!

Running from tmpfs:

$ sudo podman run -it --rm --storage-driver overlay --root /tmp/podman-alt-root docker.io/library/alpine:latest ; date
/ # date ; exit
Wed Sep 13 09:53:52 UTC 2023
Wed 13 Sep 10:53:52 BST 2023

… running from NVMe, but using the overlay driver:

$ sudo podman run -it --rm --storage-driver overlay --root /mnt/nvme/podman-alt docker.io/library/alpine:latest ; date
/ # date ; exit
Wed Sep 13 09:56:05 UTC 2023
Wed 13 Sep 10:56:05 BST 2023

So the issue does seem to be related to the btrfs storage-driver.

… apologies, my recollection is that it had always been slow even before migrating to btrfs (which was a change I only made a few months ago).

Also, thanks for the image-copy command - I wasn't aware that was an option!

srcshelton commented 1 year ago

Can you run podman ps -a in another terminal while it is waiting, then we see what state the container is in.

docker.io/alpine:latest  /bin/sh  14 seconds ago  Exited (0) 2 seconds ago

… so the state seems to have moved to Exited even though the instance is waiting on conmon (which only appears to happen with the btrfs driver).

rhatdan commented 1 year ago

None of the core maintainers works on the BTRFS driver, so it is unlikely to be fixed unless community members step up.

srcshelton commented 1 year ago

Is it worth filing a bug against https://github.com/containers/conmon, or is that the same audience of maintainers/contributors as here - or indeed, does the above debugging information indicate that the issue is (triggered) within podman rather than conmon?

giuseppe commented 1 year ago

hard to see where the error is without being able to reproduce it.

I'd suggest you manually try running podman container cleanup while the cleanup hangs to see if you can catch any useful error, or use something like /usr/share/bcc/tools/execsnoop to detect what processes run when the container terminates
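
Concretely, something along these lines, where <container-id> is whichever container is currently stuck in its 20-second wait - run execsnoop in one terminal to trace every process executed during the hang, and run the cleanup by hand in another to catch any error it prints:

$ sudo /usr/share/bcc/tools/execsnoop
$ sudo podman container cleanup <container-id>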

srcshelton commented 1 year ago

I've reproduced this issue on a couple of systems which are using the overlay driver, one amd64 and one arm64.

amd64:

$ sudo podman run -it --rm docker.io/alpine:latest ; date
Password:
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob 7264a8db6415 done
Copying config 7e01a0d0a1 done
Writing manifest to image destination
/ # date ; exit
Thu Sep 14 14:18:01 UTC 2023
Thu 14 Sep 15:18:22 BST 2023

Configuration:

host:
  arch: amd64
  buildahVersion: 1.31.2
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: app-containers/conmon-2.1.8
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: 00e08f4a9ca5420de733bf542b930ad58e1a7e7d'
  cpuUtilization:
    idlePercent: 96.75
    systemPercent: 2.85
    userPercent: 0.4
  cpus: 4
  databaseBackend: boltdb
  distribution:
    distribution: gentoo
    version: "2.14"
  eventLogger: file
  freeLocks: 2032
  hostname: apu2e
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.1.4-gentoo
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 1112412160
  memTotal: 4115214336
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: app-containers/aardvark-dns-1.7.0
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: app-containers/netavark-1.7.0
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: app-containers/crun-1.9
    path: /usr/bin/crun
    version: |-
      crun version 1.9
      commit: a538ac4ea1ff319bcfe2bf81cb5c6f687e2dc9d3
      rundir: /var/run/crun
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 2684350464
  swapTotal: 2684350464
  uptime: 5306h 12m 48.00s (Approximately 221.08 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5000
    PullFromMirror: ""
  search:
  - docker.io
  - docker.pkg.github.com
  - quay.io
  - public.ecr.aws
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 6
    paused: 0
    running: 4
    stopped: 2
  graphDriverName: overlay
  graphOptions:
    overlay.ignore_chown_errors: "false"
  graphRoot: /storage/containers/podman/storage
  graphRootAllocated: 31526481920
  graphRootUsed: 16571924480
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /storage/containers/podman/tmp
  imageStore:
    number: 315
  runRoot: /storage/containers/podman/run
  transientStore: false
  volumePath: /storage/containers/podman/volumes
version:
  APIVersion: 4.6.2
  Built: 1693466451
  BuiltTime: Thu Aug 31 08:20:51 2023
  GitCommit: 5db42e86862ef42c59304c38aa583732fd80f178
  GoVersion: go1.20.7
  Os: linux
  OsArch: linux/amd64
  Version: 4.6.2

Concurrent containers:

$ sudo podman ps -a
CONTAINER ID  IMAGE                                          COMMAND               CREATED      STATUS                  PORTS       NAMES
e11efc411a22  localhost/service.sys-apps.irqbalance:1.9.2    --foreground --pi...  13 days ago  Up 13 days                          openrc-irqbalance-1.9.2
a4b0d0ddda5b  localhost/service.app-admin.metalog:20220214   -p /var/run/metal...  13 days ago  Up 13 days                          openrc-metalog-20220214
929329c36baf  localhost/service.net-misc.openntpd:6.2_p3-r2  -d -f /etc/opennt...  13 days ago  Up 13 days                          openrc-openntpd-6.2_p3-r2
8fab39f502fa  localhost/service.net-proxy.privoxy:3.0.33     --no-daemon --pid...  13 days ago  Up 13 days                          openrc-privoxy-3.0.33

arm64:

$ sudo podman run -it --rm docker.io/alpine:latest ; date
Password:
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob 9fda8d8052c6 done
Copying config f6648c04cd done
Writing manifest to image destination
/ # date ; exit
Thu Sep 14 14:18:51 UTC 2023
Thu 14 Sep 15:19:11 BST 2023

Configuration:

host:
  arch: arm64
  buildahVersion: 1.31.2
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: app-containers/conmon-2.1.7
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.1.7, commit: f633919178f6c8ee4fb41b848a056ec33f8d707d'
  cpuUtilization:
    idlePercent: 95.94
    systemPercent: 3.32
    userPercent: 0.74
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: gentoo
    version: "2.14"
  eventLogger: file
  freeLocks: 2044
  hostname: turnpike
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.1.21-v8+
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 2312245248
  memTotal: 8190758912
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: app-containers/aardvark-dns-1.7.0
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: app-containers/netavark-1.7.0
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: app-containers/crun-1.8.5
    path: /usr/bin/crun
    version: |-
      crun version 1.8.5
      commit: b6f80f766c9a89eb7b1440c0a70ab287434b17ed
      rundir: /var/run/crun
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 8589406208
  swapTotal: 8589930496
  uptime: 92h 29m 14.00s (Approximately 3.83 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5000
    PullFromMirror: ""
  search:
  - docker.io
  - docker.pkg.github.com
  - quay.io
  - public.ecr.aws
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 4
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /storage/containers/podman/storage
  graphRootAllocated: 67049664512
  graphRootUsed: 17296240640
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /storage/containers/podman/tmp
  imageStore:
    number: 238
  runRoot: /storage/containers/podman/run.tmp
  transientStore: false
  volumePath: /storage/containers/podman/volumes
version:
  APIVersion: 4.6.1
  Built: 1692036008
  BuiltTime: Mon Aug 14 19:00:08 2023
  GitCommit: f3069b3ff48e30373c33b3f5976f15abf8cfee20
  GoVersion: go1.20.5
  Os: linux
  OsArch: linux/arm64
  Version: 4.6.1

Concurrent containers:

$ sudo podman ps -a
CONTAINER ID  IMAGE                                          COMMAND               CREATED     STATUS      PORTS       NAMES
af3fa1acec15  localhost/service.app-admin.syslog-ng:4.2.0    --cfgfile /etc/sy...  2 days ago  Up 2 days               openrc-syslog-ng-4.2.0
93c3ee4b92c5  localhost/service.net-dns.bind:9.16.41         -u named -f           2 days ago  Up 2 days               openrc-bind-9.16.41
004d89162a52  localhost/service.net-misc.openntpd:6.2_p3-r2  -d -f /etc/opennt...  2 days ago  Up 2 days               openrc-openntpd-6.2_p3-r2
8c8cc3dce354  localhost/service.net-misc.dhcp:4.4.3_p1-r4    -cf /etc/dhcp/dhc...  2 days ago  Up 2 days               openrc-dhcp-4.4.3_p1-r4

… which is interesting: on the original system the problem went away when temporarily changing to the overlay driver, which suggests that the problem is not with the btrfs (or overlay) driver itself - but nevertheless, using a pristine storage directory makes it go away.

Also, both boltdb and sqlite database backends are affected.

Given this, could there be any link between total image size within podman storage and this delay?

Original system:

$ sudo podman info --debug | grep -F podman | awk '{ print $2 }' | xargs -rn 1 sudo du -hxs
1.5G    /mnt/podman-storage
113M    /space/podman/volumes

New amd64 system:

$ sudo podman info --debug | grep -F '/storage/' | awk '{ print $2 }' | xargs -rn 1 sudo du -hxs
16G     /storage/containers/podman/storage
8.0K    /storage/containers/podman/tmp
120K    /storage/containers/podman/run
42M     /storage/containers/podman/volumes

arm64 system:

16G     /storage/containers/podman/storage
12K     /storage/containers/podman/tmp
72K     /storage/containers/podman/run.tmp
4.0K    /storage/containers/podman/volumes
vrothberg commented 1 year ago

I am absolutely incapable of reproducing. Can you run /usr/share/bcc/tools/execsnoop in parallel and share which processes are being run? Does it also happen without --rm? If so, can you share podman ps --all afterwards?

srcshelton commented 1 year ago

$ sudo podman run --net none docker.io/library/alpine:latest 'date'; date
Mon Sep 18 14:14:54 UTC 2023
Mon 18 Sep 15:15:15 BST 2023

Output of podman ps -a:

$ sudo podman ps -a | grep 'alpine.*date'
61eef5c90f21  docker.io/library/alpine:latest  date  38 seconds ago  Exited (0) 39 seconds ago  upbeat_swanson

There's no significant delay removing this Exited container:

$ time sudo podman rm upbeat_swanson
upbeat_swanson

real    0m0.216s
user    0m0.002s
sys     0m0.003s
$ sudo podman run --net none docker.io/library/alpine:latest 'exit'
Error: crun: executable file `exit` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found

… although in this case, the container persists with state Created. If I then remove this Created container, the result is also almost instant:

$ time sudo podman rm 1364702f0f2a
1364702f0f2a

real    0m0.063s
user    0m0.003s
sys     0m0.002s

I'm not familiar with 'execsnoop', but I'll take a look now...

vrothberg commented 1 year ago

Thanks for sharing!

I'm not familiar with 'execsnoop', but I'll take a look now...

It's part of bcc-tools and a veeeeery helpful script.

srcshelton commented 1 year ago

I feel I'm making progress - the bcc utilities require LLVM/clang which is quite chunky, so I thought I'd try running it within a container.

With a default/unconfigured installation, as expected, podman within a container exits instantly.

However, if I do nothing more than import my configuration, change the driver to overlay, and create the referenced directories, then the problem returns.

… so the problem only appears with some non-default setting in /etc/containers/ which I seem to have set.

I'll keep digging...
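
One way to do this sort of bisection - purely a sketch; the quay.io/podman/stable image is an assumption on my part, not necessarily what was used here - is to run podman inside a privileged container and add the host's config one setting at a time:

$ sudo podman run -it --rm --privileged quay.io/podman/stable sh
(then, inside that container, copy one setting at a time into /etc/containers/containers.conf and re-run: podman run --rm docker.io/library/alpine:latest date)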

srcshelton commented 1 year ago

… it's something in containers.conf, which I have as follows:

[containers]
init_path = "/usr/libexec/podman/catatonit"
[network]
network_backend = "netavark"
default_subnet="172.18.0.0/16"
[engine]
cgroup_manager = "cgroupfs"
conmon_path = [
  "/usr/libexec/podman/conmon",
  "/usr/bin/conmon"
]
database_backend="sqlite"
env = ["TMPDIR=/space/podman/tmp"]
events_logger = "file"
events_logfile_path = "/var/log/podman/events.log"
events_logfile_max_size = 10485760
infra_command = "/usr/bin/pause"
infra_image = "gcr.io/google_containers/pause-amd64:3.2"
remote_uri= "unix://var/run/podman/podman.sock"
static_dir = "/space/podman/static"
tmp_dir = "/var/run/libpod"
volume_path = "/space/podman/volumes"
runtime = "crun"
runtime_supports_json = ["crun", "runc"]
runtime_supports_nocgroups = ["crun"]
runtime_supports_kvm = []
image_copy_tmp_dir = "/space/podman/tmp"
[engine.runtimes]
runc = [
  "/usr/bin/runc"
]
crun = [
  "/usr/bin/crun"
]
kata = []

srcshelton commented 1 year ago

Urgh - I've found the specific issue:

  • If static_dir is set in /etc/containers/containers.conf, then there's a 20-second delay on container exit which goes away once the setting is removed.

… which I suspect means that some component has hard-coded the default (/var/lib/containers/storage/libpod?) and is breaking if the expected content isn't there?

Rough reproducer:

In a fresh container, install podman and then run:

[ -s /etc/containers/policy.json.example ] && [ ! -s /etc/containers/policy.json ] && cp /etc/containers/policy.json.example /etc/containers/policy.json
podman stop -a ;
podman rm -v -a ;
podman image rm -f -a ;
podman system reset -f ;
[ -s /etc/containers/containers.conf ] && rm -r $( sed 's/#.*$//' /etc/containers/containers.conf | grep -o '"/[^"]\+"' | grep -v -e 'bin/' -e 'libexec/' -e '\.log"' ) ;
podman system reset -f ;
[ -s /etc/containers/containers.conf ] && mkdir -p $( sed 's/#.*$//' /etc/containers/containers.conf | grep -o '"/[^"]\+"' | grep -v -e 'bin/' -e 'libexec/' -e '\.log"' ) ;
podman image pull docker.io/library/alpine:latest ;
podman run --net none docker.io/library/alpine:latest 'date' ; date

… then create /etc/containers/containers.conf containing:

[engine]
static_dir = "/tmp/podman/static"

… and re-run the above: the path used for static_dir doesn't matter - if the setting is absent, containers exit instantly; if it's present, there's a 20-second timeout.

(This is with crun, I'm not sure whether this is related. I've not tried with runc installed)
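
A minimal A/B toggle of just that setting inside the fresh reproducer container, re-using the reproducer's run command and the same example path:

$ rm /etc/containers/containers.conf ; podman run --net none docker.io/library/alpine:latest 'date' ; date
$ printf '[engine]\nstatic_dir = "/tmp/podman/static"\n' > /etc/containers/containers.conf
$ podman run --net none docker.io/library/alpine:latest 'date' ; date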

srcshelton commented 1 year ago

Furthermore, the specified static_dir contains only a bolt_state.db which hasn't changed since May.

It appears that the actual bolt_state.db is now in (graph root)/libpod regardless of the static_dir setting.

Is static_dir now vestigial?

Interestingly, a system reset may be needed to fix this, as commenting out static_dir on the host system (with running containers) and trying to run date in Alpine still gives a 20s delay.


I also have:

database_backend="sqlite"

… in /etc/containers/containers.conf, and yet I see /var/lib/containers/cache (is this a hard-coded path which can't be configured?) contains:

# ls -l cache/
total 74096
-rw------- 1 root root 92598272 Sep 18 10:41 blob-info-cache-v1.boltdb
-rw-r--r-- 1 root root    32768 Sep 16 15:27 blob-info-cache-v1.sqlite

… why do I have a boltdb file which is more up to date (and much larger) than the sqlite one - I thought sqlite was a complete replacement?

srcshelton commented 1 year ago

Update: Removing static_dir from containers.conf isn't enough to resolve the issue on a non-fresh system: even after a reboot, the 20s delay persists.

(Although running with --transient-store also prevents the delay from occurring)
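
(For completeness, the run that avoids the delay looks like this - --transient-store here is podman's global flag, the same one visible as --transient-store=false in the conmon exit-command arguments above:)

$ sudo podman --transient-store run -it --rm docker.io/library/alpine:latest ; date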

vrothberg commented 1 year ago
  • If static_dir is set in /etc/containers/containers.conf, then there's a 20-second delay on container exit which goes away once the setting is removed.

I can reproduce as well with this setting. Great work tracking this down, @srcshelton! Why this happens needs some investigation.

… why do I have a boltdb file which is more up to date (and much larger) than the sqlite one - I thought sqlite was a complete replacement?

I assume you changed the db_backend, so the previous one will continue lingering around.

vrothberg commented 1 year ago

I've observed that the cleanup process does not see (or read) the containers.conf file correctly: I configured mine in the $HOME directory and noticed that podman container cleanup doesn't pick up the configured value. I'll continue digging.
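
(A hypothetical way to test that from the outside - CONTAINERS_CONF is the standard override environment variable, and <container-id> is a container that has just exited - would be to point the cleanup command at the config explicitly and see whether the behaviour changes:)

$ sudo CONTAINERS_CONF=$HOME/.config/containers/containers.conf podman container cleanup <container-id>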

srcshelton commented 1 year ago

I can raise a new bug report for this, but I suspect it might be the same root-cause:

Even with database_backend="sqlite" in /etc/containers/containers.conf, I'm still seeing /var/lib/containers/cache/blob-info-cache-v1.boltdb created and used, and /var/lib/containers/cache/blob-info-cache-v1.sqlite unchanged in the past 3 days, despite several reboots and container restarts.

vrothberg commented 1 year ago

Yes, there is something really strange going on. container cleanup doesn't seem to pick up the correct values from containers.conf. I'll be buried in meetings today but will take a look.

Cc: @mheon

vrothberg commented 1 year ago

Even with database_backend="sqlite" in /etc/containers/containers.conf, I'm still seeing /var/lib/containers/cache/blob-info-cache-v1.boltdb

This is expected. The blob-info-cache is something the image library was using.