containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Container completely hangs on podman 1.6+ #21323

Open MarwanTukhta opened 8 months ago

MarwanTukhta commented 8 months ago

Issue Description

We use the podman compose command for our local environments. On Mac (both ARM and x86), a Rails 6 app container stops working: it reaches a certain point and hangs indefinitely. While it is hung, the container's CPU usage climbs to 400% utilization. Adding more resources to the podman machine doesn't help.

This only happens on 1.6 versions; 1.5.3 works perfectly.

Steps to reproduce the issue

  1. Have a 1.6+ podman version installed.
  2. Run podman compose with a setup that creates db, redis, and rails app containers (the app is huge).
  3. This happens on every laptop we tried.

Describe the results you received

The app container hangs indefinitely after running for a while (before it reaches the point where the app starts).

Describe the results you expected

The app container should work as it did with the older podman version.

podman info output

host:
  arch: arm64
  buildahVersion: 1.33.2
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.8-2.fc39.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: '
  cpuUtilization:
    idlePercent: 96.53
    systemPercent: 1.49
    userPercent: 1.98
  cpus: 1
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: coreos
    version: "39"
  eventLogger: journald
  freeLocks: 2042
  hostname: localhost.localdomain
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.6.9-200.fc39.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 86818816
  memTotal: 1834299392
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.9.0-1.fc39.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.9.0
    package: netavark-1.9.0-1.fc39.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.9.0
  ociRuntime:
    name: crun
    package: crun-1.12-1.fc39.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.12
      commit: ce429cb2e277d001c2179df1ac66a470f00802ae
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20231204.gb86afe3-1.fc39.aarch64
    version: |
      pasta 0^20231204.gb86afe3-1.fc39.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc39.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 16m 29.00s
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 2
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 99252940800
  graphRootUsed: 16395927552
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 127
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.8.3
  Built: 1704291040
  BuiltTime: Wed Jan  3 17:10:40 2024
  GitCommit: ""
  GoVersion: go1.21.5
  Os: linux
  OsArch: linux/arm64
  Version: 4.8.3
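
As a reading aid for the output above, here is a minimal Python sketch that pulls the CPU count and memory figures out of podman info text and converts them to MiB. The excerpt is copied from the host section above; the helper names (read_int, summarize) are hypothetical, and the simple line-based parsing assumes each key appears once as `key: <integer>`. It only restates the reported numbers and is not a diagnosis of the hang.

```python
import re

# Values copied from the `host:` section of the podman info output above.
INFO_EXCERPT = """\
host:
  cpus: 1
  memFree: 86818816
  memTotal: 1834299392
  swapFree: 0
  swapTotal: 0
"""

MIB = 1024 * 1024


def read_int(info_text: str, key: str) -> int:
    """Return the integer value of the first `key: <int>` line in the text."""
    pattern = rf"^[ \t]*{re.escape(key)}:[ \t]*(\d+)[ \t]*$"
    match = re.search(pattern, info_text, re.MULTILINE)
    if match is None:
        raise KeyError(key)
    return int(match.group(1))


def summarize(info_text: str) -> dict:
    """Summarize CPU count and memory headroom of the podman machine."""
    total = read_int(info_text, "memTotal")
    free = read_int(info_text, "memFree")
    return {
        "cpus": read_int(info_text, "cpus"),
        "mem_total_mib": total / MIB,
        "mem_free_mib": free / MIB,
        "mem_free_pct": 100.0 * free / total,
        "swap_total_mib": read_int(info_text, "swapTotal") / MIB,
    }


if __name__ == "__main__":
    for name, value in summarize(INFO_EXCERPT).items():
        print(f"{name}: {value:.1f}")
```

For the values reported here, that is one CPU, roughly 1.7 GiB of total memory with under 5% free, and no swap. Whether this is related to the hang is not established anywhere in this thread.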

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

mheon commented 8 months ago

Do you know what is using the CPU when you see the spikes?

MarwanTukhta commented 8 months ago

When I ran htop to check what's using the CPU, the numbers didn't match: htop shows ~20% total CPU utilization, with the top consumers being processes such as a db migration (which is stuck), while Podman Desktop shows 100%+ CPU utilization.

777GE90 commented 8 months ago

I am running macOS Sonoma 14.2.1 (ARM, M1 processor) and am experiencing the same (or a very similar) problem.

I am running a RabbitMQ container, and after some hours or overnight, podman just stops working properly. For example, if I run docker ps, the command hangs indefinitely. If I try to open the RabbitMQ UI in my web browser, the page keeps loading and never finishes.

I have Python workers that connect to RabbitMQ, and they lose their connection and stop running.

In the Podman GUI, the "containers" view sometimes shows no containers, even though I have some running.

The only fix is to run podman machine stop and podman machine start, and then start the RabbitMQ container again.

Interestingly, I am running Podman version 1.5.3 which the creator says works perfectly, but it certainly doesn't for me.
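
The symptom and workaround described in this comment (a listing command that never returns, fixed by stopping and starting the machine) could be scripted roughly as follows. This is a sketch under assumptions, not something proposed in the thread: the function names and the 10-second timeout are made up for illustration, the container name is whatever you pass in, and it probes with podman ps rather than docker ps. The podman machine stop, podman machine start, and podman start subcommands are the ones named in the comment.

```python
import subprocess


def is_responsive(command: list[str], timeout_s: float = 10.0) -> bool:
    """Return True if `command` exits (with any status) within `timeout_s`.

    A hang shows up as TimeoutExpired; subprocess.run kills the child
    process before raising it. A missing executable still raises
    FileNotFoundError, which is deliberately not caught here.
    """
    try:
        subprocess.run(command, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return True


def recovery_commands(container: str) -> list[list[str]]:
    """Commands for the manual workaround: restart the machine, then the container."""
    return [
        ["podman", "machine", "stop"],
        ["podman", "machine", "start"],
        ["podman", "start", container],
    ]


def recover_if_hung(container: str, timeout_s: float = 10.0) -> bool:
    """Run the workaround only if `podman ps` hangs; return True if it ran."""
    if is_responsive(["podman", "ps"], timeout_s):
        return False
    for command in recovery_commands(container):
        # check=True stops the sequence at the first failing command.
        subprocess.run(command, check=True)
    return True
```

Calling recover_if_hung("rabbitmq") (a hypothetical container name) from a periodic job would automate the manual restart. It only works around the hang and says nothing about its cause.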

MarwanTukhta commented 8 months ago

Yes, mine also loads forever in the browser. I also get the GUI bug sometimes, but it's not a big deal.

arixmkii commented 8 months ago

> Interestingly, I am running Podman version 1.5.3 which the creator says works perfectly, but it certainly doesn't for me.

This is the Podman Desktop version; it only reflects the version of the GUI app in use. One needs to inspect the versions of the underlying Podman and Podman machine, which are separate from Podman Desktop.
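
To make the distinction concrete, the engine version can be read from the CLI rather than from the GUI. The sketch below assumes that podman version --format json prints an object with a Client section and, when talking to a machine, a Server section, each carrying a Version field; that matches recent releases as far as I know, but treat the exact shape as an assumption. The function names are hypothetical.

```python
import json
import subprocess


def engine_versions(version_json: str) -> dict:
    """Extract client and server engine versions from the JSON text.

    A missing or null section yields None for that side.
    """
    data = json.loads(version_json)
    versions = {}
    for side in ("Client", "Server"):
        section = data.get(side) or {}
        versions[side.lower()] = section.get("Version")
    return versions


def read_engine_versions() -> dict:
    """Ask the local podman CLI for its versions (requires podman on PATH)."""
    result = subprocess.run(
        ["podman", "version", "--format", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return engine_versions(result.stdout)
```

These are the numbers to quote in a report. For comparison, the podman info output in the issue description lists Version: 4.8.3 for the engine, whereas 1.5.3 and 1.6 are Podman Desktop versions.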

777GE90 commented 8 months ago

I've upgraded my podman machine to 4.9.0 and the podman desktop to 1.6.4 as well now. Will see if it makes any difference.

MarwanTukhta commented 7 months ago

any news?

777GE90 commented 7 months ago

Well, I've upgraded and haven't noticed the same problem since. I want to give it a bit more time to confirm, but so far so good.

777GE90 commented 7 months ago

I think I spoke too soon; the same issue still occurs. It does seem to happen less frequently, though.

777GE90 commented 5 months ago

I've decided to ditch Podman for now and moved back to Colima; Podman is far too unstable to be considered a production-ready app.

giuseppe commented 3 months ago

@mheon should we move this issue to podman-desktop?