Connection refused after container restart (but no problem with system restart)

tholeb commented 1 year ago

Issue Description

I'm running multiple containers (started with Ansible). Creating them works fine, I can restart them, etc. But for the past few weeks, I've been experiencing something very strange. When I (re)start the Raspberry Pi, all containers work well, with no errors or warnings in the logs. However, when I restart a container itself, I get a 502 Bad Gateway from nginx (which is not a container) and a plain "connection refused" using curl, even though there are no errors or warnings in the logs. From inside the container, it has internet access and can curl other containers through the gateway (10.88.0.1), but not localhost (127.0.0.1 and ::1) for some reason.

Steps to reproduce the issue

I don't think it's related to a specific container (2 out of 8 of my containers are affected by this problem), but here is how I reproduce it:

Pull and start https://github.com/bastienwirtz/homer
Restart the container
Try to curl it
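
For reference, the steps above roughly correspond to the commands in this sketch. It is not the exact sequence used in this report: the container name `homer`, the image `b4bz/homer:latest` and the 8080:8080 port mapping are taken from the Ansible task in this issue, while the function name `reproduce` and the two-second wait are assumptions made for illustration.

```shell
# Sketch of the reproduction steps; 'reproduce' is an illustrative name.
# Assumes the container name, image and port from the Ansible task in this issue.
reproduce() {
    name="${1:-homer}"
    port="${2:-8080}"

    # 1. Pull and start the container
    podman pull docker.io/b4bz/homer:latest
    podman run -d --name "$name" -p "$port:8080" docker.io/b4bz/homer:latest

    # 2. Restart the container itself (not the host)
    podman restart "$name"

    # Give the web server a moment to come back up (arbitrary delay)
    sleep 2

    # 3. Try to reach it from the host; curl exits non-zero on "connection refused"
    if curl -sS -o /dev/null "http://127.0.0.1:$port/"; then
        echo "reachable"
    else
        echo "unreachable"
        return 1
    fi
}
```

Calling `reproduce homer 8080` as root should print `unreachable` on an affected host and `reachable` otherwise.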

Describe the results you received

As explained above, I get a 502 by nginx, and a connection refused when I curl the affected container(s).

Describe the results you expected

When I restart the container, I expect to be able to access it.

podman info output

host:
  arch: arm64
  buildahVersion: 1.23.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpus: 4
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: journald
  hostname: raspberry
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.15.0-1027-raspi
  linkmode: dynamic
  logDriver: journald
  memFree: 6287241216
  memTotal: 8186318848
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: true
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 0
  swapTotal: 0
  uptime: 1h 48m 26.02s (Approximately 0.04 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 8
    paused: 0
    running: 8
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 8
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.17.3
  OsArch: linux/arm64
  Version: 3.4.4

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

No

Additional environment details

System: Raspberry Pi 4 (8 GB) with Ubuntu 22.04 LTS

Additional information

How I deploy the container using Ansible:

- name: Homer - Run container using podman
  containers.podman.podman_container:
      name: homer
      image: b4bz/homer:latest
      state: started
      recreate: true
      restart_policy: on-failure
      user: 0:0
      volumes:
          - "{{ homer_data }}:/www/assets:rw"
      ports: "8080:8080"
      memory: "512m"
      generate_systemd:
          path: /etc/systemd/system
          new: true
  notify:
      - Systemd daemon reload
      - Restart homer

Luap99 commented 1 year ago

My guess is that this is fixed in v4.0, so please try it with a newer version. You can check whether the (rootless-netns) slirp4netns process is still running; systemd will kill it when the unit that spawned it is stopped. At least that was the bug I fixed a while ago: https://github.com/Luap99/libpod/commit/8d0fb0a4ed80eabf02b82c22d4d2b637d6a84da4

tholeb commented 1 year ago

So I downloaded the newest version of podman, but I still have the problem. Restarting my Pi starts the container without any problems, but restarting the container itself causes a 502/connection refused.

Also, it appears that there is neither a rootless-netns nor a slirp4netns process running at all. I ran ps after a container restart and after a system restart, and saw nothing in either case.
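
A quick way to repeat this check is sketched below. It assumes `pgrep` (from procps) is installed, and the function name `slirp_running` is made up for illustration. Note that slirp4netns is normally only used for rootless networking, so finding no such process on a rootful setup is not necessarily a sign of a problem.

```shell
# Illustrative helper: report whether any slirp4netns process is running.
slirp_running() {
    # pgrep -a prints PID and full command line; exit status 1 means no match
    if pgrep -a slirp4netns; then
        return 0
    fi
    echo "no slirp4netns process found"
    return 1
}
```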

root@raspberry:~# podman info
host:
  arch: arm64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon_2:2.1.7-0ubuntu22.04+obs15.20_arm64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 90.39
    systemPercent: 5.88
    userPercent: 3.73
  cpus: 4
  databaseBackend: boltdb
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: journald
  hostname: raspberry
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.15.0-1027-raspi
  linkmode: dynamic
  logDriver: journald
  memFree: 6431625216
  memTotal: 8186318848
  networkBackend: cni
  ociRuntime:
    name: crun
    package: crun_101:1.8.4-0ubuntu22.04+obs55.6_arm64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.4
      commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.0.1-2_arm64
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 0
  swapTotal: 0
  uptime: 0h 10m 58.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 9
    paused: 0
    running: 9
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 984236957696
  graphRootUsed: 11777257472
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 15
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.5.0
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.18.1
  Os: linux
  OsArch: linux/arm64
  Version: 4.5.0

Luap99 commented 1 year ago

Sorry, I missed the fact that you run as root. Thanks for checking with the latest version, I will reopen this. It looks like you run the containers via systemd. How do you restart them?

tholeb commented 1 year ago

I restart them with systemd or via cockpit (web UI) using the podman plugin.

I tried restarting them using podman restart but the issue persists.

Luap99 commented 1 year ago

If the container is run as a systemd unit, I would expect that you have to do systemctl restart <name>. I would expect a restart outside the unit to not work properly, but frankly I have never tried.
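
A minimal sketch of restarting through the unit rather than through podman directly. The unit name `container-homer.service` is an assumption based on the default naming of generated units (prefix `container`, then the container name); check the actual file name under /etc/systemd/system. The function name `restart_unit` is illustrative.

```shell
# Illustrative helper: restart a container through its systemd unit and
# report the unit state. The default unit name is an assumption.
restart_unit() {
    unit="${1:-container-homer.service}"
    systemctl restart "$unit" || return 1
    # is-active prints the state (e.g. "active") and exits 0 only if active
    systemctl is-active "$unit"
}
```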

tholeb commented 1 year ago

Well, I tried every restart method I have (webUI, podman, systemd), and the issue persists. I'll revert to 3.4.4.

tholeb commented 1 year ago

After a complete reinstallation of my Pi, it turns out the problem comes from pivpn. I'm not sure why, but I'll close the issue. Thanks for the help.

Luap99 commented 1 year ago

@tholeb I have no idea what pivpn does under the hood but if it flushes firewall rules you might want to run podman network reload --all to restore the podman firewall rules.
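
One way to apply this suggestion is sketched below, as a heuristic rather than a definitive check. It assumes the iptables firewall driver, and that podman's NAT rules jump to chains whose names start with `CNI-` (CNI backend, as in the podman info output above) or `NETAVARK-` (netavark backend); the function name `restore_podman_rules` is illustrative. It must run as root.

```shell
# Illustrative helper: if no NAT rule jumps to a podman-managed chain,
# reload the network rules for all running containers.
# Assumption: chain names start with "CNI-" or "NETAVARK-".
restore_podman_rules() {
    # iptables -S prints rules in iptables-save format, e.g. "-A PREROUTING ... -j CNI-HOSTPORT-DNAT"
    if iptables -t nat -S | grep -Eq '^-A .* -j (CNI-|NETAVARK-)'; then
        echo "podman NAT rules present"
    else
        echo "podman NAT rules missing, reloading"
        podman network reload --all
    fi
}
```

The check looks at rules rather than chain names because a plain flush removes rules but leaves user-defined chains in place.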
