
container restart fails when using pasta + port publish #23737

Open jpalus opened 1 month ago

jpalus commented 1 month ago

Issue Description

podman restart <container> fails with:

WARN[0010] StopSignal SIGTERM failed to stop container test in 10 seconds, resorting to SIGKILL 
Error: pasta failed with exit code 1:
Failed to bind port 2222 (Address already in use) for option '-t 2222-2222:22-22', exiting

when using pasta with a published port. It works fine with slirp4netns, though.

Steps to reproduce the issue

  1. Create container:
    podman create --name test --network pasta --publish 2222:22 docker.io/fedora sleep 60
  2. Start container:
    podman start test
  3. Restart container:
    podman restart test
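
When the restart fails, the old pasta instance can typically still be seen holding the published port (a hedged check; assumes pgrep and ss from iproute2 are installed):

    pgrep -a pasta           # the previous pasta process may still be running
    ss -tlnp | grep 2222     # and still bound to the published port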

Describe the results you received

Restart fails:

Error: pasta failed with exit code 1:
Failed to bind port 2222 (Address already in use) for option '-t 2222-2222:22-22', exiting

Describe the results you expected

Restart succeeds.

podman info output

host:
  arch: arm64
  buildahVersion: 1.37.2
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: 21bbf65b0425b568f030373dafc0840e1d16c4fc'
  cpuUtilization:
    idlePercent: 93.72
    systemPercent: 2.29
    userPercent: 3.99
  cpus: 6
  databaseBackend: sqlite
  distribution:
    distribution: pld
    version: "3.0"
  eventLogger: journald
  freeLocks: 2047
  hostname: pine
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.10.5-1
  linkmode: dynamic
  logDriver: journald
  memFree: 295067648
  memTotal: 4045975552
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: Unknown
    package: Unknown
    path: /usr/libexec/podman/netavark
    version: netavark 1.11.0
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/bin/crun
    version: |-
      crun version 1.16.1
      commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: Unknown
    version: |
      pasta 2024_08_21.1d6142f
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.3.1
      commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 1004867584
  swapTotal: 2022699008
  uptime: 42h 17m 33.00s (Approximately 1.75 days)
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:5000
    PullFromMirror: ""
store:
  configFile: /home/users/jan/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/users/jan/.local/share/containers/storage
  graphRootAllocated: 60686266368
  graphRootUsed: 44395888640
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /home/users/jan/tmp
  imageStore:
    number: 7
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/users/jan/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1724271783
  BuiltTime: Wed Aug 21 22:23:03 2024
  GitCommit: ""
  GoVersion: go1.23.0
  Os: linux
  OsArch: linux/arm64
  Version: 5.2.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Additional information


Luap99 commented 1 month ago

Did you run out of inotify instances for your user, by chance? Can you check whether you see a "...use a timer" message written by pasta in your journal?
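
One way to check (hedged sketch; fs.inotify.max_user_instances is the standard sysctl knob, and the /proc one-liner is a common approximation of current usage):

    sysctl fs.inotify.max_user_instances
    find /proc/*/fd -lname anon_inode:inotify 2>/dev/null | wc -l
    # pasta's fallback message, if any, should show up in the user journal:
    journalctl --user -b | grep -i pasta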

Fundamentally the netns quit code is racy, not just for pasta but for slirp4netns and rootlessport as well... pasta exits when we unmount the netns path (detected either via inotify or a 1s poll interval), so when ports are bound and we restart, we may launch the new pasta before the old pasta has exited. The fallback to the timer of course makes this much easier to reproduce...
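
Based on that explanation, a rough workaround sketch (not a guaranteed fix): give the old pasta time to notice the unmounted netns before starting again, instead of using podman restart:

    podman stop test
    sleep 2    # old pasta polls the netns path at a 1s interval when inotify is unavailable
    podman start test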

With slirp4netns and rootlessport it is better because we pass a pipe down into conmon, and once conmon closes the pipe they exit. The pipe is closed directly after the container dies, so a race is much less likely: the window between the pipe closing and the restart is much longer than the window between unmounting the netns and the restart. However, it is still racy.
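
A minimal sketch of that "exit when the other side closes the pipe" pattern, illustrated with a FIFO (this is not podman's or conmon's actual implementation):

    mkfifo /tmp/lifeline
    # child: blocks reading the pipe and exits once it sees EOF
    ( cat /tmp/lifeline >/dev/null; echo "pipe closed, tearing down" ) &
    exec 3>/tmp/lifeline    # holder keeps the write end open
    # ... container lifetime ...
    exec 3>&-               # closing the write end delivers EOF; the child exits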

Overall I think the right fix would be to track the pids of these processes and SIGKILL them during container teardown. But SIGKILL is not synchronous either, so we would still have to wait for the exit somehow. There is also the problem of potential pid reuse: when the network process exits, podman cannot notice this because we are not a daemon, so another process might get the same pid and we would end up killing the wrong process. We have such problems elsewhere, so I would say it is already an accepted risk.
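
For illustration, a hedged sketch of such a kill-and-wait teardown step ($pasta_pid is a hypothetical tracked pid; the pid-reuse caveat above still applies):

    kill -KILL "$pasta_pid" 2>/dev/null
    # SIGKILL is delivered asynchronously, so poll until the process is gone
    while kill -0 "$pasta_pid" 2>/dev/null; do
        sleep 0.1
    done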

github-actions[bot] commented 3 days ago

A friendly reminder that this issue had no activity for 30 days.