containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman networking errors when switching between network adapters #24341

Open p5 opened 19 hours ago

p5 commented 19 hours ago

Issue Description

Very strange one. When I switch between the WiFi and Ethernet adapters on my system, Podman containers occasionally cannot reach any network. This happens with both rootful and rootless containers, and it also causes errors during builds.

In the below video, you can see:

  1. networking fails on Ethernet
  2. networking works on WiFi
  3. networking works on WiFi + Ethernet
  4. networking fails on Ethernet
  5. networking works on Ethernet (after toggle)

https://github.com/user-attachments/assets/c6ff43ad-8efe-4ae9-b8c6-cfdfdf89425b

Now, this could be a Linux bug. But I thought I'd start here and see what you all think of it. I'm hoping this is easily reproducible since I could not find any relevant logs.

I would understand (and possibly expect) an error if the same container failed to seamlessly switch between NICs, but this is happening when I create completely different containers.

Steps to reproduce the issue

  1. run a ping inside a container while connected to a single network
  2. enable another network adapter and run the ping command again
  3. disable the original network adapter and run the ping command once more
  4. notice the ping command failing, though the host has network access
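The steps above could be scripted roughly as follows. This is only a sketch under several assumptions: that NetworkManager's nmcli manages both adapters, that the device names match the ones in the ip route output further down in this thread (wlp9s0 and eno1), and that github.com is an acceptable ping target. The FIRST, SECOND and DRY_RUN variables and the run, probe and repro helpers are hypothetical names for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical defaults: adapter that is up at the start, and the one added later.
FIRST="${FIRST:-wlp9s0}"
SECOND="${SECOND:-eno1}"
# DRY_RUN=1 prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start a fresh rootless container and send a single ping.
probe() {
    run podman run --rm busybox ping -c 1 github.com
}

repro() {
    probe                                   # step 1: single adapter
    run nmcli device connect "$SECOND"      # step 2: enable the second adapter
    probe
    run nmcli device disconnect "$FIRST"    # step 3: disable the original adapter
    probe                                   # step 4: this is the ping that sometimes fails
}
```

Calling `DRY_RUN=0 repro` would actually toggle the adapters, so it should not be run over a remote session that depends on either of them.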

Describe the results you received

See above video

Describe the results you expected

Podman to figure out the right network to use when spinning up a new container on a different network interface.

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-3.fc41.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 98.59
    systemPercent: 0.79
    userPercent: 0.61
  cpus: 24
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: silverblue
    version: "41"
  eventLogger: journald
  freeLocks: 2047
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.11.3-300.fc41.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 19247226880
  memTotal: 33362939904
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-2.fc41.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.fc41.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.fc41.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240906.g6b38f07-1.fc41.x86_64
    version: |
      pasta 0^20240906.g6b38f07-1.fc41.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-3.fc41.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 0h 31m 39.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  ghcr.io/rsturla:
    Blocked: false
    Insecure: false
    Location: ghcr.io/rsturla
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: true
      Location: localhost:5000/rsturla
      PullFromMirror: ""
    Prefix: ghcr.io/rsturla
    PullFromMirror: ""
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:5000
    PullFromMirror: ""
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /var/home/admin/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/admin/.local/share/containers/storage
  graphRootAllocated: 1998678130688
  graphRootUsed: 765224169472
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 26
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /var/home/admin/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.5
  Built: 1729209600
  BuiltTime: Fri Oct 18 01:00:00 2024
  GitCommit: ""
  GoVersion: go1.23.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.5

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

I am able to reproduce this roughly 1 in 3 times when I switch from two enabled NICs to one. Waiting a couple of minutes after the switch makes no difference, so it does not look like something in the background just needs time to settle.

Luap99 commented 9 hours ago

You say root and rootless, but your video shows rootless only. Podman is not failing, pasta is, and we do not use pasta as root at all, so I do not see how root would fail. In particular, root doesn't care about the external interfaces: its bridge + NAT (MASQUERADE) setup should always run fine.

The pasta error means it failed to copy your host routes into the container, so please provide the full routes (the output of ip route) when this happens, because this is a bug. pasta does fail if no external interface is found, but it should generally not fail adding routes.

ps: please copy pasta the error message so it can be indexed for the search

p5 commented 9 hours ago

I did have issues with rootful networking when I tried this out (both being "Network Unreachable" errors), but you must be right - it's probably some other transient issue. I must have misunderstood them as the same issue.

When trying to reproduce today, rootful works fine whereas I am still able to replicate with rootless.

The error depends on what I am trying to do in the container.
Most of the time, it's Network is unreachable.

And digging around in the debug Podman logs, I can see the following, though I am unsure whether the conmon error is related:

DEBU[0000] pasta arguments: --config-net --dns-forward 169.254.0.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1000/netns/netns-b1acbc49-50c0-7337-c86b-e4a263692f5c 
INFO[0000] pasta logged warnings: "Couldn't get any nameserver address\n"
---
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied
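Since pasta warns that it could not get any nameserver address, it may help to record which nameservers the host resolver files list at the moment of failure. The helper below is a minimal sketch; the function name nameservers is hypothetical, and the only path taken from this report is /run/systemd/resolve/resolv.conf, which the full debug log shows Podman reading. Whether pasta reads the same file is not established here.

```shell
#!/bin/sh
# Print the address of every "nameserver" entry in a resolv.conf-style file.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Usage (paths are examples, adjust to the host):
#   nameservers /etc/resolv.conf
#   nameservers /run/systemd/resolve/resolv.conf
```

Capturing this output together with the routes for both a working and a failing run would make the two states easier to compare.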

IP routes:

❯ ip route                                 
default via 192.168.0.1 dev eno1 proto dhcp src 192.168.0.177 metric 100 
default via 192.168.0.1 dev wlp9s0 proto dhcp src 192.168.0.26 metric 600 
192.168.0.0/24 dev wlp9s0 proto kernel scope link src 192.168.0.26 metric 600 
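One detail in the routes above: both adapters have a default route, but only wlp9s0 has a directly connected (scope link) route for 192.168.0.0/24. Whether that is what trips pasta is not confirmed in this thread. The sketch below just makes the check repeatable by listing devices that carry a default route but no link-scope route; the function name routes_without_link is hypothetical.

```shell
#!/bin/sh
# Read `ip route` output on stdin and print every device that has a default
# route but no directly connected (scope link) route.
routes_without_link() {
    awk '
        # Devices with a directly connected route.
        / scope link / {
            for (i = 1; i < NF; i++) if ($i == "dev") link[$(i + 1)] = 1
        }
        # Devices that carry a default route.
        $1 == "default" {
            for (i = 1; i < NF; i++) if ($i == "dev") def[$(i + 1)] = 1
        }
        END {
            for (d in def) if (!(d in link)) print d
        }
    ' | sort
}

# Usage: ip route | routes_without_link
```

Fed the three route lines above, it prints eno1.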

Full log:

```
❯ podman run --log-level debug --rm busybox nslookup github.com
INFO[0000] podman filtering at log level debug
DEBU[0000] Called run.PersistentPreRunE(podman run --log-level debug --rm busybox nslookup github.com)
DEBU[0000] Using conmon: "/usr/bin/conmon"
INFO[0000] Using sqlite as database backend
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/home/admin/.local/share/containers/storage
DEBU[0000] Using run root /run/user/1000/containers
DEBU[0000] Using static dir /var/home/admin/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp
DEBU[0000] Using volume path /var/home/admin/.local/share/containers/storage/volumes
DEBU[0000] Using transient store: false
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that metacopy is not being used
DEBU[0000] Cached value indicated that native-diff is usable
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument
DEBU[0000] using runtime "crun-vm" from $PATH: "/usr/bin/crun-vm"
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 73
DEBU[0000] Pulling image busybox (policy: missing)
DEBU[0000] Looking up image "busybox" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux [] }
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf"
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf.d/000-shortnames.conf"
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf.d/001-zot-quadlet-mirror.conf"
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf.d/002-zot-quadlet.conf"
DEBU[0000] Trying "docker.io/library/busybox:latest" ...
DEBU[0000] parsed reference into "[overlay@/var/home/admin/.local/share/containers/storage+/run/user/1000/containers]@27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage ([overlay@/var/home/admin/.local/share/containers/storage+/run/user/1000/containers]@27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce)
DEBU[0000] exporting opaque data as blob "sha256:27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Looking up image "docker.io/library/busybox:latest" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux [] }
DEBU[0000] Trying "docker.io/library/busybox:latest" ...
DEBU[0000] parsed reference into "[overlay@/var/home/admin/.local/share/containers/storage+/run/user/1000/containers]@27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Found image "docker.io/library/busybox:latest" as "docker.io/library/busybox:latest" in local containers storage
DEBU[0000] Found image "docker.io/library/busybox:latest" as "docker.io/library/busybox:latest" in local containers storage ([overlay@/var/home/admin/.local/share/containers/storage+/run/user/1000/containers]@27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce)
DEBU[0000] exporting opaque data as blob "sha256:27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Looking up image "busybox" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux [] }
DEBU[0000] Trying "docker.io/library/busybox:latest" ...
DEBU[0000] parsed reference into "[overlay@/var/home/admin/.local/share/containers/storage+/run/user/1000/containers]@27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage ([overlay@/var/home/admin/.local/share/containers/storage+/run/user/1000/containers]@27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce)
DEBU[0000] exporting opaque data as blob "sha256:27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Inspecting image 27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce
DEBU[0000] exporting opaque data as blob "sha256:27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Inspecting image 27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce
DEBU[0000] Inspecting image 27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce
DEBU[0000] Inspecting image 27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce
DEBU[0000] using systemd mode: false
DEBU[0000] No hostname set; container's hostname will default to runtime default
DEBU[0000] Loading seccomp profile from "/usr/share/containers/seccomp.json"
DEBU[0000] Allocated lock 1 for container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e
DEBU[0000] exporting opaque data as blob "sha256:27a71e19c95622dce4d60d4a3760707495c9875f5c5322c5bd535214799593ce"
DEBU[0000] Cached value indicated that idmapped mounts for overlay are not supported
DEBU[0000] Check for idmapped mounts support
DEBU[0000] Created container "0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e"
DEBU[0000] Container "0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e" has work directory "/var/home/admin/.local/share/containers/storage/overlay-containers/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e/userdata"
DEBU[0000] Container "0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e" has run directory "/run/user/1000/containers/overlay-containers/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e/userdata"
DEBU[0000] Not attaching to stdin
INFO[0000] Received shutdown.Stop(), terminating! PID=45459
DEBU[0000] Enabling signal proxying
DEBU[0000] Cached value indicated that volatile is being used
DEBU[0000] overlay: mount_data=lowerdir=/var/home/admin/.local/share/containers/storage/overlay/l/W4M73UWI4IJ2ELG5YFEF75N44P,upperdir=/var/home/admin/.local/share/containers/storage/overlay/0a11f97a789d68d0057c7f4ecf0d0159972971d5b442bc66f73910cd0d9c93c4/diff,workdir=/var/home/admin/.local/share/containers/storage/overlay/0a11f97a789d68d0057c7f4ecf0d0159972971d5b442bc66f73910cd0d9c93c4/work,userxattr,volatile,context="system_u:object_r:container_file_t:s0:c215,c276"
DEBU[0000] Made network namespace at /run/user/1000/netns/netns-0cc8e005-ae3f-9b9e-7761-9f180db3202b for container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e
DEBU[0000] pasta arguments: --config-net --dns-forward 169.254.0.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1000/netns/netns-0cc8e005-ae3f-9b9e-7761-9f180db3202b
DEBU[0000] Mounted container "0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e" at "/var/home/admin/.local/share/containers/storage/overlay/0a11f97a789d68d0057c7f4ecf0d0159972971d5b442bc66f73910cd0d9c93c4/merged"
DEBU[0000] Created root filesystem for container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e at /var/home/admin/.local/share/containers/storage/overlay/0a11f97a789d68d0057c7f4ecf0d0159972971d5b442bc66f73910cd0d9c93c4/merged
INFO[0000] pasta logged warnings: "Couldn't get any nameserver address\n"
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription
DEBU[0000] Setting Cgroups for container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e to user.slice:libpod:0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0000] Workdir "/" resolved to host path "/var/home/admin/.local/share/containers/storage/overlay/0a11f97a789d68d0057c7f4ecf0d0159972971d5b442bc66f73910cd0d9c93c4/merged"
DEBU[0000] Created OCI spec for container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e at /var/home/admin/.local/share/containers/storage/overlay-containers/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e/userdata/config.json
DEBU[0000] /usr/bin/conmon messages will be logged to syslog
DEBU[0000] running conmon: /usr/bin/conmon args="[--api-version 1 -c 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e -u 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e -r /usr/bin/crun -b /var/home/admin/.local/share/containers/storage/overlay-containers/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e/userdata -p /run/user/1000/containers/overlay-containers/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e/userdata/pidfile -n interesting_faraday --exit-dir /run/user/1000/libpod/tmp/exits --persist-dir /run/user/1000/libpod/tmp/persist/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e --full-attach -s -l journald --log-level debug --syslog --conmon-pidfile /run/user/1000/containers/overlay-containers/0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/home/admin/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /var/home/admin/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg sqlite --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e]"
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied
DEBU[0000] Received: 45489
INFO[0000] Got Conmon PID as 45487
DEBU[0000] Created container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e in OCI runtime
DEBU[0000] found local resolver, using "/run/systemd/resolve/resolv.conf" to get the nameservers
DEBU[0000] Attaching to container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e
DEBU[0000] Starting container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e with command [nslookup github.com]
DEBU[0000] Started container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e
DEBU[0000] Notify sent successfully
nslookup: can't connect to remote host (169.254.0.1): Network is unreachable
DEBU[0000] Checking if container 0995b27232903d75461a64c9c23a2e4e762e223e25297f8a78f63a334ff5f17e should restart
DEBU[0000] Called run.PersistentPostRunE(podman run --log-level debug --rm busybox nslookup github.com)
DEBU[0000] Shutting down engines
```