containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

dns issues: aardvark-dns namespace networks possibly misconfigured or failing to configure, no visible errors #20396

Closed by deliciouslytyped 11 months ago

deliciouslytyped commented 1 year ago

Issue Description

Sorry for the vague title; it can be made more specific if we can figure out something more concrete. I've spent a lot of time in the past few weeks fiddling with various podman problems and have gotten a bit tired and burned out on it. It would be nice to have some guidance on how to go forward from here. Sorry for the low-quality issue filing; if I have more energy later I will improve it.

I last tried this with podman 4.5.0.

I've thrown nsenter, pwru, some iptables tracing tools, and podman's debug output at this and figured some things out.

I have a fairly vanilla rootless podman-compose configuration along with a network configured for it, and I'm having DNS issues with it: the containers cannot reach aardvark-dns; the port simply isn't there. DNS is enabled on the network.

Basically, I have confirmed that the reason DNS replies are not received is that the packets are either going to the wrong place, or aardvark isn't listening on an interface it should be listening on. aardvark itself works, because I was able to enter its namespace and talk to it.

I managed to figure out a workaround which consists of bringing up the containers, killing aardvark-dns, and then restarting any container to bring aardvark-dns back up.
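
Roughly, the workaround looks like this ("web" here is just a placeholder for any of the compose containers):

$ podman-compose up -d    # bring the stack up; DNS is broken at this point
$ pkill aardvark-dns      # kill the stale aardvark-dns process
$ podman restart web      # restarting any container brings aardvark-dns back up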

Using one of (I don't currently remember which, I will test it later I guess) podman unshare --rootless-netns, or nsenter to attach to the network namespace of the aardvark process, or something like that, shows that the interfaces used to wire up podman's networking are missing until I kill and restart aardvark.
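
For reference, the two ways of inspecting this that I mean are roughly the following (exact invocations from memory):

$ podman unshare --rootless-netns ip link        # interfaces in podman's rootless netns
$ nsenter -t $(pgrep aardvark-dns) -n ip link    # interfaces in the netns aardvark-dns is actually in (may need to be run inside "podman unshare")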

Is there any way to debug aardvark bringup specifically?

### Steps to reproduce the issue

TODO

### Podman in a container

No

### Privileged Or Rootless

Rootless

### Upstream Latest Release

No

podman info output

host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /nix/store/1y223352pm0laijhsn7lca66qbnrg78h-conmon-2.1.7/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 82.84
    systemPercent: 11.01
    userPercent: 6.15
  cpus: 3
  databaseBackend: boltdb
  distribution:
    codename: stoat
    distribution: nixos
    version: "23.05"
  eventLogger: journald
  hostname: nixos
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.1.52
  linkmode: dynamic
  logDriver: journald
  memFree: 114069504
  memTotal: 4021714944
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: Unknown
    path: /nix/store/54sfyqak7sn57z9gdhx4dydj7564lh6b-crun-1.8.4/bin/crun
    version: |-
      crun version 1.8.4
      commit: 1.8.4
      rundir: /run/user/0/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /nix/store/czkc4zy7f213q8a0la8r59yranl7yw8q-slirp4netns-1.2.0/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 5541924864
  swapTotal: 10737414144
  uptime: 118h 52m 35.00s (Approximately 4.92 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  docker.io/library:
    Blocked: false
    Insecure: true
    Location: localhost:5001
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: docker.io/library
    PullFromMirror: ""
  localhost:5001:
    Blocked: false
    Insecure: true
    Location: localhost:5001
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:5001
    PullFromMirror: ""
  search:
  - localhost:5001
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 0
    stopped: 4
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 80307429376
  graphRootUsed: 42895224832
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /run/user/0
  imageStore:
    number: 4
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.5.0
  Built: 315532800
  BuiltTime: Tue Jan  1 00:00:00 1980
  GitCommit: ""
  GoVersion: go1.20.7
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.0
Luap99 commented 1 year ago

Maybe let's take a step back first and try to reproduce with something simple. Ideally reboot first to prevent any temporary state from conflicting.

After the start does the following work?

$ podman network create testnet
testnet
$ podman run --network testnet --name c1 -d  alpine sleep inf
dc517d9f188e51479e127776549d120fc117443e0920b1b74cd86da3e310b0de
$ podman run --network testnet --rm alpine nslookup c1
Server:     10.89.0.1
Address:    10.89.0.1:53

Non-authoritative answer:
Name:   c1.dns.podman
Address: 10.89.0.2

Non-authoritative answer:

$ podman run --network testnet --rm alpine nslookup google.com
Server:     10.89.0.1
Address:    10.89.0.1:53

Non-authoritative answer:
Name:   google.com
Address: 142.250.184.206

Non-authoritative answer:
Name:   google.com
Address: 2a00:1450:4001:812::200e

If you suspect namespace issues then you should check this:

$ podman unshare --rootless-netns readlink /proc/self/ns/net
net:[4026536046]
$ readlink /proc/$(pgrep aardvark-dns)/ns/net
net:[4026536046]

The values must match; otherwise it would be totally expected that DNS fails.

In general, adding --log-level debug to all your podman commands should give many more clues about what is going on during network setup. And aardvark-dns logs to the syslog, so if you run systemd you should be able to see the messages with journalctl --user SYSLOG_IDENTIFIER=aardvark-dns
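
For example, something along these lines (adjust to whatever commands you actually run):

$ podman --log-level debug run --network testnet --rm alpine true 2>&1 | grep -iE 'netns|aardvark'
$ journalctl --user SYSLOG_IDENTIFIER=aardvark-dns --since "10 min ago"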

deliciouslytyped commented 1 year ago

Thanks for engaging; sometimes it gives enough of a boost when someone asks. :)

I don't want to reset / reboot my system so we can hopefully actually figure out what's going on here, but it may be irrelevant.

1. First: I needed to shut down all my containers to make sure aardvark goes down or gets deconfigured or whatever; otherwise it would keep working due to my previous workaround execution.

2. terminal 1:

$ podman network create testnet
testnet

$ podman run --rm --network testnet --name c1 quay.io/libpod/alpine sleep 1h
...

3. terminal 2:

$ podman run -it --rm --network testnet --rm quay.io/libpod/alpine /bin/sh
/ # nslookup c1
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'c1': Try again

/ # cat /etc/resolv.conf
search dns.podman
nameserver 10.89.2.1
options edns0
4. terminal 3 (as you can see at the bottom, aardvark does not have interfaces):

$ lsns -T -t net
NS             TYPE  NPROCS     PID USER     NETNSID NSFS COMMAND
4026531837     user       0         root
├─4026531840   net       81     904 nixos unassigned      /nix/store/q182qd78kvlijc25fas2v1bwn3gh670k-shell2http-1.16.0/bin/shell2http -include-stderr -show-errors -port 55555 -form /v1/msg
└─4026532470   user       0         nixos
  ├─4026532471 net        1 1390375 nixos unassigned      sleep 1h
  ├─4026532537 net        2 3926339 nixos unassigned      /nix/store/kxkdrxvc3da2dpsgikn8s2ml97h88m46-bash-interactive-5.2-p15/bin/bash
  ├─4026532598 net        3  171765 nixos unassigned      /nix/store/kxkdrxvc3da2dpsgikn8s2ml97h88m46-bash-interactive-5.2-p15/bin/bash
  ├─4026532661 net        1 4065741 nixos unassigned      /nix/store/z56sdzjn2s70204pc66bspdp3xi21pfr-podman-4.5.0/libexec/podman/aardvark-dns --config /tmp/containers-user-1000/containers/
  └─4026532797 net        1 1394610 nixos unassigned      /bin/sh

$ podman unshare --rootless-netns /bin/sh
sh-5.2# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: tap0: <BROADCAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 6e:3f:b1:6a:cb:2a brd ff:ff:ff:ff:ff:ff
3: podman3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 4e:7f:07:a6:d7:c8 brd ff:ff:ff:ff:ff:ff
4: veth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman3 state UP mode DEFAULT group default qlen 1000
    link/ether 7e:b4:61:d6:f5:67 brd ff:ff:ff:ff:ff:ff link-netnsid 0
12: veth1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman3 state UP mode DEFAULT group default qlen 1000
    link/ether 4e:7f:07:a6:d7:c8 brd ff:ff:ff:ff:ff:ff link-netnsid 1

$ podman unshare

[root@nixos:~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 96:00:02:86:80:f6 brd ff:ff:ff:ff:ff:ff

nsenter -t $(pgrep aardvark-dns) -n ip link

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00


5. terminal 2 continued (executing the workaround):

/ # exit

$ pkill -9 aardvark-dns

$ podman run -it --rm --network testnet --rm quay.io/libpod/alpine nslookup c1
nslookup: can't resolve '(null)': Name does not resolve

Name:      c1
Address 1: 10.89.2.11 c21753cadc57


6. terminal 3 continued (as you can see, aardvark now has interfaces):

nsenter -t $(pgrep aardvark-dns) -n ip link

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: tap0: <BROADCAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 6e:3f:b1:6a:cb:2a brd ff:ff:ff:ff:ff:ff
3: podman3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 7e:b4:61:d6:f5:67 brd ff:ff:ff:ff:ff:ff
4: veth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman3 state UP mode DEFAULT group default qlen 1000
    link/ether 7e:b4:61:d6:f5:67 brd ff:ff:ff:ff:ff:ff link-netnsid 0

deliciouslytyped commented 1 year ago

Ok, I checked the systemd --user output that you suggested, and it tells us roughly what we expected based on the above, but it's something! (It spammed a decent amount.)

Oct 18 14:31:49 nixos aardvark-dns[4065741]: Unable to start server unable to start CoreDns server: Cannot assign requested address (os error 99)
Luap99 commented 1 year ago

Ok, your environment looks wrong then; somehow it mismatches namespaces, which shouldn't be possible.

At the very least I see /tmp/containers-user-1000/containers/... which is never right if you run on a systemd distro. This really should be /run/user/$UID/.... You need a proper systemd login session if you run rootless. Then run podman system reset to clean everything; after that, podman info should report /run/user/$UID under runroot.

This isn't to say that this is the bug here or that it helps, but in general I have seen so many systems with an incorrect runroot where things simply break.
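
For example, roughly (the --format field name is from memory):

$ podman info --format '{{.Store.RunRoot}}'   # should point below /run/user/<uid>
$ podman system reset                         # warning: this removes all containers, images and networks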

Also add --log-level debug to all the podman commands; there must be a "Cleaning up rootless network namespace" message somewhere if it removed the rootless-netns, but that should never trigger if at least one container is still running.

deliciouslytyped commented 1 year ago

Yes, I'm running a relatively minimal setup. I'm not familiar with all the details, but I may have "had" to run loginctl enable-linger myuser manually (I don't remember if that was in one of the nested contexts I was working on). For whatever reason, $XDG_RUNTIME_DIR isn't set, which I assume is what points to /run/user/... .
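
What I can check on my side is roughly:

$ loginctl show-user "$USER" -p Linger          # is lingering actually enabled?
$ echo "$XDG_RUNTIME_DIR"                       # empty here, which presumably explains the odd runroot
$ systemctl --user show-environment | grep XDG  # what the user manager thinks is set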

Sometimes there do seem to be problems with taking down network namespaces. Given what you said, I expect it's related.

> Also add --log-level debug to all the podman commands; there must be a "Cleaning up rootless network namespace" message somewhere if it removed the rootless-netns, but that should never trigger if at least one container is still running.

So where are you suggesting I should run this? On the workaround restart command?

If nothing else comes out of this, and some additional diagnostics end up being added to podman that can help with such situations, I think that would be a good outcome.

Luap99 commented 1 year ago

> Also add --log-level debug to all the podman commands; there must be a "Cleaning up rootless network namespace" message somewhere if it removed the rootless-netns, but that should never trigger if at least one container is still running.

> So where are you suggesting I should run this? On the workaround restart command?

On every single podman command; you need to piece the full picture together. The most important one would be the first podman command which ends up creating aardvark-dns in the wrong namespace. But you really need to see where things start to go wrong.

Having an environment where XDG_RUNTIME_DIR is sometimes set and sometimes not can certainly confuse things.

deliciouslytyped commented 1 year ago

The reason I didn't post any debug logs is that I didn't find anything useful in them earlier. Nevertheless it's likely I missed something.

In the meantime, here is the network section of the bringup of the first container after stopping all containers. I can give you more of the preceding logs if you need them.

My problem is that, as far as my very limited understanding goes, aardvark basically runs in its own container, right? How can we see the aardvark container's bringup?

time="2023-10-18T16:07:08Z" level=debug msg="Made network namespace at /run/user/1000/netns/netns-6cc27332-91e6-e96d-3109-514fee50bda0 for container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601"
time="2023-10-18T16:07:08Z" level=debug msg="Cached value indicated that volatile is being used"
time="2023-10-18T16:07:08Z" level=debug msg="overlay: mount_data=lowerdir=/home/nixos/.local/share/containers/storage/overlay/l/E4NHQJKSZABVS6PNDXVQC2RUC5,upperdir=/home/nixos/.local/share/containers/storage/overlay/7aaa2baf0081f35a80c67ae3447d09eece1d46f235f7c554b1c973fef6afaea8/diff,workdir=/home/nixos/.local/share/containers/storage/overlay/7aaa2baf0081f35a80c67ae3447d09eece1d46f235f7c554b1c973fef6afaea8/work,,userxattr,volatile"
time="2023-10-18T16:07:08Z" level=debug msg="creating rootless network namespace with name \"rootless-netns-241895256a9b790b13e2\""
time="2023-10-18T16:07:08Z" level=debug msg="Mounted container \"03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601\" at \"/home/nixos/.local/share/containers/storage/overlay/7aaa2baf0081f35a80c67ae3447d09eece1d46f235f7c554b1c973fef6afaea8/merged\""
time="2023-10-18T16:07:08Z" level=debug msg="Created root filesystem for container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 at /home/nixos/.local/share/containers/storage/overlay/7aaa2baf0081f35a80c67ae3447d09eece1d46f235f7c554b1c973fef6afaea8/merged"
time="2023-10-18T16:07:08Z" level=debug msg="slirp4netns command: /nix/store/czkc4zy7f213q8a0la8r59yranl7yw8q-slirp4netns-1.2.0/bin/slirp4netns --disable-host-loopback --mtu=65520 --enable-sandbox --enable-seccomp --enable-ipv6 -c -r 3 --netns-type=path /run/user/1000/netns/rootless-netns-241895256a9b790b13e2 tap0"
time="2023-10-18T16:07:08Z" level=debug msg="The path of /etc/resolv.conf in the mount ns is \"/etc/resolv.conf\""
[DEBUG netavark::network::validation] "Validating network namespace..."
[DEBUG netavark::commands::setup] "Setting up..."
[INFO  netavark::firewall] Using iptables firewall driver
[DEBUG netavark::network::bridge] Setup network testnet
[DEBUG netavark::network::bridge] Container interface name: eth0 with IP addresses [10.89.2.27/24]
[DEBUG netavark::network::bridge] Bridge name: podman3 with IP addresses [10.89.2.1/24]
[DEBUG netavark::network::core_utils] Setting sysctl value for net.ipv4.ip_forward to 1
[DEBUG netavark::network::core_utils] Setting sysctl value for /proc/sys/net/ipv6/conf/eth0/autoconf to 0
[INFO  netavark::network::netlink] Adding route (dest: 0.0.0.0/0 ,gw: 10.89.2.1, metric 100)
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-C5964C6794A11 created on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK_FORWARD created on table filter
[DEBUG netavark::firewall::varktables::helpers] rule -d 10.89.2.0/24 -j ACCEPT created on table nat and chain NETAVARK-C5964C6794A11
[DEBUG netavark::firewall::varktables::helpers] rule ! -d 224.0.0.0/4 -j MASQUERADE created on table nat and chain NETAVARK-C5964C6794A11
[DEBUG netavark::firewall::varktables::helpers] rule -s 10.89.2.0/24 -j NETAVARK-C5964C6794A11 created on table nat and chain POSTROUTING
[DEBUG netavark::firewall::varktables::helpers] rule -d 10.89.2.0/24 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT created on table filter and chain NETAVARK_FORWARD
[DEBUG netavark::firewall::varktables::helpers] rule -s 10.89.2.0/24 -j ACCEPT created on table filter and chain NETAVARK_FORWARD
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-SETMARK created on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-MASQ created on table nat
[DEBUG netavark::firewall::varktables::helpers] chain NETAVARK-HOSTPORT-DNAT created on table nat
[DEBUG netavark::firewall::varktables::helpers] rule -j MARK  --set-xmark 0x2000/0x2000 created on table nat and chain NETAVARK-HOSTPORT-SETMARK
[DEBUG netavark::firewall::varktables::helpers] rule -j MASQUERADE -m comment --comment 'netavark portfw masq mark' -m mark --mark 0x2000/0x2000 created on table nat and chain NETAVARK-HOSTPORT-MASQ
[DEBUG netavark::firewall::varktables::helpers] rule -j NETAVARK-HOSTPORT-DNAT -m addrtype --dst-type LOCAL created on table nat and chain PREROUTING
[DEBUG netavark::firewall::varktables::helpers] rule -j NETAVARK-HOSTPORT-DNAT -m addrtype --dst-type LOCAL created on table nat and chain OUTPUT
[DEBUG netavark::commands::setup] {
        "testnet": StatusBlock {
            dns_search_domains: Some(
                [
                    "dns.podman",
                ],
            ),
            dns_server_ips: Some(
                [
                    10.89.2.1,
                ],
            ),
            interfaces: Some(
                {
                    "eth0": NetInterface {
                        mac_address: "96:65:ee:64:b2:57",
                        subnets: Some(
                            [
                                NetAddress {
                                    gateway: Some(
                                        10.89.2.1,
                                    ),
                                    ipnet: 10.89.2.27/24,
                                },
                            ],
                        ),
                    },
                },
            ),
        },
    }
[DEBUG netavark::commands::setup] "Setup complete"
time="2023-10-18T16:07:08Z" level=debug msg="Adding nameserver(s) from network status of '[\"10.89.2.1\"]'"
time="2023-10-18T16:07:08Z" level=debug msg="Adding search domain(s) from network status of '[\"dns.podman\"]'"
time="2023-10-18T16:07:08Z" level=debug msg="/etc/system-fips does not exist on host, not mounting FIPS mode subscription"
time="2023-10-18T16:07:08Z" level=debug msg="Setting Cgroups for container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 to user.slice:libpod:03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601"
time="2023-10-18T16:07:08Z" level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d"
time="2023-10-18T16:07:08Z" level=debug msg="Workdir \"/\" resolved to host path \"/home/nixos/.local/share/containers/storage/overlay/7aaa2baf0081f35a80c67ae3447d09eece1d46f235f7c554b1c973fef6afaea8/merged\""
time="2023-10-18T16:07:08Z" level=debug msg="Created OCI spec for container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 at /home/nixos/.local/share/containers/storage/overlay-containers/03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601/userdata/config.json"
time="2023-10-18T16:07:08Z" level=debug msg="/nix/store/1y223352pm0laijhsn7lca66qbnrg78h-conmon-2.1.7/bin/conmon messages will be logged to syslog"
time="2023-10-18T16:07:08Z" level=debug msg="running conmon: /nix/store/1y223352pm0laijhsn7lca66qbnrg78h-conmon-2.1.7/bin/conmon" args="[--api-version 1 -c 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 -u 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 -r /nix/store/54sfyqak7sn57z9gdhx4dydj7564lh6b-crun-1.8.4/bin/crun -b /home/nixos/.local/share/containers/storage/overlay-containers/03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601/userdata -p /tmp/containers-user-1000/containers/overlay-containers/03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601/userdata/pidfile -n c1 --exit-dir /tmp/podman-run-1000/libpod/tmp/exits --full-attach -s -l journald --log-level debug --syslog --conmon-pidfile /tmp/containers-user-1000/containers/overlay-containers/03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601/userdata/conmon.pid --exit-command /nix/store/z56sdzjn2s70204pc66bspdp3xi21pfr-podman-4.5.0/bin/.podman-wrapped --exit-command-arg --root --exit-command-arg /home/nixos/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /tmp/containers-user-1000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /tmp/podman-run-1000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /home/nixos/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg boltdb --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601]"
time="2023-10-18T16:07:08Z" level=info msg="Running conmon under slice user.slice and unitName libpod-conmon-03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601.scope"
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied

time="2023-10-18T16:07:08Z" level=debug msg="Received: 1735264"
time="2023-10-18T16:07:08Z" level=info msg="Got Conmon PID as 1735262"
time="2023-10-18T16:07:08Z" level=debug msg="Created container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 in OCI runtime"
time="2023-10-18T16:07:08Z" level=debug msg="Attaching to container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601"
time="2023-10-18T16:07:08Z" level=debug msg="Starting container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601 with command [sleep 1h]"
time="2023-10-18T16:07:08Z" level=debug msg="Started container 03769c3dac73954eac5795a72b560605edac7046fc2a6d4838e3b9b4263fe601"
time="2023-10-18T16:07:08Z" level=debug msg="Notify sent successfully"
deliciouslytyped commented 1 year ago

> Also add --log-level debug to all the podman commands; there must be a "Cleaning up rootless network namespace" message somewhere if it removed the rootless-netns, but that should never trigger if at least one container is still running.

Sorry, I missed some of the details on this. If anything is doing any removal, then hopefully it's the command after the kill, because not much seems to show up after the first one. I'll take a look.

The only other process that manages this is the lingering "podman pause" process (which is what does the background management, I think; that's probably what we would need to look at), but I don't know how to monitor that one.

Luap99 commented 1 year ago

The log says a) it created the rootless-netns and then we see the netavark setup, but then b) I never see the log line where it says it is starting aardvark-dns. That means that aardvark-dns was still running and kept running in the old, invalid namespace. This is likely the result of incorrect aardvark-dns config files. Once your containers are stopped, aardvark-dns should exit on its own. This will however not happen if there are still entries in the config file.

So you should make sure, when all the containers are stopped, that aardvark-dns exits; if it does not, remove everything in /run/user/$UID/containers/networks/aardvark-dns/ (or I guess in your case /tmp/containers-user-1000/containers/networks/aardvark-dns/) and then kill it. That should make it work again.
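
Roughly, with the paths from your output:

$ podman ps -a              # confirm nothing is running first
$ pgrep -a aardvark-dns     # if this still shows a process once everything is down, it leaked
$ rm -rf /tmp/containers-user-1000/containers/networks/aardvark-dns/*
$ pkill aardvark-dns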

deliciouslytyped commented 1 year ago

Yes that is correct, an aardvark process remains running. I thought this was deliberate.

$ podman stop c1
WARN[0010] StopSignal SIGTERM failed to stop container c1 in 10 seconds, resorting to SIGKILL
c1

$ podman ps -a
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES

$ ps aux | grep aardvark
...
nixos    1732299  0.0  0.0 551560   932 pts/53   S+   16:06   0:00 journalctl --user -f SYSLOG_IDENTIFIER=aardvark-dns
...
nixos    1804003  0.0  0.0 614260  1840 ?        Ssl  16:27   0:00 /nix/store/z56sdzjn2s70204pc66bspdp3xi21pfr-podman-4.5.0/libexec/podman/aardvark-dns --config /tmp/containers-user-1000/containers/networks/aardvark-dns -p 53 run
nixos    1822545  0.0  0.0   6476  2380 pts/51   S+   16:33   0:00 grep aardvark
$ ls /tmp/containers-user-1000/containers/networks/aardvark-dns
aardvark.pid  infra

$ podman network inspect infra
[
     {
          "name": "infra",
          "id": "5e631c60894333a590d08a3b8a340bcd20b2b675e6e0809dc3b99d8ce35e82bd",
          "driver": "bridge",
          "network_interface": "podman2",
          "created": "2023-09-30T07:26:21.117031605Z",
          "subnets": [
               {
                    "subnet": "10.89.1.0/24",
                    "gateway": "10.89.1.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": true,
          "labels": {
               "com.docker.compose.project": "infra",
               "io.podman.compose.project": "infra"
          },
          "ipam_options": {
               "driver": "host-local"
          }
     }
]

$ podman network inspect testnet
[
     {
          "name": "testnet",
          "id": "3f22fc652a2f266448c85b05d135794847214851d95934ad60b64caa548c66c8",
          "driver": "bridge",
          "network_interface": "podman3",
          "created": "2023-10-18T14:17:38.333271696Z",
          "subnets": [
               {
                    "subnet": "10.89.2.0/24",
                    "gateway": "10.89.2.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": true,
          "ipam_options": {
               "driver": "host-local"
          }
     }
]

infra is the network I was originally having issues with.

deliciouslytyped commented 1 year ago

I moved the aardvark config directory to keep its contents (pointless I suppose, since I ended up killing the process anyway), and then killed any aardvark processes (not unexpectedly, not killing the process resulted in a second aardvark running, probably since the old PID file wasn't there anymore).

Now after stopping all containers the lingering aardvark process seems to go away.

I don't see anything suspicious in what I pasted regarding the broken aardvark configuration. Can you see what the issue was? It was odd that it only had the config for infra in it, and not at least testnet, but then I would have expected it not to cause issues on infra, which was my original problem.

Regarding the DNS configurations, they look correct to me, and, as mentioned, when entering the aardvark namespace I was able to properly query DNS. So I expect you mean something regarding the namespace configuration?
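
What I mean by "entering the aardvark namespace and querying it" is roughly the following (this assumes nslookup, or alternatively dig, is available on the host):

$ podman unshare
[root@nixos:~]# nsenter -t $(pgrep aardvark-dns) -n nslookup c1 10.89.2.1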

deliciouslytyped commented 1 year ago

The contents of the old aardvark PID file, just in case that was somehow the problem, since I'm not sure whether I pasted enough process information above to tell if this is wrong:

$ cat  /tmp/containers-user-1000/containers/networks/aardvark-dns/aardvark.pid
1804003
deliciouslytyped commented 1 year ago

The contents of the PID file look wrong compared to both the previously pasted systemd log and the lsns -T output:

├─4026532661 net        1 4065741 nixos unassigned      /nix/store/z56sdzjn2s70204pc66bspdp3xi21pfr-podman-4.5.0/libexec/podman/aardvark-dns --config /tmp/containers-user-1000/containers/
...
Oct 18 14:31:49 nixos aardvark-dns[4065741]: Unable to start server unable to start CoreDns server: Cannot assign requested address (os error 99)

The PID file contains 1804003 and these logs both contain 4065741.

deliciouslytyped commented 1 year ago

Do you think this is why an invalid namespace was being used? I understand that a lot of namespace referencing happens through the ns subdirectory of processes in /proc, so if the aardvark PID file is used for namespace management and the PID is wrong, it would make sense that there are problems there.
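
i.e. a check along these lines (hypothetical, using the path from above):

$ pid=$(cat /tmp/containers-user-1000/containers/networks/aardvark-dns/aardvark.pid)
$ readlink /proc/$pid/ns/net                                  # netns of whatever PID the file points at
$ readlink /proc/$(pgrep aardvark-dns)/ns/net                 # netns of the actually running aardvark-dns
$ podman unshare --rootless-netns readlink /proc/self/ns/net  # podman's current rootless netns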

deliciouslytyped commented 1 year ago

I knew I would shoot myself in the foot with this. 1804003 was in fact a valid PID at some point in my scrollback.

deliciouslytyped commented 1 year ago

I don't understand why, but I think I'm able to reproduce the issue by copying the presumably broken configuration directory's contents back.

deliciouslytyped commented 1 year ago

I think I figured out a small bit. The network file in the aardvark config gets removed, presumably when the last user of said network is stopped. If a configuration file remains, then the process does not stop. This would for example be the "infra" file that was still hanging around in there even though nothing was using the infra network. If I remove the lingering config file, aardvark stops when (presumably) the last container goes down.
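
Concretely, on my system that looks roughly like this (removing the file by hand is of course just the manual workaround):

$ ls /tmp/containers-user-1000/containers/networks/aardvark-dns/
aardvark.pid  infra
$ rm /tmp/containers-user-1000/containers/networks/aardvark-dns/infra   # only after confirming nothing uses the infra network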

I have some questions but this seems to be the proximate cause, so I figure I should get this out there.

deliciouslytyped commented 1 year ago

On a didactic note: this however means that aardvark is supposed to linger as long as any containers on any network are still running, so I'm not sure why aardvark not shutting down is a problem, unless you were specifically referring to my case where it stayed up even though no containers were running, in which case, OK.

My question then is: if it's fine for aardvark to linger after all containers in a given network are gone (but there are still containers running on another network), why exactly is it a problem if it lingers after all containers everywhere are gone? (My guess is you mean that there ends up being a namespace with no processes in it, and this somehow causes a problem? But how can I show that?)

> That means that aardvark-dns was still running and kept running in the old, invalid namespace.

Why does the above make the namespace invalid?

How can I show that the problem is what you suspect - that aardvark / something is trying to use invalid namespaces?

deliciouslytyped commented 1 year ago

Sidenote: is the CoreDNS error a/the real problem? If it is, I managed to find it reproduced at https://access.redhat.com/solutions/7008392 , though I don't have an account, so it's secondhand information that this has the same workaround as the one I found.

Luap99 commented 1 year ago

Aardvark-dns leaking means that at some point prior to seeing these problems, the network cleanup failed for some container.

The sequence is: failed network cleanup (the aardvark config file stays around) -> aardvark-dns keeps running in its original namespace -> podman's cleanup logic sees that all containers are exited and removes the rootless-netns -> the next container start has no rootless-netns, so we create a new one -> netavark setup finds the aardvark PID and sees it is running, so it sends SIGHUP to make it read the new config values. Now, because aardvark-dns still runs in the old, now unused netns, it cannot communicate with the new containers created in a different netns. Because the interfaces are created in the new netns, aardvark-dns will log the bind error, as it cannot bind to a non-existent bridge.

This all works as I would expect; the only problem was the failed netns cleanup, which did not correctly remove the aardvark-dns config entry for the container. Unless you can reproduce this exact failure there is not much that can be done. The only thing we could do is patch netavark to compare namespaces and, on a mismatch, just kill and restart aardvark-dns.
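
A rough user-side version of such a check (not the actual netavark logic, just a manual approximation using the paths from this issue) could be:

# sketch: kill aardvark-dns if it lives in a different netns than podman's rootless netns
conf=/tmp/containers-user-1000/containers/networks/aardvark-dns
aard_ns=$(readlink /proc/"$(cat "$conf/aardvark.pid")"/ns/net 2>/dev/null)
podman_ns=$(podman unshare --rootless-netns readlink /proc/self/ns/net)
if [ -n "$aard_ns" ] && [ "$aard_ns" != "$podman_ns" ]; then
    pkill aardvark-dns   # the next container start respawns it in the correct netns
fi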

github-actions[bot] commented 12 months ago

A friendly reminder that this issue had no activity for 30 days.

deliciouslytyped commented 12 months ago

ENOTIME for two months due to exams.