containers / aardvark-dns

Authoritative DNS server for A/AAAA container records. Forwards other requests to the host's /etc/resolv.conf.
Apache License 2.0

DNS server fallback not working #482

Closed hardcore-sushi closed 1 week ago

hardcore-sushi commented 2 months ago

Steps to reproduce:

$ podman run --rm curlimages/curl podman.io # working
$ podman network create test
$ podman run --rm --network test curlimages/curl podman.io
# > curl: (6) Could not resolve host: podman.io

Same behavior no matter whether running podman as root or not.

However, it works fine when specifying DNS server on the podman command line:

$ podman run --rm --network test --dns 9.9.9.9 curlimages/curl podman.io # works

And works fine without DNS requests (likely not a network issue):

$ podman run --rm --network test curlimages/curl -H 'Host: podman.io' http://185.199.111.153 # works

Host resolv.conf:

nameserver 9.9.9.9

Container resolv.conf:

search dns.podman
nameserver 10.89.0.1

podman network inspect test:

[
     {
          "name": "test",
          "id": "9a99ae834530d75e349ecec277590cacc4eeea7798e9d4c26a0fb5bef39b6e79",
          "driver": "bridge",
          "network_interface": "podman1",
          "created": "2024-07-25T15:23:48.419536533+02:00",
          "subnets": [
               {
                    "subnet": "10.89.0.0/24",
                    "gateway": "10.89.0.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": true,
          "ipam_options": {
               "driver": "host-local"
          },
          "containers": {}
     }
]

podman system info:

host:
  arch: arm64
  buildahVersion: 1.35.4
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-r0
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: unknown'
  cpuUtilization:
    idlePercent: 99.01
    systemPercent: 0.35
    userPercent: 0.64
  cpus: 4
  databaseBackend: boltdb
  distribution:
    distribution: alpine
    version: 3.20.2
  eventLogger: file
  freeLocks: 2040
  hostname: pi
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.6.41-0-rpi
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 627814400
  memTotal: 952201216
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-r0
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-r0
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.15-r0
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /tmp/storage-run-1000/crun
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-2024.06.07-r0
    version: |
      pasta unknown version
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /tmp/storage-run-1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 0
  swapTotal: 0
  uptime: 1h 24m 43.00s (Approximately 0.04 days)
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 1
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 30261780480
  graphRootUsed: 4339707904
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 121
  runRoot: /tmp/containers-user-1000/containers
  transientStore: false
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.3
  Built: 1720373660
  BuiltTime: Sun Jul  7 19:34:20 2024
  GitCommit: ""
  GoVersion: go1.22.5
  Os: linux
  OsArch: linux/arm64
  Version: 5.0.3

I'm not sure if it's more of a netavark or podman issue. Let me know and I'll move the issue to the right place.

Luap99 commented 2 months ago

Does container name resolution work? I.e., is only the DNS forwarding broken?

hardcore-sushi commented 2 months ago

Yes, container name resolution is working properly.

Luap99 commented 2 months ago

I assume you run rootless? Does it work when running podman as root?

If it works there, it is most likely a problem with the way podman sets up resolv.conf in the rootless-netns. Check podman unshare cat /etc/resolv.conf, and if that says the file does not exist, please run podman --log-level unshare --rootless-netns stat /etc/resolv.conf

hardcore-sushi commented 2 months ago

> I assume you run rootless? Does it work when running podman as root?

As stated in the first post, same behavior no matter whether running podman as root or not.

podman unshare cat /etc/resolv.conf gives the same content as the host's resolv.conf.

Luap99 commented 2 months ago

Sorry, I meant podman unshare --rootless-netns cat /etc/resolv.conf, but anyhow, if root has the same problem then this is not important.

Possibly the resolv.conf config file parsing is broken and thus we have no upstream server to forward to.
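
To check by hand which upstream servers a resolv.conf actually yields, a minimal sketch along these lines can help. It is not aardvark-dns's parser; the function name `parse_nameservers` is hypothetical, and the only assumption is the usual resolv.conf convention that `nameserver <address>` lines list upstream servers in order and that lines starting with `#` or `;` are comments.

```python
from typing import List


def parse_nameservers(text: str) -> List[str]:
    """Return the addresses from `nameserver` lines, in file order."""
    servers = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        # Only `nameserver <address>` lines name an upstream server;
        # `search`, `options`, etc. are ignored here.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

Calling `parse_nameservers(open("/etc/resolv.conf").read())` inside the namespace of interest shows the order in which a forwarder reading that file would see its upstream servers.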

hardcore-sushi commented 2 months ago

Strangely, podman unshare --rootless-netns cat /etc/resolv.conf gives:

nameserver 169.254.0.1
nameserver 9.9.9.9

Luap99 commented 2 months ago

This is normal and expected.

hardcore-sushi commented 2 months ago

That's my bad. I had a dead DNS server at the top of my host resolv.conf. After deleting it, everything worked properly.

It looks like an issue with the DNS server fallback process.
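
A dead entry like this can be spotted by sending each listed server one query with a short timeout. The following is a rough sketch using only the Python standard library, not anything podman or aardvark-dns ships; `build_query` and `server_answers` are hypothetical names, the 1-second timeout is an arbitrary choice, and only IPv4 servers over UDP port 53 are handled.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack(">HH", 1, 1)


def server_answers(server: str, name: str = "podman.io", timeout: float = 1.0) -> bool:
    """Return True if `server` replies with a matching ID within `timeout` seconds."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return False
    # A reply counts only if it echoes the ID of the query that was sent.
    return reply[:2] == query[:2]
```

Running `server_answers(address)` for each address in resolv.conf, top to bottom, shows which entries never reply; any entry that returns False ahead of a working one forces a forwarder to wait out its timeout first.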

Luap99 commented 2 months ago

I looked closer at this and fallback seems to be working, but only in theory. In practice, the timeout aardvark-dns uses before trying the next server is higher than the client's, so the client just gave up, and by the time we finally sent the response the client socket was already closed.

Luap99 commented 2 months ago

The default timeout we use is 5s, and most clients use the same, so if we wait 5s for each server the client never gets a chance to resolve correctly. I think we should lower the timeout, to something like 1s at least.
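
The arithmetic behind this can be sketched with a small model. It is a simplification, not aardvark-dns's actual code: it assumes upstream servers are tried strictly in order, that a dead server costs the full per-server timeout, and that a live server replies after a fixed latency (the 0.05s default is a made-up figure). The function names are hypothetical.

```python
from typing import List, Optional


def time_to_answer(servers_alive: List[bool], per_server_timeout: float,
                   live_latency: float = 0.05) -> Optional[float]:
    """Seconds until a sequential forwarder gets an answer, or None if none is alive."""
    elapsed = 0.0
    for alive in servers_alive:
        if alive:
            return elapsed + live_latency
        # A dead server costs the full timeout before the next one is tried.
        elapsed += per_server_timeout
    return None


def client_gets_answer(servers_alive: List[bool], per_server_timeout: float,
                       client_timeout: float, live_latency: float = 0.05) -> bool:
    """True if the forwarded answer arrives before the client stops waiting."""
    answer_time = time_to_answer(servers_alive, per_server_timeout, live_latency)
    return answer_time is not None and answer_time < client_timeout


# One dead server ahead of a working one, client waits 5s:
print(client_gets_answer([False, True], per_server_timeout=5.0, client_timeout=5.0))  # False
print(client_gets_answer([False, True], per_server_timeout=1.0, client_timeout=5.0))  # True
```

Under these assumptions, a 5s per-server timeout leaves a 5s client no room for even one dead server, while a 1s timeout lets the answer arrive after roughly 1s.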