containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

RFE: TCP DNS resolution fails with rootless container using pasta networking #23239

Closed satwell closed 1 month ago

satwell commented 1 month ago

Issue Description

When using pasta networking, TCP DNS queries to the podman network's DNS servers time out. UDP queries work fine. This seems to only affect pasta. Using slirp4netns instead works correctly.

This causes name resolution delays or failures when running anything in a container that happens to prefer TCP for DNS.

Steps to reproduce the issue

  1. Run dig with +tcp to force use of TCP:
    • podman run --rm --network=pasta alpine:latest sh -c "apk add --quiet --no-cache bind-tools && cat /etc/resolv.conf && time dig +tcp google.com"
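For anyone who wants to reproduce this without installing bind-tools, the same comparison can be sketched in plain Python. This is a minimal sketch, not part of the original report: the helper names (`build_query`, `frame_for_tcp`, `probe`) and the 3-second timeout are assumptions chosen for illustration. It sends one A-record query over UDP and one over TCP; the only wire-level difference is the two-byte length prefix that DNS over TCP requires (RFC 1035, section 4.2.2).

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def frame_for_tcp(message: bytes) -> bytes:
    """DNS over TCP prefixes each message with a two-byte length."""
    return struct.pack("!H", len(message)) + message


def probe(server: str, name: str, timeout: float = 3.0) -> dict:
    """Send the same query over UDP and TCP; report time taken or the error."""
    query = build_query(name)
    results = {}
    for proto in ("udp", "tcp"):
        start = time.monotonic()
        try:
            if proto == "udp":
                with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
                    s.settimeout(timeout)
                    s.sendto(query, (server, 53))
                    s.recvfrom(4096)
            else:
                with socket.create_connection((server, 53), timeout=timeout) as s:
                    s.sendall(frame_for_tcp(query))
                    if not s.recv(4096):
                        raise ConnectionError("closed without an answer")
            results[proto] = f"ok in {time.monotonic() - start:.2f}s"
        except OSError as exc:
            results[proto] = f"failed: {exc}"
    return results

# Example (run inside the container; 169.254.0.1 is the first nameserver
# in the resolv.conf shown in this report):
#   print(probe("169.254.0.1", "google.com"))
```

With the behaviour described in this report, the UDP entry would be expected to succeed and the TCP entry to fail with a timeout.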

Describe the results you received

dig times out trying to connect to the podman network's DNS server (169.254.0.1) and then falls back to the DNS servers copied from the host's config. The fallback succeeds, but the whole lookup takes about 30 seconds.

nameserver 169.254.0.1
nameserver 192.168.2.11
nameserver 192.168.2.12
;; Connection to 169.254.0.1#53(169.254.0.1) for google.com failed: timed out.
;; no servers could be reached

;; Connection to 169.254.0.1#53(169.254.0.1) for google.com failed: timed out.
;; no servers could be reached

;; Connection to 169.254.0.1#53(169.254.0.1) for google.com failed: timed out.

; <<>> DiG 9.18.27 <<>> +tcp google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37731
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 573eba1b2b6bd92e01000000668d76abf197fb31596b6303 (good)
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             104     IN      A       142.250.191.78

;; Query time: 1 msec
;; SERVER: 192.168.2.11#53(192.168.2.11) (TCP)
;; WHEN: Tue Jul 09 17:43:07 UTC 2024
;; MSG SIZE  rcvd: 83

real    0m 30.08s
user    0m 0.00s
sys     0m 0.00s

Describe the results you expected

dig +tcp should be able to resolve names promptly.

podman info output

host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 93.35
    systemPercent: 1.43
    userPercent: 5.21
  cpus: 2
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: server
    version: "40"
  eventLogger: journald
  freeLocks: 2048
  hostname: dhcp-248.disjoint.net
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.8.5-301.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 2062905344
  memTotal: 4101083136
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.11.0-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.11.0
    package: netavark-1.11.0-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.11.0
  ociRuntime:
    name: crun
    package: crun-1.15-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240624.g1ee2eca-1.fc40.x86_64
    version: |
      pasta 0^20240624.g1ee2eca-1.fc40.x86_64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-2.fc40.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 4100976640
  swapTotal: 4100976640
  uptime: 0h 39m 17.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/satwell/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/satwell/.local/share/containers/storage
  graphRootAllocated: 16039018496
  graphRootUsed: 2295742464
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/satwell/.local/share/containers/storage/volumes
version:
  APIVersion: 5.1.1
  Built: 1717459200
  BuiltTime: Mon Jun  3 17:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.3
  Os: linux
  OsArch: linux/amd64
  Version: 5.1.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

I can reproduce this on Fedora 40 and Fedora CoreOS 40. This is on a standard install with no other containers or extra services running.

Additional information

No response

Luap99 commented 1 month ago

From the pasta docs this is expected:

> --dns-forward addr: Map addr (IPv4 or IPv6) as seen from guest or namespace to the first configured DNS resolver (with corresponding IP version). Mapping is limited to UDP traffic directed to port 53, and DNS answers are translated back with a reverse mapping. This option can be specified zero to two times (once for IPv4, once for IPv6).

https://passt.top/builds/latest/web/passt.1.html
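To make the limitation concrete, here is a toy model of the mapping rule described in that man page excerpt. It is not pasta's code: `Packet`, `should_forward`, and the `tcp_supported` switch are hypothetical names invented for this sketch, and it assumes 169.254.0.1 is the address handed to `--dns-forward`, as the resolv.conf output in the report suggests.

```python
from dataclasses import dataclass

DNS_PORT = 53


@dataclass(frozen=True)
class Packet:
    proto: str      # "udp" or "tcp"
    dst_addr: str
    dst_port: int


def should_forward(pkt: Packet, dns_forward_addr: str,
                   tcp_supported: bool = False) -> bool:
    """Return True if the packet would be remapped to the host's resolver.

    tcp_supported=False models the documented behaviour (UDP only);
    tcp_supported=True models the behaviour requested in this issue.
    """
    # Only traffic aimed at the forwarded address on port 53 is a candidate
    if pkt.dst_addr != dns_forward_addr or pkt.dst_port != DNS_PORT:
        return False
    if pkt.proto == "udp":
        return True
    return pkt.proto == "tcp" and tcp_supported


udp_query = Packet("udp", "169.254.0.1", 53)
tcp_query = Packet("tcp", "169.254.0.1", 53)
print(should_forward(udp_query, "169.254.0.1"))
print(should_forward(tcp_query, "169.254.0.1"))
```

In this model the UDP query is remapped and the TCP query is not, so a TCP connection to the forwarded address has nothing answering it, which matches the timeouts in the report.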

I am not sure what it would take there to add tcp support.

cc @sbrivio-rh @dgibson

sbrivio-rh commented 1 month ago

> From the pasta docs this is expected

Right, yes, and that's just because users requested this feature for UDP, but never for TCP... until now.

> I am not sure what it would take there to add tcp support.

We need to merge this first, then it should be relatively simple to add. Give us some time.

I'd keep this ticket open to track the new feature (or we can file one on bugs.passt.top instead).

dgibson commented 1 month ago

> We need to merge this first, then it should be relatively simple to add. Give us some time.

We've now merged that, and I've made a draft fix here.

@satwell if you're able to test that, that would be great.

sbrivio-rh commented 1 month ago

This is now fixed in passt version 2024_07_26.57a21d2, and in the matching Fedora 40 update.
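For readers checking whether their installed pasta already includes the fix, the comparison can be sketched as a date check on the version string. This is a rough sketch under assumptions: the function names are made up for illustration, and the parsing only covers the two version shapes that appear in this thread (the upstream tag `2024_07_26.57a21d2` and the Fedora package version `0^20240624.g1ee2eca-1.fc40.x86_64`).

```python
import re
from datetime import date
from typing import Optional

# First passt release reported in this thread to include the fix
FIXED_IN = date(2024, 7, 26)


def release_date(version: str) -> Optional[date]:
    """Extract the release date embedded in a passt/pasta version string."""
    # Matches both "2024_07_26" and "20240624"
    match = re.search(r"(20\d{2})_?(\d{2})_?(\d{2})", version)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits were present but do not form a valid calendar date
        return None


def has_tcp_dns_fix(version: str) -> Optional[bool]:
    """True/False if the date can be compared, None if it cannot be parsed."""
    released = release_date(version)
    return None if released is None else released >= FIXED_IN


# The version from the podman info output in this report predates the fix
print(has_tcp_dns_fix("0^20240624.g1ee2eca-1.fc40.x86_64"))
```

A date comparison like this cannot tell whether a distribution has backported the patch to an older build, so treat a negative result as a hint to check the package changelog rather than as proof.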

Luap99 commented 1 month ago

Perfect timing, I just ran into it while adding TCP support in aardvark-dns: https://github.com/containers/aardvark-dns/pull/484