docker / for-mac

Bug reports for Docker Desktop for Mac
https://www.docker.com/products/docker#/mac

DNS proxy does not truncate UDP responses correctly #2160

Open matthiasr opened 7 years ago

matthiasr commented 7 years ago

Expected behavior

Any DNS record that can be resolved on the host can be resolved inside containers, reliably.

Large responses over UDP should be truncated, with the truncation flag set, so that clients know they should retry over TCP.

Actual behavior

Large DNS responses are very unreliable. At ~300 A records, roughly every other response packet never arrives. With a CNAME to the same 300 A records, several attempts are necessary until one succeeds; about half the time the retries are exhausted and the lookup fails. Looking up a CNAME to a CNAME to 300 A records almost never works, and neither do 400 A records in a single response.

We use DNS extensively, so large DNS records are not uncommon for us. This was even worse in 17.06 (none of the above cases worked at all); with 17.09, some of them work some of the time. The DNS server is behind a VPN.

I packet dumped this; I can share PCAPs privately if needed. I don't know of any public DNS servers that produce such large records – if you know any I'm happy to produce a clean PCAP that I can share.

What I observed is that for these large records, the DNS proxy falls back to TCP every time (as it should), while the container receives a UDP packet. In the stage where things get wonky, these packets are just under 1500 bytes, varying slightly in each response (presumably due to reordering of the records & compression).

For the records that only work from time to time, the ones that come through are at 1508 bytes including the Ethernet header. That means they're edging up on the 1500 byte MTU of the ethernet devices between the host and the VM, as well as the VM and the container. Based on the smaller records I assume that there is some spread of the response length, but only those responses that actually fall under the magic 1514 bytes make it through.
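
The frame-size arithmetic above can be checked with a small sketch. It assumes untagged Ethernet II frames (14-byte header, FCS not counted) and IPv4 without options; the report does not state either, so treat the constants as assumptions.

```python
# Sizes in bytes. Assumed: Ethernet II without VLAN tag, IPv4 without options.
ETHERNET_HEADER = 14
IPV4_HEADER = 20
UDP_HEADER = 8
MTU = 1500


def max_frame_size(mtu: int = MTU) -> int:
    """Largest Ethernet frame (FCS excluded) carrying one unfragmented IP packet."""
    return mtu + ETHERNET_HEADER


def max_dns_payload(mtu: int = MTU) -> int:
    """Largest DNS message that fits in one unfragmented IPv4/UDP datagram."""
    return mtu - IPV4_HEADER - UDP_HEADER


def dns_payload_of_frame(frame_len: int) -> int:
    """DNS message size inside a captured frame of the given length."""
    return frame_len - ETHERNET_HEADER - IPV4_HEADER - UDP_HEADER


print(max_frame_size())            # 1514, the "magic" limit mentioned above
print(max_dns_payload())           # 1472
print(dns_payload_of_frame(1508))  # 1466, one of the responses that got through
```

Under these assumptions, any DNS response larger than 1472 bytes cannot cross a 1500-byte MTU link in a single UDP datagram.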

As far as I can see, the problem is that Docker for Mac sends too-large packets instead of truncating the DNS response and offering TCP fallback.
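
For illustration, here is a minimal sketch of the truncation behaviour I would expect from a forwarder. This is not the actual Docker for Mac proxy code; `truncate_for_udp`, `skip_name` and the 512-byte default limit are hypothetical choices for this example (a real forwarder would use the size the client advertises via EDNS, if any). Oversized responses are reduced to header plus question with the TC bit set, so the client retries over TCP.

```python
import struct

TC_BIT = 0x0200  # truncation flag in the 16-bit flags field of the DNS header


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past the (possibly compressed) name at `offset`."""
    while True:
        length = msg[offset]
        if length == 0:             # root label terminates the name
            return offset + 1
        if length & 0xC0 == 0xC0:   # 2-byte compression pointer terminates the name
            return offset + 2
        offset += 1 + length


def truncate_for_udp(msg: bytes, limit: int = 512) -> bytes:
    """Return msg unchanged if it fits, else header + question with TC set."""
    if len(msg) <= limit:
        return msg
    ident, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4  # skip QTYPE and QCLASS
    header = struct.pack("!6H", ident, flags | TC_BIT, qdcount, 0, 0, 0)
    return header + msg[12:offset]


# Synthetic response: one question plus 300 compressed A records (16 bytes each).
question = b"\x07example\x03com\x00" + struct.pack("!2H", 1, 1)
record = b"\xc0\x0c" + struct.pack("!2HIH", 1, 1, 60, 4) + bytes([10, 0, 0, 1])
big = struct.pack("!6H", 0x1234, 0x8180, 1, 300, 0, 0) + question + record * 300
out = truncate_for_udp(big)
print(len(big), len(out))  # 4829 29
```

Dropping all records is the simplest variant; returning as many whole records as fit is also possible, as long as the TC bit is set.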

Information

Docker for Mac: version: 17.09.0-ce-mac35 (69202b202f497d4b6e627c3370781b9e4b51ec78)
macOS: version 10.11.6 (build: 15G1611)
logs: /tmp/EE9D4130-CC1E-48FA-AC53-DD8616C1F700/20171019-161738.tar.gz
[OK]     db.git
[OK]     vmnetd
[OK]     dns
[OK]     driver.amd64-linux
[OK]     virtualization VT-X
[OK]     app
[OK]     moby
[OK]     system
[OK]     moby-syslog
[OK]     db
[OK]     env
[OK]     virtualization kern.hv_support
[OK]     slirp
[OK]     osxfs
[OK]     moby-console
[OK]     logs
[OK]     docker-cli
[OK]     menubar
[OK]     disk

Diagnostic ID: EE9D4130-CC1E-48FA-AC53-DD8616C1F700

See below the fold.

Steps to reproduce the behavior

reproduction/packet dump script

```
#!/usr/bin/env bash

set -x

d="$(date +%Y-%m-%d_%H:%M)"

exec &> >(tee "log.txt")

sudo true

docker version

cat > Dockerfile << 'EOF'
FROM ubuntu
RUN apt-get update
RUN apt-get install -y dnsutils iproute2 iputils-ping tcpdump curl
RUN apt-get clean
EOF

docker build -t dnstest .

docker ps -a --filter label=dnstest -q | xargs docker rm -f

docker run --rm --name dnstest-0 -l dnstest --net host -v "$(pwd):/mnt" dnstest tcpdump -i eth0 -w "/mnt/${d}.docker-vm.pcap" port 53 &
docker run --rm --name dnstest-1 -l dnstest -v "$(pwd):/mnt" dnstest tcpdump -w "/mnt/${d}.container.pcap" port 53 &
sudo tcpdump -i utun0 -w "${d}.mac-host.pcap" port 53 &

sleep 10

host=300-a-records.example.com
docker run --rm --name "dnstest-$host-nslookup" -l dnstest --network container:dnstest-1 dnstest bash -c "for i in `seq 1 10 | xargs`; do nslookup '$host'; done"

host=cname-to-300-a-records.example.com
docker run --rm --name "dnstest-$host-nslookup" -l dnstest --network container:dnstest-1 dnstest bash -c "for i in `seq 1 10 | xargs`; do nslookup '$host'; done"

host=cname-to-cname-to-300-a-records.example.com
docker run --rm --name "dnstest-$host-nslookup" -l dnstest --network container:dnstest-1 dnstest bash -c "for i in `seq 1 10 | xargs`; do nslookup '$host'; done"

sleep 10

kill %sudo
docker ps -a --filter label=dnstest -q | xargs docker stop
```

matthiasr commented 7 years ago

I didn't think to test this earlier, but it is easy to confirm that TCP lookups work (dig +tcp huge-record.example.com reliably works), they just never get triggered because no truncated UDP responses arrive.
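
To make the fallback path concrete, here is a rough sketch of what a stub resolver does with the TC bit. It is not taken from any particular resolver; `is_truncated`, `frame_for_tcp`, `recv_exact` and `query` are hypothetical helper names. The only protocol facts used are the TC bit position in the header and the two-byte length prefix that DNS uses over TCP.

```python
import socket
import struct


def is_truncated(response: bytes) -> bool:
    """True if the TC bit is set in the DNS header flags."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0200)


def frame_for_tcp(message: bytes) -> bytes:
    """DNS over TCP prefixes each message with its length as a 16-bit integer."""
    return struct.pack("!H", len(message)) + message


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return buf


def query(message: bytes, server: str, port: int = 53, timeout: float = 3.0) -> bytes:
    """Send a raw DNS query over UDP; retry over TCP only if TC is set."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(message, (server, port))
        response, _ = udp.recvfrom(65535)
    if not is_truncated(response):
        return response  # without TC the client has no reason to retry
    with socket.create_connection((server, port), timeout=timeout) as tcp:
        tcp.sendall(frame_for_tcp(message))
        (length,) = struct.unpack("!H", recv_exact(tcp, 2))
        return recv_exact(tcp, length)
```

The TCP branch is only reached when a UDP response with TC set actually arrives, which is exactly what is missing here.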

matthiasr commented 7 years ago

PPS: it is possible that there was no actual change that relates to this between 17.06 and 17.09, as the record that this most hurts on is just on the edge and may have been a little smaller in the past.

alexandruionica commented 6 years ago

I encountered this bug last week with docker-ce=17.09.0~ce-0~ubuntu. It was observed with a DNS record having 32 entries. When a client outside of Docker tries to resolve that name, it gets a reply marked as truncated, so it switches to a TCP connection and queries the server again. When using Docker and user mode networking, the initial request from a client (in a container) is answered over UDP with 30 answers, so 2 random entries are dropped.

docker-robott commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale comment. Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows. /lifecycle stale

alexandruionica commented 6 years ago

/remove-lifecycle stale

jijojv commented 6 years ago

ran into this bug testing git-lfs on centos 7

dziemba commented 6 years ago

/remove-lifecycle stale

dziemba commented 6 years ago

This is still a huge issue for us; it prevents any engineer in our company from using docker-for-mac for development.

I created a testing DNS server to make it easier for everybody to recreate the scenario:

dig A smalldns.test.dziemba.net
dig A hugedns.test.dziemba.net
dig +tcp A hugedns.test.dziemba.net

All these commands should run fine. The smalldns set contains 3 records, hugedns contains 420.

These commands work fine directly on Linux/macOS and with a docker-machine/virtualbox docker setup under macOS. Only on docker-for-mac (tested on 18.06.1-ce-mac73) do the issues described above appear.

Let me know if you need any further information!

Habbie commented 6 years ago

If somebody is going to look at the DNS proxy, here's another bug:

Outside Docker:

$ dig a smalldns.test.dziemba.net

; <<>> DiG 9.12.2-P1 <<>> a smalldns.test.dziemba.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33506
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;smalldns.test.dziemba.net. IN  A

;; AUTHORITY SECTION:
test.dziemba.net.   5   IN  SOA test.dziemba.net. admin.example.com. 5 30 30 30 30

;; Query time: 34 msec
;; SERVER: 62.179.104.196#53(62.179.104.196)
;; WHEN: Tue Nov 27 20:47:07 CET 2018
;; MSG SIZE  rcvd: 107

Note the NOERROR status, which is correct - the name exists, just the type does not.

Inside a docker container (debian:9 with apt-get install dnsutils):

# dig A smalldns.test.dziemba.net +tcp

; <<>> DiG 9.10.3-P4-Debian <<>> A smalldns.test.dziemba.net +tcp
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 431
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;smalldns.test.dziemba.net. IN  A

;; Query time: 32 msec
;; SERVER: 192.168.65.1#53(192.168.65.1)
;; WHEN: Tue Nov 27 19:46:37 UTC 2018
;; MSG SIZE  rcvd: 43

Note the NXDOMAIN status which is wrong - the name does exist!
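
A small sketch of the distinction, for anyone looking at the forwarder: the RCODE lives in the low four bits of the header flags, and a NOERROR response with zero answers ("NODATA") means something different from NXDOMAIN. The `describe` helper is hypothetical, and the two headers are rebuilt by hand from the dig output above (ids, flags `qr rd ra`, section counts), not taken from a capture.

```python
import struct

RCODE_NAMES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def describe(response: bytes) -> str:
    """Summarise the RCODE and answer count of a raw DNS response."""
    _ident, flags, _qdcount, ancount = struct.unpack("!4H", response[:8])
    rcode = flags & 0x000F  # RCODE is the low 4 bits of the flags field
    name = RCODE_NAMES.get(rcode, f"RCODE{rcode}")
    if rcode == 0 and ancount == 0:
        return "NOERROR/NODATA: name exists, no records of the queried type"
    if rcode == 3:
        return "NXDOMAIN: name does not exist"
    return f"{name}: {ancount} answer(s)"


# Headers only: id, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
outside = struct.pack("!6H", 33506, 0x8180, 1, 0, 1, 1)  # qr rd ra, NOERROR
inside = struct.pack("!6H", 431, 0x8183, 1, 0, 0, 0)     # qr rd ra, NXDOMAIN

print(describe(outside))
print(describe(inside))
```

A forwarder that relays the upstream response should pass the RCODE and authority section through unchanged rather than turning an empty answer into NXDOMAIN.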

Habbie commented 5 years ago

Docker Desktop (Mac), Community, Version 2.0.0.3 (31259), Channel: stable, 8858db33c8, Engine: 18.09.2.

Issue still present.

dziemba commented 5 years ago

/remove-lifecycle stale

djs55 commented 5 years ago

@matthiasr thanks very much for the clear bug report and repro scripts. I believe the error is in the DNS forwarder inside https://github.com/moby/vpnkit so I've created a candidate fix and a unit test for it.

I'll keep you all informed of progress.

djs55 commented 5 years ago

There's a development build with the candidate fix in, if you'd like to try it: https://download-stage.docker.com/mac/edge/32461/Docker.dmg

When I try @dziemba's example queries, the UDP -> TCP fallback seems to work:

/ # dig SRV hugedns.test.dziemba.net
;; Truncated, retrying in TCP mode.
...
;; ANSWER SECTION:
hugedns.test.dziemba.net. 7 IN  SRV 0 5 8080 www404.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.example.com.
...

However I notice only 55 records are returned rather than 420. So I think the TCP fallback issue is fixed but there appears to be a separate issue about the total number of records returned.

Edited to add: The separate issue about the total number of records returned is Mac-specific. On Windows I see the full 420 records.

pievis commented 4 years ago

The error is still present in the latest version and makes it impossible to work with VPN settings; no working workaround has been found yet. @djs55 sadly your build link is broken :(

details:

{
  "id": "750BBD41-DE00-4A94-B1AD-4B026218DE43",
  "date": "2020-11-10 19:52:06.008933 +0000 UTC",
  "os": "macOS 10.15.7",
  "os_label": "osx/10.15.x",
  "app_version": "2.5.0.1",
  "app_channel": "stable",
  "engine_version": "19.03.13",
  "compose_version": "1.27.4",
  "kubernetes_version": "v1.19.3",
  "credhelper_version": "0.6.3",
  "notary_version": "0.6.1",
  "vpnkit_version": "ea9dbeaf887f5dad8391f4a34d127501fb6bbf64",
  "hyperkit_version": "v0.20200224-44-gb54460"
}

kevinAlbs commented 3 months ago

I may be experiencing a related issue. When running an SRV lookup in a Docker container on a host using Cloudflare WARP, results appear to be missing.

% warp-cli --version
warp-cli 2024.6.474.0
% docker run --rm -it alpine:3.19
/ # apk add --quiet bind-tools
/ # dig +short SRV _mongodb._tcp.test1.kevinalbs.com | wc -l
9

30 records are expected. When run on the host:

 dig +short SRV _mongodb._tcp.test1.kevinalbs.com | wc -l
      30

A packet capture on the container suggests the results are truncated; however, the TC bit is not set.