skynetservices / skydns

DNS service discovery for etcd
MIT License
2.2k stars 307 forks source link

Check/test issues like message too big and truncation #45

Open miekg opened 10 years ago

miekg commented 10 years ago

Right now it is assumed the requester's buffer is large enough. We should explicitly check for this in the code,

tmuxr commented 10 years ago

@miekg I believe the behavior I'm seeing relates to this issue.

I am running skydns in a docker container, which communicates successfully with etcd running in another docker container on a different server in my LAN.

Skydns is included in my project via git submodule, currently at 24a11ee15b73c0bd46b1766684ef3bf736a73e48 (Wed Jul 23 07:39:18 2014 +0100), which includes the 'Set the TC bit when the message gets too large' commit you mentioned in #69

Simple queries are successful:

~ # dig @172.17.0.2 hostname.subdomain.tld +short
10.1.2.3
~ #

Wildcard queries with large results are truncated, but TCP retry hangs until dig times out:

~ # dig @172.17.0.2 *.tld
;; Truncated, retrying in TCP mode.

; <<>> DiG 9.9.5-3-Ubuntu <<>> @172.17.0.2 *.tld
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
~ # 
~ # dig @172.17.0.2 *.subdomain.tld
;; Truncated, retrying in TCP mode.

; <<>> DiG 9.9.5-3-Ubuntu <<>> @172.17.0.2 *.tld
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
~ # 

For one such query, the traffic on docker0 of skydns container host looks like this (tcpdump output edited for legibility):

0.547130 IP 172.17.42.1.59876 > 172.17.0.2.53: 37656+ [1au] A? *.tld. (45)
                   /* Lots of TCP traffic between skydns and etcd */
0.726078 IP 172.17.0.2.53 > 172.17.42.1.59876: 37656*| 3823/0/0 /* Very large UDP response with lots of A records */ (61202)
0.728864 IP 172.17.42.1.49820 > 172.17.0.2.53: Flags [S]
0.728891 IP 172.17.0.2.53 > 172.17.42.1.49820: Flags [S.]
0.728909 IP 172.17.42.1.49820 > 172.17.0.2.53: Flags [.]
0.729088 IP 172.17.42.1.49820 > 172.17.0.2.53: Flags [P.] length 4739719+ [1au] A? *.tld. (45)
0.729117 IP 172.17.0.2.53 > 172.17.42.1.49820: Flags [.]
                   /* Nothing after this besides background noise - googlecast mDNS, icmp6 router solicitations */

I can tell that both 53/udp and 53/tcp are responsive, but it looks like skydns doesn't attempt to resolve the query against etcd when it is submitted via 53/tcp. It accepts the tcp handshake and acknowledges the request segment, then does nothing.

I am happy to supply additional detail as needed. Do you have any wisdom or intuition to share on this?

miekg commented 10 years ago

Try some simple debugging with println? The entry point in skydns should be ServeDNS in server.go line 79.

I may have a small hunch that I might be try setting the TC bit on tcp queries, but that should at least get you a reply.

miekg commented 10 years ago

To be sure: dig @172.17.0.2 hostname.subdomain.tld +short +tcp does work?

miekg commented 10 years ago

Pushed 20cbd0b which does not set a low bufsize for TCP queries, but that should not impact and be unrelated to this.

tmuxr commented 10 years ago

Thanks for swift response.

To be sure: dig @172.17.0.2 hostname.subdomain.tld +short +tcp does work?

With the container as described yesterday, It fails:

~ # dig @172.17.0.2 hostname.subdomain.tld +short
10.1.2.3
~ # dig @172.17.0.2 hostname.subdomain.tld +short +tcp

; <<>> DiG 9.9.5-3-Ubuntu <<>> @172.17.0.2 hostname.subdomain.tld +short +tcp
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
~ # 

During the TCP dig, traffic on docker0 is thus:

30.322331 IP digclient.46695 > skydns.53: Flags [S]
30.322362 IP skydns.53 > digclient.46695: Flags [S.]
30.322381 IP digclient.46695 > skydns.53: Flags [.]
30.322450 IP digclient.46695 > skydns.53: Flags [P.] A? hostname.subdomain.tld. (69)
30.322462 IP skydns.53 > digclient.46695: Flags [.]
             /* etcd and skydns exchange TCP ack's at 1-second intervals, all with length 0 */
40.322456 IP digclient.46695 > skydns.53: Flags [F.]
40.322525 IP digclient.39399 > skydns.53: Flags [S]
40.322547 IP skydns.53 > digclient.39399: Flags [S.]
40.322567 IP digclient.39399 > skydns.53: Flags [.]
40.322624 IP digclient.39399 > skydns.53: Flags [P.] A? hostname.subdomain.tld. (69)
40.322634 IP skydns.53 > digclient.39399: Flags [.]
40.359240 IP skydns.53 > digclient.46695: Flags [.]
             /* etcd and skydns exchange TCP ack's at 1-second intervals, all with length 0 */
50.322659 IP digclient.39399 > skydns.53: Flags [F.]
50.322732 IP digclient.50137 > skydns.53: Flags [S]
50.322755 IP skydns.53 > digclient.50137: Flags [S.]
50.322776 IP digclient.50137 > skydns.53: Flags [.]
50.322833 IP digclient.50137 > skydns.53: Flags [P.] A? hostname.subdomain.tld. (69)
50.322843 IP skydns.53 > digclient.50137: Flags [.]
50.359240 IP skydns.53 > digclient.39399: Flags [.]
             /* etcd and skydns exchange TCP ack's at 1-second intervals, all with length 0 */
09:03:00.322897 IP digclient.50137 > skydns.53: Flags [F.]
09:03:00.359250 IP skydns.53 > digclient.50137: Flags [.]

I've pulled 3a47b87 and rebuilt the container. Behavior is identical.

I'll try rebuilding with some println debugging, per your suggestion.

miekg commented 10 years ago

So basically TCP never works in your setup?

If you run another TCP service in this setup that one does work? I.e. I'm not ready believing this is a skydns problem, yet On 25 Jul 2014 15:24, "TarantulaZ" notifications@github.com wrote:

Thanks for swift response.

To be sure: dig @172.17.0.2 hostname.subdomain.tld +short +tcp does work?

With the container as described yesterday, It fails:

~ # dig @172.17.0.2 hostname.subdomain.tld +short 10.1.2.3 ~ # dig @172.17.0.2 hostname.subdomain.tld +short +tcp

; <<>> DiG 9.9.5-3-Ubuntu <<>> @172.17.0.2 hostname.subdomain.tld +short +tcp ; (1 server found) ;; global options: +cmd ;; connection timed out; no servers could be reached ~ #

During the TCP dig, traffic on docker0 is thus:

30.322331 IP digclient.46695 > skydns.53: Flags [S] 30.322362 IP skydns.53 > digclient.46695: Flags [S.] 30.322381 IP digclient.46695 > skydns.53: Flags [.] 30.322450 IP digclient.46695 > skydns.53: Flags [P.] A? hostname.subdomain.tld. (69) 30.322462 IP skydns.53 > digclient.46695: Flags [.] /* etcd and skydns exchange TCP ack's at 1-second intervals, all with length 0 / 40.322456 IP digclient.46695 > skydns.53: Flags [F.] 40.322525 IP digclient.39399 > skydns.53: Flags [S] 40.322547 IP skydns.53 > digclient.39399: Flags [S.] 40.322567 IP digclient.39399 > skydns.53: Flags [.] 40.322624 IP digclient.39399 > skydns.53: Flags [P.] A? hostname.subdomain.tld. (69) 40.322634 IP skydns.53 > digclient.39399: Flags [.] 40.359240 IP skydns.53 > digclient.46695: Flags [.] / etcd and skydns exchange TCP ack's at 1-second intervals, all with length 0 / 50.322659 IP digclient.39399 > skydns.53: Flags [F.] 50.322732 IP digclient.50137 > skydns.53: Flags [S] 50.322755 IP skydns.53 > digclient.50137: Flags [S.] 50.322776 IP digclient.50137 > skydns.53: Flags [.] 50.322833 IP digclient.50137 > skydns.53: Flags [P.] A? hostname.subdomain.tld. (69) 50.322843 IP skydns.53 > digclient.50137: Flags [.] 50.359240 IP skydns.53 > digclient.39399: Flags [.] / etcd and skydns exchange TCP ack's at 1-second intervals, all with length 0 */ 09:03:00.322897 IP digclient.50137 > skydns.53: Flags [F.] 09:03:00.359250 IP skydns.53 > digclient.50137: Flags [.]

I've pulled 3a47b87 and rebuilt the container. Behavior is identical.

I'll try rebuilding with some println debugging, per your suggestion.

— Reply to this email directly or view it on GitHub https://github.com/skynetservices/skydns/issues/45#issuecomment-50156069 .

tmuxr commented 10 years ago

Well, I'm running sshd in the skydns container for convenience during development, and that works fine:

~ # ssh root@172.17.0.2
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-32-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
Last login: Fri Jul 25 14:44:19 2014 from 172.17.42.1
root@86510c9571d1:~# 
miekg commented 10 years ago

OK. The a println is ServeDNS should tell you more. Or debugging with gdb On 25 Jul 2014 15:46, "TarantulaZ" notifications@github.com wrote:

Well, I'm running sshd in the skydns container for convenience during development, and that works fine:

~ # ssh root@172.17.0.2 Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-32-generic x86_64)

— Reply to this email directly or view it on GitHub https://github.com/skynetservices/skydns/issues/45#issuecomment-50159621 .

miekg commented 10 years ago

Also we should add the OPT RR even when we truncate. I thin there needs to be support for this in go dns base library.

mariusgrigaitis commented 9 years ago

Maybe this is related: https://github.com/kelseyhightower/confd/issues/285

davidxia commented 8 years ago

Not entirely sure if this is related, but we saw this issue: https://github.com/spotify/helios/pull/899

miekg commented 8 years ago

[ Quoting notifications@github.com in "Re: [skynetservices/skydns] Check/t..." ]

Not entirely sure if this is related, but we saw this issue: https://github.com/spotify/helios/pull/899

"When this type of response reaches skydns, skydns blows up and doesn't tell the client about the error."

Yes, that is a bug that needs to be fixed. Thought I propagate this error back to the client, apparently not.

/Miek

Miek Gieben

davidxia commented 8 years ago

@miekg Thanks for replying. A confirmation that this is indeed the issue and a fix would be amazing.

ghost commented 7 years ago

There is a definite ongoing problem with SkyDNS that really upsets Ruby's Resolv::DNS class.

Ruby sends no EDNS0 information about UDP message size so according to the rfc1035 the response UDP message must be 512 bytes only. It does, however, honour retry over TCP if it receives a UDP message with the TC (Truncated) flag set and it does this automatically.

The ongoing problems are.

SkyDNS returns a larger UDP response irrespective of the fact no EDNS0 tags are in the request, not standards compliant but if we assume we are only interested in the flags and various information at the beginning of the message this shouldn't be too much of a problem.

Ruby, in lib/resolv.rb, insists on reading all answers claimed by the ANCOUNT field provided even when sent a fragment. It seems that SkyDNS sends a count matching all records it knows about and not the amount that it managed to Fit in the response.

In order for Ruby to work at all with the fragmented response and move on to the TCP behaviour it needs both the 512 limit to be observed and the ANCOUNT to reflect the number of answers in the message itself.

I posted a workaround to the ruby:trunk bugtacker here yesterday https://bugs.ruby-lang.org/issues/13513 with a Monkey patch that works around this problem but really the problem looks to stem from SkyDNS behaviour.

miekg commented 7 years ago

[ Quoting notifications@github.com in "Re: [skynetservices/skydns] Check/t..." ]

There is a definite ongoing problem with SkyDNS that really upset's Ruby's Resolv::DNS class.

If looked at the ruby code a long time back and it was really basic at time time.

Ruby sends no EDNS0 information about UDP message size so according to the rfc1035 the response UDP message must be 512 bytes only. It does, however, honour retry over TCP if it receives a UDP message with the TC (Truncated) flag set and it does this automatically.

OK, that's good

The ongoing problems are.

SkyDNS returns a larger UDP response irrespective of the fact no EDNS0 tags are in the request, not standards compliant but if we assume we are only interested in the flags and various information at the beginning of the message this shouldn't be too much of a problem.

technically, the client's buffer is only so large, although SkyDNS shouldn't it's not the end of the world.

Ruby, in lib/resolv.rb, insists on reading all answers claimed by the ANCOUNT field provided even when sent a fragment. It seems that SkyDNS sends a count matching all records it knows about and not the amount that it managed to Fit in the response.

That is both a bug in SkyDNS and ruby (and for ruby it might also be a resources dos)

In order for Ruby to work at all with the fragmented response and move on to the TCP behaviour it needs both the 512 limit to be observed and the ANCOUNT to reflect the number of answers in the message itself.

I posted a workaround to the ruby:trunk bugtacker here yesterday https://bugs.ruby-lang.org/issues/13513 with a Monkey patch that works around this problem but really the problem looks to stem from SkyDNS behaviour.

I believe CoreDNS has better, truncation behavior.

/Miek

-- Miek Gieben

ghost commented 7 years ago

The Ruby default behaviour when you consider it closer it isn't buggy.

Let's consider the expectation from a truncated response with standard (non EDNS) UDP.

The server should send a response with all records it includes being the count of whatever fields recorded and within only a 512 byte UDP section of the frame. Noting that it sets TC to indicate that it was truncated and there's more but the error code it NO error because the response is legitimate for some purposes.

It's actually up to the application to determine if this is suitable and to use the truncated UDP response.

The rfc's state that mail servers must NOT use truncated responses with MX records in for good reason but beyond that the reponse may be suitable and should stand alone as a properly formatted response with the correct counts in the respective fields for all the queries and answers given.

Now, many libraries like Ruby's standard library take the TC field as an indication that they should for the convenience of the developer proceed to using TCP to make the query. This does not invalidate the necessity for correctness in the UDP response for applications that deem that the truncated response is appropriate.

Ruby's approach may lack robustness for malformed responses but if the responses are rfc compliant then there is no problem.

ghost commented 7 years ago

Additionally the issue around UDP frames larger than 512 bytes for non EDNS0 responses could cause odd problems for client libraries that only read 512 bytes since the UDP section may well contain all the stated results and no TC set because it fits but they client system won't be able to read the results. This could cause all sorts of odd problems for such clients when they encounter responses that sit within that area of a larger than 512 byte UDP message and one that SkyDNS hasn't needed to truncate.

miekg commented 7 years ago

[ Quoting notifications@github.com in "Re: [skynetservices/skydns] Check/t..." ]

Additionally the issue around UDP frames larger than 512 bytes for non EDNS0 responses could cause odd problems for client libraries that only read 512 bytes since the UDP section may well contain all the stated results and no TC set because it fits but they client system won't be able to read the results. This could cause all sorts of odd problems for such clients when they encounter responses that sit within that area of a larger than 512 byte UDP message and one that SkyDNS hasn't needed to truncate.

Don't think the DNS standard actually says something about a client using a truncated query, except for seeing the error.

Could I ask you to check CoreDNS, it is (should) be more robust in this regards and I'm spending most of more spare time on that now a days.

ghost commented 7 years ago

I could test with CoreDNS but I can't impose that change here.

On the subject of Truncation the relevant bits are in rfc1123 section 6.1.3.2.

The suggestion is that the resolver SHOULD switch and retry with TCP on a truncated response but the discussion then states.

            "Whether it is possible to use a truncated answer
             depends on the application.  A mailer must not use a
             truncated MX response, since this could lead to mail
             loops.

             Responsible practices can make UDP suffice in the vast
             majority of cases.  Name servers must use compression
             in responses.  Resolvers must differentiate truncation
             of the Additional section of a response (which only
             loses extra information) from truncation of the Answer
             section (which for MX records renders the response
             unusable by mailers).  Database administrators should
             list only a reasonable number of primary names in lists
             of name servers, MX alternatives, etc."

In practice I think this is more likely to happen with less common resolver implementations or implementations of resovers that defer the decision to try TCP to the application author.

miekg commented 7 years ago

[ Quoting notifications@github.com in "Re: [skynetservices/skydns] Check/t..." ]

I could test with CoreDNS but I can't impose that change here.

Ack. Happy to take a PR for SkyDNS.

On the subject of Truncation the relevant bits are in rfc1123 section 6.1.3.2.

The suggestion is that the resolver SHOULD switch and retry with TCP on a truncated response but the discussion then states.

          "Whether it is possible to use a truncated answer
            depends on the application.  A mailer must not use a
            truncated MX response, since this could lead to mail
            loops.

            Responsible practices can make UDP suffice in the vast
            majority of cases.  Name servers must use compression
            in responses.  Resolvers must differentiate truncation
            of the Additional section of a response (which only
            loses extra information) from truncation of the Answer
            section (which for MX records renders the response
            unusable by mailers).  Database administrators should
            list only a reasonable number of primary names in lists
            of name servers, MX alternatives, etc."

Interesting. Thanks for the RFC reference

/Miek

-- Miek Gieben