aharelick closed this issue 2 years ago
Could you please verify if the problem still persists with the latest version of Docker Desktop?
This might be related to docker/for-win#12018, although we have not yet managed to reproduce that one.
@joe0BAB, yup I can confirm that I'm seeing the same issue on version 4.1.1.
@stephen-turner, that issue does seem very similar to what I described. Let me see if I can put together a reproduction.
Alright, I believe I was able to reproduce this with a public domain. I own a random domain, freespaceheaters.com, so I put a very long TXT record on it that everyone should be able to query and get the same result from.
~ dig freespaceheaters.com TXT
; <<>> DiG 9.16.4 <<>> freespaceheaters.com TXT
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32270
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;freespaceheaters.com. IN TXT
;; ANSWER SECTION:
freespaceheaters.com. 255 IN TXT "superlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesup" "erlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperl" "ongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvaluesuperlongtextvalue"
freespaceheaters.com. 1755 IN TXT "v=spf1 include:spf.efwd.registrar-servers.com ~all"
;; Query time: 7 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Thu Oct 14 12:36:14 PDT 2021
;; MSG SIZE rcvd: 811
The response is well over 512 bytes. The exact container image that we're using is ruby:2.3-alpine3.7, so you can do:
$ docker run --rm -it ruby:2.3-alpine3.7 sh
Then, in the shell, run irb to get the interactive Ruby console.
Then you can run:
require 'resolv' # load the DNS library
# This call should return the proper output with "superlongtextvaluesuper....", because it names the Cloudflare DNS server explicitly and bypasses the Docker proxy (feel free to replace 1.1.1.1 with 8.8.8.8 or your public DNS resolver of choice).
Resolv::DNS.new(:nameserver => ['1.1.1.1']).getresources "freespaceheaters.com", Resolv::DNS::Resource::IN::TXT
# This call should raise an error, because it goes through the proxy. I don't think the exception itself is meaningful; it's just a byproduct of the library failing to parse the response.
Resolv::DNS.new.getresources "freespaceheaters.com", Resolv::DNS::Resource::IN::TXT
Let me know if any of that isn't working as expected.
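For anyone who wants to sanity-check the size claim without touching the network, here is a minimal sketch that rebuilds a reply shaped like the dig output above using the resolv library's own message classes and measures its encoded size. The variable names (`long_value`, `chunks`, `reply`, `size`) are made up for this example, and the number will not match dig's MSG SIZE exactly, since dig's figure also counts the OPT pseudo-record.

```ruby
require 'resolv'

# Rebuild a reply shaped like the dig output above, without any network access.
# A single TXT character-string is capped at 255 bytes, so the long value
# has to be split into chunks, just as dig displays it.
long_value = 'superlongtextvalue' * 38
chunks = long_value.scan(/.{1,255}/)

reply = Resolv::DNS::Message.new(32270) # id copied from the dig header
reply.qr = 1                            # mark the message as a response
reply.add_question('freespaceheaters.com', Resolv::DNS::Resource::IN::TXT)
reply.add_answer('freespaceheaters.com', 255,
                 Resolv::DNS::Resource::IN::TXT.new(*chunks))
reply.add_answer('freespaceheaters.com', 1755,
                 Resolv::DNS::Resource::IN::TXT.new('v=spf1 include:spf.efwd.registrar-servers.com ~all'))

size = reply.encode.bytesize
puts "encoded reply: #{size} bytes (classic UDP limit: 512)"
```

The long TXT record alone already pushes the message past the 512-byte ceiling that applies to UDP replies when the client has not negotiated a larger size.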
Thanks for that, @aharelick. I can reproduce your result.
I'm now not convinced this is the same bug as docker/for-win#12018 because I can ping or nslookup freespaceheaters.com from the shell of an alpine container. My current belief is that your bug was probably caused by the same change on our end, but is a different bug, specific to Ruby.
That sounds right to me: probably a specific bug on the Docker side that affects DNS resolvers differently depending on what they support and how they handle large packets. Is it fair to consider this a Docker bug at this point, or is there anything else I can provide?
We have enough to investigate, thank you.
Apologies for not filling out the template; this is more of a request for information, and I'm not exactly sure what the proper expected behavior should be. The change we've noticed is that in version 3.6 and later of Docker for Mac we are unable to resolve certain domain names. We're using Ruby 2.3 and the resolv library running in an Alpine container (ruby:2.3-alpine3.7). However, we can reproduce the issue with Debian as well.
The issue only happens when the DNS responses are large. We assume that this has something to do with EDNS support, but there are a ton of issues/tickets floating around on that topic. It would be helpful to know what changes were made, so we can tell whether we should file a bug report here or in the Ruby project, or whether this is just expected behavior. Specifically, this line in the release notes caught our eye:
The actual problem we're encountering is that the Ruby resolv library doesn't support EDNS, so it can't handle UDP responses that are larger than 512 bytes. It seems that before Docker for Mac 3.6, something (possibly the Docker DNS proxy) would recognize that the response was larger than 512 bytes and truncate it. The resolver would then retry the request over TCP. Here is a TCP dump of the DNS request sent in Docker for Mac 3.5.2.
It seems that with version 3.6, packets larger than 512 bytes are being allowed through, and since the resolv library doesn't support EDNS, it has trouble parsing them. Here's an example of the TCP dump from version 3.6; in this case the packet isn't truncated, so the resolver doesn't fall back to TCP:
I guess the expected behavior would be that if the DNS requester (in this case the Ruby library) doesn't support EDNS, then the Docker for Mac proxy would continue to truncate at 512 bytes. It's very possible, though, that I'm misunderstanding the specifics of EDNS or how the proxy fits into this.
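To make that expectation concrete, here is a rough sketch of the decision I have in mind, written with the resolv message classes. This is not Docker's actual proxy code; `udp_limit_for`, `reply_for_udp`, and the `edns_udp_size` parameter are hypothetical names, and the sketch assumes the proxy already knows whether the query carried an OPT record and which payload size it advertised.

```ruby
require 'resolv'

CLASSIC_UDP_LIMIT = 512 # RFC 1035 ceiling for UDP replies when the query carries no OPT record

# edns_udp_size is the payload size advertised in the query's OPT record,
# or nil when the client sent no OPT record at all (hypothetical parameter).
def udp_limit_for(edns_udp_size)
  return CLASSIC_UDP_LIMIT if edns_udp_size.nil?
  [edns_udp_size, CLASSIC_UDP_LIMIT].max # RFC 6891: values below 512 are treated as 512
end

# Returns the bytes to send back over UDP: the full reply if it fits,
# otherwise an answer-less copy with the TC (truncated) flag set.
def reply_for_udp(reply, edns_udp_size)
  encoded = reply.encode
  return encoded if encoded.bytesize <= udp_limit_for(edns_udp_size)

  stub = Resolv::DNS::Message.new(reply.id)
  stub.qr = 1
  stub.rd = reply.rd
  stub.ra = reply.ra
  stub.tc = 1
  reply.each_question { |name, typeclass| stub.add_question(name, typeclass) }
  stub.encode
end

# Stand-in for the oversized TXT reply discussed in this issue.
reply = Resolv::DNS::Message.new(1)
reply.qr = 1
reply.add_question('freespaceheaters.com', Resolv::DNS::Resource::IN::TXT)
reply.add_answer('freespaceheaters.com', 255,
                 Resolv::DNS::Resource::IN::TXT.new('x' * 255, 'y' * 255, 'z' * 170))

no_edns = Resolv::DNS::Message.decode(reply_for_udp(reply, nil))
with_edns = Resolv::DNS::Message.decode(reply_for_udp(reply, 1232))
puts "no EDNS:   tc=#{no_edns.tc} answers=#{no_edns.answer.size}"
puts "EDNS 1232: tc=#{with_edns.tc} answers=#{with_edns.answer.size}"
```

A client without EDNS support would get the small truncated reply and could retry over TCP, while a client that advertised a larger buffer would get the full answer in one UDP packet.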
Also, as another note, this problem doesn't happen if we explicitly specify our DNS server in our Ruby code instead of using the 192.168.65.5 DNS proxy (which then forwards to our DNS server). That is what made us fairly certain it was some interaction with the Docker for Mac DNS setup. When we specify our DNS server, the response is truncated and the Ruby library re-requests over TCP, similar to the behavior in 3.5.2.
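In case it helps anyone confirm which resolver their container is pointed at, here is a small sketch that lists the nameservers from a resolv.conf-style file. It leans on `Resolv::DNS::Config.parse_resolv_conf`, which as far as I can tell is an internal helper of the resolv library rather than a documented API, so treat it as an assumption. `configured_nameservers` is a made-up helper name, and the address in your own container may differ from the one used in the demo.

```ruby
require 'resolv'
require 'tempfile'

# Returns the nameserver addresses listed in a resolv.conf-style file.
def configured_nameservers(path = '/etc/resolv.conf')
  Resolv::DNS::Config.parse_resolv_conf(path)[:nameserver]
end

# Demo against a temporary file so the sketch runs anywhere; inside a
# container you would call configured_nameservers with no argument.
sample = Tempfile.new('resolv.conf')
sample.write("nameserver 192.168.65.5\n")
sample.flush
servers = configured_nameservers(sample.path)
puts servers.inspect
sample.close!
```

If the list contains only the proxy address, queries made with a plain Resolv::DNS.new go through the proxy, whereas passing :nameserver explicitly bypasses it.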