EFForg / OpenWireless

The official home of the EFF OpenWireless Project

Some DNS names fail to resolve #225

Open jsha opened 10 years ago

jsha commented 10 years ago

Steps to reproduce:

$ nslookup mozilla.org 172.30.42.1
;; Truncated, retrying in TCP mode.
;; communications error to 172.30.42.1#53: end of file

$ nslookup google.com 172.30.42.1
Server:         172.30.42.1
Address:        172.30.42.1#53

Non-authoritative answer:
Name:   google.com
Address: 173.194.68.101
Name:   google.com
Address: 173.194.68.102
Name:   google.com
Address: 173.194.68.138
Name:   google.com
Address: 173.194.68.113
Name:   google.com
Address: 173.194.68.139
Name:   google.com
Address: 173.194.68.100

It's strange that only mozilla.org fails to resolve. If you log in to the router and run the same lookups there, both names resolve successfully. Note that both mozilla.org and google.com return an AAAA (IPv6) record in addition to the A (IPv4) records, so that's not a factor:

root@cerowrt:~# nslookup mozilla.org 172.30.42.1
Server:    172.30.42.1
Address 1: 172.30.42.1

Name:      mozilla.org
Address 1: 2620:101:8008:5::2:1 bedrock-prod.zlb.phx.mozilla.net
Address 2: 63.245.215.20 bedrock-prod-zlb.vips.scl3.mozilla.com

root@cerowrt:~# nslookup google.com 172.30.42.1
Server:    172.30.42.1
Address 1: 172.30.42.1

Name:      google.com
Address 1: 2607:f8b0:400d:c06::8b qh-in-x8b.1e100.net
Address 2: 74.125.22.113 qh-in-f113.1e100.net
Address 3: 74.125.22.139 qh-in-f139.1e100.net
Address 4: 74.125.22.101 qh-in-f101.1e100.net
Address 5: 74.125.22.102 qh-in-f102.1e100.net
Address 6: 74.125.22.138 qh-in-f138.1e100.net
Address 7: 74.125.22.100 qh-in-f100.1e100.net

The next debugging step would probably be to take a packet capture of the nslookup, e.g. with Wireshark. Capturing the same lookup on a non-OpenWireless network as well should show the difference between success and failure.

jsha commented 10 years ago

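[Image: two Wireshark captures side by side, failed response on the left, successful response on the right]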

I took a packet capture and viewed it in Wireshark. On the left is a failed request from the OpenWireless public network (it also reproduces on the private network). On the right is a successful request from the upstream network. The only difference between the two is that the flag 'Truncated: Message is truncated' is set on the failed response, which is odd since both responses are the same length.

Since this is a DNS-layer problem, I suspect an issue in dnsmasq, but I'm not sure yet.
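For anyone who wants to check the truncation flag without opening Wireshark, here is a minimal Python sketch. It sends a plain UDP query and reads the TC bit from the response header. The function names (`build_query`, `is_truncated`, `query_udp`) are made up for illustration, the server address is the one from the transcripts above, and the query carries no EDNS0 option, so it may not match exactly what nslookup sends.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query for `name` (qtype 1 = A, 28 = AAAA)."""
    # Header: ID, flags (only RD set = 0x0100), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS 1 (IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(message):
    """Return True if the TC (truncated) bit is set in a DNS message header."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & 0x0200)  # TC is bit 0x0200 of the flags word


def query_udp(server, name, timeout=3.0):
    """Send one UDP query to `server` on port 53 and return the raw response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return response


# Example use against the router (assumes 172.30.42.1 is reachable):
#   for name in ("mozilla.org", "google.com"):
#       resp = query_udp("172.30.42.1", name)
#       print(name, len(resp), "bytes, truncated:", is_truncated(resp))
```

Running the commented-out loop on both the OpenWireless network and the upstream network should show whether the TC bit is set only when the answer passes through the router.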

RussellSenior commented 10 years ago

This looks like it might be relevant: http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2009q2/002962.html

davidstrauss commented 10 years ago

Would it be possible to switch to Unbound instead of dnsmasq? DNSSEC validation would be a great addition to the platform.