Closed drphr4ud closed 7 years ago
As aeris advised on Twitter, I don't have these issues when installing unbound and using it out of the box (without the script). So obviously this is an issue with the configuration or the installation script.
Also it seems the script is useless, on Debian at least :)
Sp3r4z found the issue: it was use-caps-for-id, which is an experimental feature.
Tested and confirmed that removing use-caps-for-id: yes from unbound.conf resolved the issue!
The problem is not in Unbound, or in the Debian package. use-caps-for-id is perfectly legitimate, since DNS is and has always been case-INsensitive.
No, the problem is that dnsleak.net name servers are deeply broken:
% dig @dns1.dnsleak.net A ipleak.net
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @dns1.dnsleak.net A ipleak.net
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61570
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ipleak.net. IN A
;; ANSWER SECTION:
ipleak.net. 3600 IN AAAA 2a03:b0c0:0:1010::509:d001
ipleak.net. 3600 IN A 95.85.16.212
;; Query time: 25 msec
;; SERVER: 2a03:b0c0:0:1010::509:d001#53(2a03:b0c0:0:1010::509:d001)
;; WHEN: Sat Oct 28 11:58:21 CEST 2017
;; MSG SIZE rcvd: 72
% dig @dns1.dnsleak.net A IPleak.net
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @dns1.dnsleak.net A IPleak.net
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24072
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;IPleak.net. IN A
;; Query time: 26 msec
;; SERVER: 2a03:b0c0:0:1010::509:d001#53(2a03:b0c0:0:1010::509:d001)
;; WHEN: Sat Oct 28 11:58:26 CEST 2017
;; MSG SIZE rcvd: 28
They don't return data when the case changes. That's an awful violation of DNS case-insensitivity. Unbound was right to reject it.
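For anyone who wants to script the comparison instead of running dig by hand, here is a minimal sketch in Python. The helper names (build_query, answer_count, ask) are made up for this example, it only speaks plain UDP over IPv4, and it reads nothing but the answer count from the header, so treat it as a rough probe rather than a real DNS client.

```python
import socket
import struct


def build_query(qname: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Encode a minimal DNS query: RD flag set, one question, class IN.

    The case of qname is preserved exactly as given, which is the point
    of the comparison.
    """
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def answer_count(message: bytes) -> int:
    """Read ANCOUNT (the fourth 16-bit field) from the 12-byte DNS header."""
    return struct.unpack("!HHHHHH", message[:12])[3]


def ask(server: str, qname: str, timeout: float = 3.0) -> int:
    """Send one UDP query to server and return how many answers came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(qname), (server, 53))
        reply, _ = sock.recvfrom(4096)
    return answer_count(reply)
```

Calling ask() with the same server and the two spellings (ipleak.net and IPleak.net) should give the same count on a server that treats names case-insensitively. A differing count is the symptom shown in the dig output above.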
Thanks @bortzmeyer. Should we be using use-caps-for-id then? I understand it's used to foil spoofing attempts.
@Angristan Yes, use-caps-for-id is a (limited) protection against spoofing attempts. It is documented in the draft "Use of Bit 0x20 in DNS Labels to Improve Transaction Identity". You should not disable it just because there are broken servers on the Internet.
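To make the idea concrete, here is a small Python sketch of the 0x20 trick as I understand it from the draft: the resolver randomizes the case of each letter in the query name and only accepts a response that echoes the name back with exactly the same case. The function names are invented for this illustration and this is not how Unbound implements it internally.

```python
import random


def randomize_case(qname: str, rng: random.Random) -> str:
    """Pick upper or lower case at random for every ASCII letter."""
    out = []
    for ch in qname:
        if ch.isascii() and ch.isalpha():
            out.append(ch.upper() if rng.random() < 0.5 else ch.lower())
        else:
            out.append(ch)  # digits, dots and hyphens carry no case bit
    return "".join(out)


def extra_entropy_bits(qname: str) -> int:
    """Each letter contributes one unpredictable bit for a spoofer to guess."""
    return sum(1 for ch in qname if ch.isascii() and ch.isalpha())


def response_matches(sent_qname: str, echoed_qname: str) -> bool:
    """Accept only a response whose question name matches byte for byte."""
    return sent_qname == echoed_qname
```

For ipleak.net that is 9 letters, so an off-path attacker has 9 more bits to guess on top of the query ID and source port. That is why the protection is limited: short names add few bits.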
The problem is there seem to be many broken servers on the internet.
Lots of stuff broke. Not just obscure little fringe cases like ipleak.net.
I just used ipleak.net as an example in the report as it is short and easy to remember.
@drphr4ud removing use-caps-for-id resolved the issues you had with all those domains?
Yes it did.
I see what bortzmeyer said is 100% correct: dig -t A iPLEaK.NeT returns nothing, but it should!
Google had the same problem, it seems, and found that 70% of their DNS traffic gets RFC-compliant responses but 30% does not. They made a whitelist to work around it.
Our current solution to this problem is to create a whitelist of name servers which we know apply the standards correctly, and to only apply the case randomization technique in requests to those servers.
I have trouble believing the problem is so common ("30 %"). At home, I use a resolver with the 0x20 trick (Knot Resolver) and, while ipleak.net indeed does not work, neither I nor either of the two non-geek users noticed anything (and, believe me, they are quick to report problems).
Any other examples of problems in the real world? Which domains?
I also never noticed any issue but ipleak.net.
Like I said, half the apps on my Roku would not work anymore when the Roku used the unbound resolver with use-caps-for-id: yes set.
IIRC Hulu and Vudu had issues resolving their CDN servers. To reiterate: I had massive usability problems and ipleak.net was just mentioned because it's easy to remember.
I do not work for Google so no idea how accurate their numbers are but they say that overall across 8.8.8.8, 8.8.4.4 and their entire public DNS traffic:
Our current solution to this problem is to create a whitelist of name servers which we know apply the standards correctly, and to only apply the case randomization technique in requests to those servers. We also list the appropriate exception subdomains for each of them, based on analyzing our logs. If a response that appears to come from those servers does not contain the correct case, we reject the response.
The whitelisted name servers comprise more than 70% of our traffic.
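Here is a rough Python sketch of the policy that quote describes, just to show the shape of it. It is not Google's code: the server name in CASE_SAFE_SERVERS is a placeholder, the function names are invented, and the per-server exception subdomains they mention are left out.

```python
import random

# Hypothetical allowlist; the real list is not given in this thread.
CASE_SAFE_SERVERS = {"ns1.example.net"}


def outgoing_qname(qname: str, server: str, rng: random.Random) -> str:
    """Randomize case only when the target server is known to echo it back."""
    if server not in CASE_SAFE_SERVERS:
        return qname
    return "".join(rng.choice((ch.lower(), ch.upper())) for ch in qname)


def accept_response(sent_qname: str, echoed_qname: str, server: str) -> bool:
    """Strict case check for allowlisted servers, case-insensitive otherwise."""
    if server in CASE_SAFE_SERVERS:
        return sent_qname == echoed_qname
    return sent_qname.lower() == echoed_qname.lower()
```

The trade-off is that queries to servers outside the list get no extra spoofing protection, but they also never fail because of a case mismatch.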
@drphr4ud You say so but you do not provide even one extra name (besides ipleak.net) of a domain that fails to resolve.
Because I am not at the site where I can break the config again and make it fail and run wireshark to see what DNS queries are made....
The issues were what prompted me to log this issue. ipleak.net came into play for me much later in the process. My first observation was:
I use unbound with use-caps-for-id: yes enabled and various stuff broke. Applications claimed I am not connected to the internet. Sharp TV wouldn't check for firmware updates and would lock up.
Set DHCP Server to let these devices use 8.8.8.8 and 8.8.4.4 instead and they all worked again from that moment. Changed them back to use unbound and they died again.
Somewhere along the way I noticed that one domain that doesn't work is ipleak.net
Installed https://github.com/Angristan/Local-DNS-resolver/blob/master/ubuntu-unbound.sh on Ubuntu 16.04.
Also tried https://github.com/Angristan/Local-DNS-resolver/blob/master/centos-unbound.sh on CentOS 7.
Install succeeded. Service starts ok and is responsive:
As far as I can tell, I can usually resolve unsigned domains:
Most DNSSEC signed domains resolve OK, too:
Stuff that should fail also tends to fail:
However, some lookups fail and I have no idea why.
Does not seem to matter if the domain is signed or not.
I first noticed that I can't visit http://ipleak.net anymore
Then half the apps on my Roku claimed they have no connectivity because lookups failed.
It returns NOERROR but then doesn't provide a response.
Compare with:
First I thought it may just be an Ubuntu thing. But it happens on CentOS, too. Then I thought it may be some root servers refuse queries from some of my hosts (Vultr netblock). But I ended up setting up on a bunch of other hosts on Softlayer, DO, etc. in various regions and the issue persists in all cases.
What's the best way to troubleshoot this?
Some people with similar issues blamed UDP fragmentation as the culprit. I tried edns-buffer-size: 1280 in unbound.conf but it did not help.