Closed rprimus closed 5 years ago
Try refused_code_in_responses = true.
Wed Mar 27 09:31:14 GMT 2019
refused_code_in_responses = true was set before having the problems with openbsd.org.
    : ; egrep -v '^$|^ *#' /usr/local/etc/dnscrypt-proxy.toml
    listen_addresses = ['127.0.0.1:50', '[::1]:50']
    max_clients = 250
    ipv4_servers = true
    ipv6_servers = true
    dnscrypt_servers = true
    doh_servers = true
    require_dnssec = true
    require_nolog = true
    require_nofilter = true
    disabled_server_names = []
    force_tcp = false
    timeout = 2500
    keepalive = 30
    refused_code_in_responses = true
    log_level = 0
    log_file = '/var/log/dnscrypt-proxy.log'
    cert_refresh_delay = 240
    fallback_resolver = '9.9.9.9:53'
    ignore_system_dns = true
    netprobe_timeout = 60
    log_files_max_size = 10
    log_files_max_age = 7
    log_files_max_backups = 1
    block_ipv6 = false
    cache = false
    cache_size = 512
    cache_min_ttl = 600
    cache_max_ttl = 86400
    cache_neg_min_ttl = 60
    cache_neg_max_ttl = 600
    [query_log]
    file = '/var/log/dnscrypt-query.log'
    format = 'tsv'
    [nx_log]
    file = '/var/log/dnscrypt-nx.log'
    format = 'tsv'
    [blacklist]
    [ip_blacklist]
    [whitelist]
    [schedules]
    [sources]
    [sources.'public-resolvers']
    urls = ['https://raw.githubusercontent.com/DNSCrypt/dnscrypt-resolvers/master/v2/public-resolvers.md', 'https://download.dnscrypt.info/resolvers-list/v2/public-resolvers.md']
    cache_file = 'public-resolvers.md'
    minisign_key = 'RWQf6LRCGA9i53mlYecO4IzT51TGPpvWucNSCh1CBM0QTaLn73Y7GFO3'
    refresh_delay = 72
    prefix = ''
    [static]
Duplicate of #774
Fixed in 2.0.22.
Mon Apr 1 08:20:47 BST 2019
Thanks for the fix. Could you answer the following (for future reference)?
org tld?
Queries for just org (no domain at all, just org) never happen with real clients. This kind of query only happens when a recursive resolver is put in front of dnscrypt-proxy.
Which is why I never noticed that issue, and couldn't initially reproduce it.
When asking just for org with DNSSEC enabled, the response is huge. It probably wasn't originally that big, but recently got bigger.
In fact, it is too big to fit in a normal UDP packet without some extra care. So the proxy sends a truncated response, as defined in the DNS protocol. When a truncated response is received, the client (here, dnsmasq) should retry using TCP, which is slower but can carry larger responses without special care.
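To make the truncation mechanism concrete, here is a minimal sketch of how a client can detect a truncated response and prepare a TCP retry. It relies only on the DNS wire format (the TC bit in the header flags, and the two-byte length prefix used over TCP). The function names are hypothetical and this is not how dnsmasq or dnscrypt-proxy are actually written.

```python
import struct

# TC ("truncated") bit inside the 16-bit flags field of the DNS header.
TC_FLAG = 0x0200


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message has the TC bit set."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    # Bytes 0-1 are the message ID, bytes 2-3 are the flags.
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its length, as required when using TCP."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for the 16-bit length prefix")
    return struct.pack("!H", len(message)) + message


def next_transport(udp_response: bytes) -> str:
    """Hypothetical helper: decide what a client should do after a UDP answer."""
    return "tcp" if is_truncated(udp_response) else "done"
```

A client following this logic only gets the full answer if the server really returns the complete response on the TCP retry.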
The problem was the truncated response got cached. So even when you retried using TCP, you received a truncated response, and not the full response.
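The effect of caching a truncated response can be illustrated with a toy cache. This is a sketch under stated assumptions: ResponseCache and its skip_truncated switch are hypothetical names, TTL handling is left out, and the actual fix in 2.0.22 (dnscrypt-proxy is written in Go) may well be implemented differently.

```python
import struct
from typing import Dict, Optional, Tuple

TC_FLAG = 0x0200  # TC ("truncated") bit in the DNS header flags


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message has the TC bit set."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


class ResponseCache:
    """Toy response cache keyed on (name, query type); no TTL handling."""

    def __init__(self, skip_truncated: bool = True) -> None:
        # skip_truncated=False mimics the buggy behaviour described above.
        self.skip_truncated = skip_truncated
        self._entries: Dict[Tuple[str, int], bytes] = {}

    @staticmethod
    def _key(qname: str, qtype: int) -> Tuple[str, int]:
        # DNS names are case-insensitive, so ORG and org share one entry.
        return (qname.lower().rstrip("."), qtype)

    def store(self, qname: str, qtype: int, response: bytes) -> bool:
        """Store a response; return False if it was deliberately skipped."""
        if self.skip_truncated and is_truncated(response):
            return False
        self._entries[self._key(qname, qtype)] = response
        return True

    def lookup(self, qname: str, qtype: int) -> Optional[bytes]:
        """Return the cached response, or None on a cache miss."""
        return self._entries.get(self._key(qname, qtype))
```

With skip_truncated=False, a TCP retry for the same name is answered from the cache with the truncated message. With the default, the retry misses the cache and can be forwarded upstream.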
Because dnsmasq has its own cache, the actual issue is not straightforward to understand. You can restart dnscrypt-proxy, but dnsmasq still has the truncated response in its own cache. In this context, using ping or a web browser to diagnose the behavior, with these two caching layers in between, is not very helpful, as the side effects of that bug appear ill-defined and quite unpredictable.
By far the best way to debug this would have been to enable query logging in dnscrypt-proxy. You would have seen two queries for ORG (just ORG) in a row when the problem starts to happen.
Then, using a DNS client such as dig or drill, send a query for org directly to the proxy, not to dnsmasq.
You would have seen a truncated response. Knowing that would have made the issue obvious.
And if you had sent the same query over and over again, even over TCP, you would have seen that the same truncated response was sent.
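For readers without dig or drill at hand, the same check can be sketched with the Python standard library: send a query for org straight to the proxy over UDP, then over TCP, and compare the TC bit of both answers. The query type (A), the advertised UDP payload size, and all function names are assumptions made for illustration. The host and port should match the proxy's listen_addresses (127.0.0.1 and 50 in the configuration above).

```python
import socket
import struct
from typing import Dict

TC_FLAG = 0x0200  # TC ("truncated") bit in the DNS header flags


def build_query(name: str, qtype: int = 1, msg_id: int = 0x1234,
                dnssec_ok: bool = True, udp_payload_size: int = 4096) -> bytes:
    """Build a DNS query; qtype=1 (A) and the payload size are assumptions."""
    arcount = 1 if dnssec_ok else 0
    # Flags 0x0100: standard query with "recursion desired" set.
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, arcount)
    labels = [part.encode("ascii") for part in name.strip(".").split(".") if part]
    qname = b"".join(bytes([len(part)]) + part for part in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class IN
    opt = b""
    if dnssec_ok:
        # EDNS0 OPT pseudo-record: root name, type 41, class = UDP payload
        # size, TTL field carrying the DO (DNSSEC OK) flag, empty RDATA.
        opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0x00008000, 0)
    return header + question + opt


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message has the TC bit set."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before the full message arrived")
        data += chunk
    return data


def ask_udp(packet: bytes, host: str, port: int, timeout: float = 2.5) -> bytes:
    """Send the query over UDP and return the raw response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (host, port))
        data, _addr = sock.recvfrom(65535)
        return data


def ask_tcp(packet: bytes, host: str, port: int, timeout: float = 2.5) -> bytes:
    """Send the query over TCP (two-byte length prefix) and return the response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(packet)) + packet)
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return recv_exact(sock, length)


def probe(host: str, port: int, name: str = "org") -> Dict[str, bool]:
    """Report whether the UDP and TCP answers for name are truncated."""
    packet = build_query(name)
    return {
        "udp_truncated": is_truncated(ask_udp(packet, host, port)),
        "tcp_truncated": is_truncated(ask_tcp(packet, host, port)),
    }
```

Calling probe("127.0.0.1", 50) against an affected proxy would be expected to report the TCP answer as truncated too, which is the symptom described above. A truncated UDP answer on its own is normal behaviour, and whether it occurs depends on the proxy version and the advertised payload size.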
Mon Apr 1 10:49:33 BST 2019
@jedisct1
Thank you for the EXCELLENT explanation. As shown in the initial issue above (in the expandable sections), I did perform all the tests (hence concluding it was the dnscrypt-proxy cache), just without fully understanding the reasoning.
Cheers!
Tue Mar 26 10:41:57 GMT 2019
Setup:
Tests:
Logs:
dnscrypt-proxy.log:
query.log: [dig www.cpan.org]
query.log: [dig -p 153 www.cpan.org]
query.log: [./dnscrypt-proxy -resolve www.cpan.org]
This problem started happening yesterday. At that time, I was using v2.0.19 and upgraded to v2.0.21.
For the above tests, I disabled all IPv4 and IPv6 blacklists set up on the router (ipset flush ...).
Tue Mar 26 13:12:18 GMT 2019
Before sending this, I rebooted the router and all was well for 20 mins before org stopped resolving. I changed the DNS settings on my laptop (macOS) to use a local dnscrypt setup [unbound:53, dnscrypt-proxy:50]. Same symptoms. Restarting all DNS services made no difference (tested with dnscrypt-proxy -resolve openbsd.org).
openbsd.org
I went through the config and turned caching off (cache = false), restarted all DNS services, and the org tld was once again resolving.
working openbsd.org
Questions:
org tld?
cache = true)?
Thanks.