Closed. rafaeldamasceno closed this issue 9 months ago.
Hey @ainar-g, sorry to be tagging you like this but I see a lot of issues have been triaged since this one was posted. Could you take a look? Thanks!
Hello and sorry for missing this earlier. Does the issue also persist if you use DoT?
@EugeneOne1, please inspect the bootstrap logic.
@rafaeldamasceno, we have a few guesses, but we'd still like to take a look at the verbose log. Could you please collect it and send it to devteam@adguard.com?
Also, do you have any entries in the container's /etc/hosts file?
I've sent the log by email with the issue number in the subject. This scenario keeps occurring for me after a power loss: my ISP router takes a lot longer to boot than the server/routers on which AdGuard Home is running.
For this test, my upstream DNS list was https://dns.google/dns-query https://dns.cloudflare.com/dns-query and my bootstrap DNS servers were 8.8.8.8 and 1.1.1.1. I have not changed anything in the container, including the hosts file. The only things I have set up are volumes for the work and conf directories and open ports for DNS and the web interface.
Here's a timeline of what happens in the logs:
14:27:11 - started the container with no internet connection
14:29:01 - host reacquired internet connection (as evidenced by the stop of the connect: no route to host logs)
14:29:27 - I performed a dig test on the host, with the results shown right below
$ dig @127.0.0.1 amazon.com
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
; <<>> DiG 9.18.20 <<>> @127.0.0.1 amazon.com
; (1 server found)
;; global options: +cmd
;; no servers could be reached
$ dig @8.8.8.8 amazon.com
; <<>> DiG 9.18.20 <<>> @8.8.8.8 amazon.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32461
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;amazon.com. IN A
;; ANSWER SECTION:
amazon.com. 509 IN A 205.251.242.103
amazon.com. 509 IN A 52.94.236.248
amazon.com. 509 IN A 54.239.28.85
;; Query time: 37 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Tue Dec 19 14:29:35 WET 2023
;; MSG SIZE rcvd: 87
This test shows that AdGuard Home isn't responding while the host does have an internet connection. The web interface for AdGuard Home is responsive at all times. Additionally, I haven't yet had the time to test it with DoT.
@rafaeldamasceno, we've received the logs, thank you. So far, I can tell that we should definitely implement some mechanism for updating the resolved upstream addresses, since the current logic indeed only bootstraps the URLs until the first success. If you don't mind, we'd like to confirm the assumption by asking you to reapply the upstream configuration via the web UI (Settings → DNS Settings) instead of restarting AdGuard Home. This should restart the bootstrapping.
However, I'm quite curious about the bootstrap results. Do you have any idea why the bootstrap servers resolve dns.google into some kind of private address (10.0.0.1)? Do they return the same address after an AdGuard Home restart, and is it actually reachable?
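For readers following along, here is a minimal sketch of the failure mode described above, assuming the "bootstrap only until the first success" behavior: the first answer is cached and never revalidated, so an unusable address obtained while offline is reused indefinitely. The type and function names are invented for illustration; this is not AdGuard Home's actual code.

// Hypothetical sketch, not AdGuard Home's implementation: the bootstrap
// result is cached after the first non-error lookup and reused forever,
// so an address such as the Docker-internal 10.0.0.1 seen while offline
// is never corrected without a restart.
package main

import (
    "context"
    "fmt"
    "net"
    "sync"
)

type onceBootstrap struct {
    mu     sync.Mutex
    cached []net.IPAddr // filled on the first non-error lookup, then reused forever
    lookup func(ctx context.Context, host string) ([]net.IPAddr, error)
}

func (b *onceBootstrap) resolve(ctx context.Context, host string) ([]net.IPAddr, error) {
    b.mu.Lock()
    defer b.mu.Unlock()

    if b.cached != nil {
        // No revalidation: a wrong first answer sticks until a restart or
        // until the upstream configuration is reapplied.
        return b.cached, nil
    }

    addrs, err := b.lookup(ctx, host)
    if err != nil {
        return nil, err
    }
    b.cached = addrs
    return addrs, nil
}

func main() {
    b := &onceBootstrap{lookup: net.DefaultResolver.LookupIPAddr}
    addrs, err := b.resolve(context.Background(), "dns.google")
    fmt.Println(addrs, err)
}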
Reapplying the upstream configuration does indeed work once the container's connection is reestablished. Testing this now made me realize the container doesn't get its connection back as immediately as the host does, as I had assumed.
As for why Docker is resolving these names to internal network addresses, I have absolutely no idea... What I can tell you is that it doesn't happen if the container has an internet connection at the first bootstrap:
2023/12/19 17:31:17.231061 1#14 [debug] parallel lookup: lookup for dns.google succeeded in 9.742255ms: [8.8.8.8 8.8.4.4 2001:4860:4860::8888 2001:4860:4860::8844]
2023/12/19 17:31:17.810968 1#74 [debug] parallel lookup: lookup for dns.cloudflare.com succeeded in 10.61866ms: [2606:4700::6810:84e5 2606:4700::6810:85e5 104.16.133.229 104.16.132.229]
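For context, the "parallel lookup" entries above suggest the bootstrap queries all bootstrap servers concurrently and uses the first successful answer. Below is a rough sketch of that pattern, assuming plain DNS bootstrap servers on port 53; the helpers resolverFor and parallelLookup are illustrative names, not AdGuard Home's API.

// Rough sketch of a parallel bootstrap lookup: race both bootstrap
// servers and return the first successful answer.
package main

import (
    "context"
    "fmt"
    "net"
    "time"
)

// resolverFor returns a resolver that always dials the given bootstrap server.
func resolverFor(addr string) *net.Resolver {
    return &net.Resolver{
        PreferGo: true,
        Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
            d := net.Dialer{Timeout: 5 * time.Second}
            return d.DialContext(ctx, network, addr)
        },
    }
}

func parallelLookup(ctx context.Context, host string, servers ...string) ([]net.IPAddr, error) {
    type result struct {
        addrs []net.IPAddr
        err   error
    }
    ch := make(chan result, len(servers))

    for _, s := range servers {
        go func(r *net.Resolver) {
            addrs, err := r.LookupIPAddr(ctx, host)
            ch <- result{addrs, err}
        }(resolverFor(s))
    }

    var lastErr error
    for range servers {
        res := <-ch
        if res.err == nil {
            return res.addrs, nil // first success wins
        }
        lastErr = res.err
    }
    return nil, lastErr
}

func main() {
    start := time.Now()
    addrs, err := parallelLookup(context.Background(), "dns.google", "8.8.8.8:53", "1.1.1.1:53")
    fmt.Println(addrs, err, time.Since(start))
}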
I've also tested DoT and the exact same behavior occurs (it resolves to the Docker internal network address, then doesn't resolve anything until it's restarted or the bootstrap is triggered again).
One additional thing I just tested was whether or not having fallback DNS servers would help with this issue: it doesn't. I put the same regular DNS servers in both the fallback and bootstrap lists, and it still tries to resolve with the failing DoH/DoT servers.
My suggestion would be that instead of an update mechanism (which by all means sounds good and would also help with network disconnections), perhaps checking whether the DoH/DoT servers are actually able to resolve anything would be more important, both for the bootstrap and for deciding when to use the fallback DNS.
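A sketch of the kind of health check suggested here: probe the configured upstream with a cheap test query and route traffic to the fallback while the probe fails. The resolver interface, the probe name example.com, and healthyOrFallback are illustrative assumptions; AdGuard Home's real upstream types and fallback logic are not shown.

// Sketch of an upstream health check with fallback, not a real API.
package main

import (
    "context"
    "errors"
    "fmt"
    "net"
    "time"
)

// resolver is a minimal stand-in for an upstream (DoH, DoT, or plain DNS).
type resolver interface {
    LookupIPAddr(ctx context.Context, host string) ([]net.IPAddr, error)
}

// healthyOrFallback probes the primary upstream with a test query and
// returns the fallback if the probe fails. The probe name is arbitrary.
func healthyOrFallback(ctx context.Context, primary, fallback resolver) resolver {
    probeCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
    defer cancel()

    if _, err := primary.LookupIPAddr(probeCtx, "example.com"); err != nil {
        // Primary can't answer right now; route queries to the fallback
        // until a later probe succeeds.
        return fallback
    }
    return primary
}

// broken simulates an unreachable DoH/DoT upstream.
type broken struct{}

func (broken) LookupIPAddr(context.Context, string) ([]net.IPAddr, error) {
    return nil, errors.New("upstream unreachable")
}

func main() {
    r := healthyOrFallback(context.Background(), broken{}, net.DefaultResolver)
    addrs, err := r.LookupIPAddr(context.Background(), "amazon.com")
    fmt.Println(addrs, err)
}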
I have a similar, but not entirely identical, problem. I wonder if AdGuard Home could re-resolve the domain name of an upstream such as https://dns.google:443/dns-query when the TTL for dns.google is about to expire; that would resolve my issue.
@rafaeldamasceno, hello again. We've finally implemented caching of the bootstrap results, so that AdGuard Home now respects the TTLs of received upstream addresses. Could you please try the latest edge build and let us know if the situation improves there?
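For reference, here is a simplified sketch of TTL-based caching of resolved upstream addresses along the lines described above: cached addresses expire with their TTL and are re-resolved on the next use, so a stale or unusable address is replaced once connectivity returns. This is illustrative code only (the ttlBootstrap type and the fixed 30-second TTL in the stand-in lookup are assumptions), not the actual implementation.

// Simplified TTL-respecting cache of bootstrap results.
package main

import (
    "context"
    "fmt"
    "net"
    "sync"
    "time"
)

type cachedAddrs struct {
    addrs     []net.IPAddr
    expiresAt time.Time
}

type ttlBootstrap struct {
    mu     sync.Mutex
    cache  map[string]cachedAddrs
    lookup func(ctx context.Context, host string) ([]net.IPAddr, time.Duration, error)
}

func (b *ttlBootstrap) resolve(ctx context.Context, host string) ([]net.IPAddr, error) {
    b.mu.Lock()
    defer b.mu.Unlock()

    if c, ok := b.cache[host]; ok && time.Now().Before(c.expiresAt) {
        return c.addrs, nil // still fresh
    }

    // Expired or never resolved: ask the bootstrap servers again, so a
    // stale or unusable address is replaced once connectivity returns.
    addrs, ttl, err := b.lookup(ctx, host)
    if err != nil {
        return nil, err
    }
    b.cache[host] = cachedAddrs{addrs: addrs, expiresAt: time.Now().Add(ttl)}
    return addrs, nil
}

func main() {
    b := &ttlBootstrap{
        cache: map[string]cachedAddrs{},
        lookup: func(ctx context.Context, host string) ([]net.IPAddr, time.Duration, error) {
            // Stand-in lookup: real code would take the TTL from the DNS
            // response; net.Resolver does not expose it.
            addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
            return addrs, 30 * time.Second, err
        },
    }
    addrs, err := b.resolve(context.Background(), "dns.google")
    fmt.Println(addrs, err)
}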
Hello @EugeneOne1, I've tested it for DoH and it seems to be working fine with the previous scenario :) As soon as internet connectivity is restored, it starts resolving domains again as well. Thanks for all the support.
@rafaeldamasceno, great to hear that. We'll close this for now then.
Prerequisites
[X] I have checked the Wiki and Discussions and found no answer
[X] I have searched other issues and found no duplicates
[X] I want to report a bug and not ask a question or ask for help
[X] I have set up AdGuard Home correctly and configured clients to use it. (Use the Discussions for help with installing and configuring clients.)
Platform (OS and CPU architecture)
Linux, AMD64 (aka x86_64)
Installation
Docker
Setup
On one machine
AdGuard Home version
v0.107.39
Action
After starting the container with no internet connection and acquiring one later on, DNS queries yield no results when the upstream servers are DoH. No way of querying worked: I tried both in the browser and with dig locally. The AGH web interface worked just fine.
Expected result
After reacquiring a connection, the DoH upstream servers are resolved and DNS queries are answered correctly.
Actual result
The DoH upstream servers are still not resolved after the internet connection is reacquired, and DNS queries return no results.
Additional information and/or screenshots
Restarting the container makes AGH work again. I have a simple compose file with the web interface and DNS ports open. The upstream servers are Cloudflare's and Google's DoH servers, and the bootstrap servers are their primary plain DNS servers. If I set the upstream servers to those plain DNS servers instead, everything works as soon as the connection is back. There should be a mechanism to check for this and try to resolve the DoH servers again when possible.
This is the log after one of the restarts. I don't have debug mode on, but I can enable it if really needed. I trimmed around 10k entries that were more of the same.