AdguardTeam / AdGuardHome

Network-wide ads & trackers blocking DNS server
https://adguard.com/adguard-home/overview.html
GNU General Public License v3.0

Local DNS zones and cached responses aren't served after the network lost #4825

Open EugeneOne1 opened 2 years ago

EugeneOne1 commented 2 years ago

Prerequisites

Operating system type

Linux, Other (please mention the version in the description)

CPU architecture

64-bit ARM

Installation

Docker

Setup

On one machine

AdGuard Home version

v0.107.9

Description

This is a continuation of the thread started in #2657. The problem first occurred in v0.104.3 and has already been fixed a couple of times, but it is still being reported. We can't reproduce the issue on our machines. If you've faced it, please consider providing the following information:

The last two pieces of information (optionally anonymized) could be sent to devteam@adguard.com with this issue's number in the subject.

EugeneOne1 commented 2 years ago

Please, take a look at this, @handcoding, @conradseba, @abdalians, @s1lviu, @dinosoup1. I've mentioned you since you've reported the issue in #2657. Could you please also help us with the investigation? Thanks.

conradseba commented 2 years ago

Same issue here since forever. My setup is: Version: v0.108.0-b.11, installed on pfSense 22.05, FreeBSD 12.3 (arm64) as a package. I'm using DoH, my firewall encapsulates all traffic through OpenVPN, no encryption facing internal networks is enabled, no DHCP on AdGuard, and no IPv6.

I really hope this is solved soon, since I'm suffering from this many times a day, every day (my Vodafone provider is the worst I've ever had).

Thank you!!

abdalians commented 2 years ago

@EugeneOne1 we just need the debug logs, right?

abdalians commented 2 years ago

> Same issue here since forever. My setup is: Version: v0.108.0-b.11, installed on pfSense 22.05, FreeBSD 12.3 (arm64) as a package. I'm using DoH, my firewall encapsulates all traffic through OpenVPN, no encryption facing internal networks is enabled, no DHCP on AdGuard, and no IPv6.
>
> I really hope this is solved soon, since I'm suffering from this many times a day, every day (my Vodafone provider is the worst I've ever had).
>
> Thank you!!

@conradseba if your WAN drop frequency is that bad, could you please capture the logs as requested in the other ticket? That would save me from taking down the network for log capture. :)

EugeneOne1 commented 2 years ago

@abdalians, that's right, we call it "verbose".

abdalians commented 2 years ago

Apologies for the delay on this. I am finally in this broken state again and am trying to collect as much information as I can. I will post shortly.

abdalians commented 2 years ago

adguard_logs_02Sep2022.tar.gz

To reiterate the point, this only happens when my primary internet (cable) fails over to my secondary internet (DSL).

Please see the investigation file attached.

Until the primary internet connection is restored, enabling the AdGuard parental control web service / AdGuard browsing security web service makes AdGuard work again.

adguard_investigation.txt

handcoding commented 2 years ago

> Please, take a look at this, @handcoding, @conradseba, @abdalians, @s1lviu, @dinosoup1. I've mentioned you since you've reported the issue in #2657. Could you please also help us with the investigation? Thanks.

@EugeneOne1 I haven’t personally run into this issue since the fix for #4317 landed on the main trunk. (But that’s just me.)

kevindd992002 commented 2 years ago

Aha! I have the same issue and I posted about it just now:

https://github.com/AdguardTeam/AdGuardHome/discussions/4969

What is the progress for this? My unifi network uses the FQDN of my unifi controller. When my Internet connection drops (it just did two days ago and it was out for 45 freaking hours!), I lose control over my local network because of AGH!

abdalians commented 2 years ago

@EugeneOne1 do you need more information? The ticket still says it needs investigation and needs to be reproduced reliably. I can reproduce this every single time without fail. Also, the milestone was set to 107.16, which is out now. Does that mean we have a potential fix?

abdalians commented 2 years ago

Version: v0.107.16 still impacted by this.

ve6rah commented 1 year ago

Version: v0.107.17 still impacted by this.

nonoMain commented 1 year ago

Any updates on the matter? I stopped using it for now..

EugeneOne1 commented 1 year ago

@abdalians, hello again and apologies for the late response. It actually seems AdGuard Home still serves local DNS zones, resolving the requests with appropriate local data; at least I can see some answered plain PTR requests for local addresses. All the other requests are indeed being dropped due to the Safe Browsing services failure, which even prevents them from being answered from cache. We have a feature request (#2857) about improving the implementation of the Safe Browsing / Parental Control services, but for now it terminates the request processing on failure.

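Could you please check a few special cases:

  • Add a $dnsrewrite entry with some improbable domain name to your custom filtering rules, something like:

    ||not-a-real.domain^$dnsrewrite=NOERROR;A;1.2.3.4

    Then, after the network is lost, try to request it. It should be resolved properly regardless of the Safe Browsing services state;

  • Try to request some domain from the /etc/hosts file; they should be resolved as well.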

AFAIK, AdGuard Home isn't responsible for any other local data in your setup (DHCP seems to be disabled, and the only local resolver is loopback, so RDNS also has no additional info), so if the above is answered, the problem is Safe Browsing services reachability.
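
For anyone who wants to script the `||not-a-real.domain^$dnsrewrite=NOERROR;A;1.2.3.4` check rather than run it by hand, here is a minimal sketch in Python using only the standard library. It builds a plain UDP query for an A record and reads the addresses from the answer. The function names and the server address used in the note below are placeholders chosen for illustration, not part of AdGuard Home, and the parser only handles the simple responses this test is expected to produce.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def _skip_name(msg: bytes, pos: int) -> int:
    """Return the offset just past a (possibly compressed) name."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer, always 2 bytes
            return pos + 2
        pos += 1 + length


def parse_a_records(msg: bytes) -> list[str]:
    """Extract IPv4 addresses from the answer section of a response."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        pos = _skip_name(msg, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        if rtype == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(msg[pos:pos + 4]))
        pos += rdlength
    return addresses


def resolve_a(name: str, server: str, port: int = 53, timeout: float = 2.0) -> list[str]:
    """Send one UDP query to `server`; raises socket.timeout if nothing answers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        response, _ = sock.recvfrom(4096)
    return parse_a_records(response)
```

Calling `resolve_a("not-a-real.domain", "192.168.1.2")`, with `192.168.1.2` standing in for the AdGuard Home address, once while the WAN is up and once after it drops, shows whether the rewrite is still served. A timeout in the second run would match the behaviour reported in this thread.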

ve6rah commented 1 year ago

> the problem is Safe Browsing services reachability.

I think I have to refute that, I don't use "safe browsing" on my setup, and yet, after my internet connection went down, I lost the ability to resolve local hosts. I'm talking specifically about hosts in the DNS rewrites section of my config.

I was quite surprised that running my own DNS I would lose the ability to resolve hosts on my own internal network!

EugeneOne1 commented 1 year ago

@ve6rah, that is weird if the local network is ok. Are you able to reproduce it? If yes, could you please also capture a verbose log for us? This would be really helpful since we still can't reproduce it on our machines.

namob commented 1 year ago

I noticed the same thing, and the issue seems to depend on whether "Use AdGuard browsing security web service" is enabled. I recreated this by blocking the internet for one of my AdGuard VMs. With "Use AdGuard browsing security web service" enabled, local lookups are not performed; when I disabled it, everything worked without a problem.

Attached is the verbose log file when "Use AdGuard browsing security web service" is enabled. adgh-browsing_security_enabled.log

abdalians commented 1 year ago

> @abdalians, hello again and apologies for the late response. It actually seems AdGuard Home still serves local DNS zones, resolving the requests with appropriate local data; at least I can see some answered plain PTR requests for local addresses. All the other requests are indeed being dropped due to the Safe Browsing services failure, which even prevents them from being answered from cache. We have a feature request (#2857) about improving the implementation of the Safe Browsing / Parental Control services, but for now it terminates the request processing on failure.
>
> Could you please check a few special cases:
>
>   • Add a $dnsrewrite entry with some improbable domain name to your custom filtering rules, something like:
>
>     ||not-a-real.domain^$dnsrewrite=NOERROR;A;1.2.3.4
>
>     Then, after the network is lost, try to request it. It should be resolved properly regardless of the Safe Browsing services state;
>
>   • Try to request some domain from the /etc/hosts file; they should be resolved as well.
>
> AFAIK, AdGuard Home isn't responsible for any other local data in your setup (DHCP seems to be disabled, and the only local resolver is loopback, so RDNS also has no additional info), so if the above is answered, the problem is Safe Browsing services reachability.

@EugeneOne1: I have my own local domain.com being served by BIND inside the local network, and since AdGuard Home is the primary resolver for all DNS clients in the network, I had a rule to send domain.com to the BIND DNS server.

[/domain.com/]192.168.10.5 (https://github.com/AdguardTeam/AdGuardHome/wiki/Configuration#upstreams-for-domains);

When the internet drops (fails over to the secondary internet connection), AdGuard simply stops responding to any DNS queries. Even the local BIND name resolution ceases to function.

I do have a workaround implemented for this now:

  • BIND: listening on 127.0.0.1;

  • AdGuard: listening on the LAN IP (192.168.10.5 in my case);

  • for ALL DNS requests, I point AdGuard to 127.0.0.1 as upstream.

BIND, in turn, forwards to my chosen upstream DNS providers.

** The caveat in my setup is that I have dual WAN, so while my internet is actually not down, just failed over to my secondary connection, AdGuard Home refuses to resolve anything, including the local domains.

sammyke007 commented 1 year ago

Still an issue... New Adguard Home user and as soon as WAN goes down, none of the DNS rewrites work anymore.

Nslookup shows the rewrite is working, as long as WAN is up.

fuomag9 commented 11 months ago

Still happening to me as well

fuomag9 commented 11 months ago

> I noticed the same thing, and the issue seems to depend on whether "Use AdGuard browsing security web service" is enabled. I recreated this by blocking the internet for one of my AdGuard VMs. With "Use AdGuard browsing security web service" enabled, local lookups are not performed; when I disabled it, everything worked without a problem.
>
> Attached is the verbose log file when "Use AdGuard browsing security web service" is enabled. adgh-browsing_security_enabled.log

In my case they were all disabled

EugeneOne1 commented 11 months ago

@abdalians, @sammyke007, @fuomag9, @james-1987, could you please capture the verbose log for us? Unfortunately, we still can't reproduce it. It would also be helpful to look at the exact moment the network went down, if that can be done manually. Note that the safe browsing and parental control features should be disabled, as they actually break resolution under these circumstances.

The logs could be sent to devteam@adguard.com.

sammyke007 commented 11 months ago

For me it was fixed by using Unbound as upstream DNS for my internal network:

Upstream DNS settings:

    https://dns10.quad9.net/dns-query
    [/in-addr.arpa/]192.168.1.1:5553
    [/ip6.arpa/]192.168.1.1:5553
    [/localdom/]192.168.1.1:5553

and Private reverse DNS servers:

    192.168.1.1:5553
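
To make the effect of the `[/domain/]server` lines above easier to see, here is a rough Python model of per-domain upstream selection. It is only a sketch under simplifying assumptions (longest matching suffix wins, everything else goes to the default upstream) and is not AdGuard Home's actual matching code. `parse_upstreams` and `pick_upstreams` are names invented for this example.

```python
def parse_upstreams(lines):
    """Split upstream config lines into defaults and per-domain overrides."""
    default, per_domain = [], {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[/"):
            # "[/in-addr.arpa/]192.168.1.1:5553" -> ("in-addr.arpa", server)
            domains, _, server = line[2:].partition("/]")
            for domain in domains.split("/"):
                domain = domain.strip(".").lower()
                if domain:
                    per_domain.setdefault(domain, []).append(server)
        else:
            default.append(line)
    return default, per_domain


def pick_upstreams(name, default, per_domain):
    """Return the servers for the most specific matching suffix, else defaults."""
    labels = name.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in per_domain:
            return per_domain[suffix]
    return default


config = [
    "https://dns10.quad9.net/dns-query",
    "[/in-addr.arpa/]192.168.1.1:5553",
    "[/ip6.arpa/]192.168.1.1:5553",
    "[/localdom/]192.168.1.1:5553",
]
default, per_domain = parse_upstreams(config)
print(pick_upstreams("nas.localdom", default, per_domain))
print(pick_upstreams("example.com", default, per_domain))
```

Under this model, names in `localdom` and reverse lookups go to the LAN resolver at `192.168.1.1:5553`, which does not depend on the WAN link, while everything else still goes to the external DoH upstream.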

themanbornwithin commented 11 months ago

My home internet is currently down. Wasn't able to access my network via local DNS. If I disabled AGH protection, local DNS works. My solution was to add @@||mydomain.tld^ to the custom filtering rules. Immediately started resolving again.
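
As a rough illustration of which names a rule like `@@||mydomain.tld^` is meant to cover, the Python sketch below matches a hostname against `||domain^` style rules: the domain itself and any of its subdomains. It is a simplified model written for this note, not AdGuard Home's filtering engine, and it ignores rule modifiers entirely. Later comments in this thread report that the workaround does not help in every setup.

```python
def matches_domain_rule(rule: str, hostname: str) -> bool:
    """Match a hostname against a `||domain^` or `@@||domain^` rule."""
    body = rule[2:] if rule.startswith("@@") else rule
    if not (body.startswith("||") and body.endswith("^")):
        return False  # any other rule shape is ignored by this sketch
    domain = body[2:-1].lower()
    host = hostname.rstrip(".").lower()
    # The rule covers the domain itself and every subdomain of it.
    return host == domain or host.endswith("." + domain)


def is_allowlisted(rules, hostname: str) -> bool:
    """True if any exception rule (prefixed with @@) covers the hostname."""
    return any(
        rule.startswith("@@") and matches_domain_rule(rule, hostname)
        for rule in rules
    )


rules = ["@@||mydomain.tld^"]
print(is_allowlisted(rules, "nas.mydomain.tld"))
print(is_allowlisted(rules, "notmydomain.tld"))
```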

Palleri commented 1 month ago

Still a problem Version: v0.107.52

fuomag9 commented 1 month ago

Can confirm as well, even the suggested fixes do not work for me

blakeusblade commented 3 weeks ago

> Still a problem Version: v0.107.52

OS Type: GLi-Net 4.6.8 / LuCI openwrt-21.02
Hardware: GL-MT6000 Flint2
CPU: ARM
AdGuard Home Version: v0.107.52

Can confirm as well... The issue arose after upgrading to v0.107.52.

Turning off AdGuard restores local LAN name resolution, and turning it back on again breaks it.

GentleHoneyLover commented 1 day ago

This workaround doesn't work for me on v0.107.54 (running in Docker on x86 machine):

> My home internet is currently down. Wasn't able to access my network via local DNS. If I disabled AGH protection, local DNS works. My solution was to add @@||mydomain.tld^ to the custom filtering rules. Immediately started resolving again.

RedFoxy commented 1 day ago

This is my configuration, and it keeps working when the connection goes offline. I have a 192.168.0.x LAN, and my router (192.168.0.1) provides static DNS for the LAN devices.

https://github.com/RedFoxy/HA-MyConf/blob/main/AdGuardHome/AdGuardHome.yaml

I hope that can help you

kevindd992002 commented 1 day ago

What?

RedFoxy commented 22 hours ago

With my configuration I can resolve the local names provided by the gateway as well as the names in AdGuard's DNS cache. Also, when the internet becomes available again, AdGuard comes back to work completely without any problems.

kevindd992002 commented 20 hours ago

Right. So do you know which specific setting in your config is fixing this?

RedFoxy commented 6 hours ago

In my configuration I simply planned for a separate DNS server for the LAN. In my case it is a service provided by my gateway at 192.168.0.1, which acts as a cache and as the DNS server for the local static names. So in “Settings -> DNS settings”, under “Upstream DNS servers”, I added the rules that say when the gateway DNS server should be used:

    [/.local/]192.168.0.1
    [/.mydomain.com/]192.168.0.1

Basically, for all DNS requests that end in .local or .mydomain.com, the DNS server at 192.168.0.1 is queried instead of AdGuard answering itself.

On the same page I activated the item “Use private reverse DNS resolvers”.

After that under “Filters -> Custom filtering rules,” just in case, I added the local domains not to be blocked:

    @@||local^
    @@||eth.local^
    @@||wifi.local^
    @@||mydomain.com^

Doing so solved the problem of internet drops and AdGuard Home not responding once the internet came back

ve6rah commented 4 hours ago

While I suppose that is a workaround, it also doesn't make any sense. Adguard home is supposed to function as a caching DNS server. What you've done is add another DNS server to your network. What this issue is about is the fact that you shouldn't need another DNS server because adguard home should fill that role.

kevindd992002 commented 4 hours ago

But his configuration is expected if your upstream device is a firewall router like pfsense/opnsense. These have unbound in them and you point AGH to that as it is also the DHCP server of the network. This is a supported config.

@RedFoxy , I have the same config as you do, at least for the DNS servers part. But why do you still have .local there? I only have my local domain listed there.

Also, why the need to put them in the whitelist? This part is what I don't have.

ve6rah commented 4 hours ago

I have to strongly disagree with this. If you are running a DNS server on your router, then you should be doing ad filtering at that level as well. Adding extra DNS servers along the way just slows down DNS lookups and adds extra points of failure and extra complication.

kevindd992002 commented 4 hours ago

I get your point, but that latency is negligible for a home network. To be fair, I have AGH installed on my OPNsense router itself, pointed at itself (localhost), and I get an average processing time of 6 ms.

ve6rah commented 4 hours ago

But you are still adding ridiculous, unnecessary extra complication. Just point the AdGuard instance at the real upstream DNS server instead of adding another one in the middle. If you're trying to pretend that this is a workaround for this bug, you might as well say just don't use AdGuard, because the whole point of this bug is that AdGuard doesn't work if it can't access an upstream server. When you add an extra upstream server within your own house, all you've done is move the bug one more layer. You'll still have the exact same issue if that server within your house goes down.

RedFoxy commented 2 hours ago

> @RedFoxy , I have the same config as you do, at least for the DNS servers part. But why do you still have .local there? I only have my local domain listed there.
>
> Also, why the need to put them in the whitelist? This part is what I don't have.

Maybe I confused you for a moment. DHCP is handled by my gateway (a MikroTik). Among the other services it provides, it also runs a DNS server for all my local .local names, like pve.local or frigate.pve.local, and it overrides the names of my external domain, mydomain.com. That way, if I go to frigate.mydomain.com on my cell phone while connected to WiFi at home, it resolves to 192.168.0.10; if I am away from home, it resolves to my external IP.

In the local network I do NOT directly use any DNS server other than AdGuard Home, while ADG uses my gateway as upstream. So when I want frigate.pve.local resolved, I ask ADG, which in turn asks the gateway.

Why do I do this?

I realized that when ADG cannot reach the external DNS, it crashes and does not always come back to work when the external DNS becomes available again. If I provide it with an always-working DNS, such as my gateway's DNS, it never crashes and always resolves my local names and the ones it has cached.

RedFoxy commented 2 hours ago

> But you are still adding ridiculous, unnecessary extra complication. Just point the AdGuard instance at the real upstream DNS server instead of adding another one in the middle. If you're trying to pretend that this is a workaround for this bug, you might as well say just don't use AdGuard, because the whole point of this bug is that AdGuard doesn't work if it can't access an upstream server. When you add an extra upstream server within your own house, all you've done is move the bug one more layer. You'll still have the exact same issue if that server within your house goes down.

I completely understand what you mean, but unfortunately I have an unstable line, and the internet drops easily every time it rains, even if only for a few seconds. ADG would always crash, forcing me to restart its service in order to use the network again, so I preferred to use this setup. I don't notice any lag in resolution, and since I started using it I haven't had network problems anymore, whereas before I was very tempted to uninstall ADG.

ve6rah commented 1 hour ago

But then why are you commenting on this bug, if your whole point is to just not use adguard because your line is unstable? This bug is an attempt to get the fact that adguard goes down when your line does fixed! Telling us to just skip adguard to solve the problem does not add anything to the conversation about fixing the bug in adguard in the first place.

RedFoxy commented 38 minutes ago

I use AdGuard ALL THE TIME! Why do you say that I don't use ADG? The trouble is that when the internet goes offline, ADG stops working too! But with that workaround you can continue to use ADG when you are offline or when you switch from a landline to a mobile hotspot, etc.

ve6rah commented 36 minutes ago

No, you specifically stated that you do not use adguard, you only use adguard as a relay to your other DNS server. This bug is about those of us who are trying to use adguard as a DNS server.

RedFoxy commented 33 minutes ago

Where did I say this? I said I use AdGuard as my only DNS. The upstream DNS that AdGuard uses when it doesn't know what to resolve is my gateway, where Google and Cloudflare DNS are also set, but all my queries go through ADG first!

Devices -> DNS request -> ADG -> Gateway -> Other external DNS

ve6rah commented 32 minutes ago

Exactly, you are not using adguard as a DNS, you are using adguard simply as a relay to your real DNS server that is on your router. Some of us are trying to use adguard as a DNS server, not simply as a relay. And that's who this bug is for. People who want to use adguard as a DNS server.

RedFoxy commented 13 minutes ago

Excuse me, my mistake, but doesn't ADG need an external DNS to resolve names?

ve6rah commented 10 minutes ago

The key word here is external. And external depends on the internet. We are looking to allow adguard to serve local DNS without an internet connection. That's the whole bug. Adguard currently refuses to serve local DNS when it cannot access an external DNS server. That's wrong. It should be able to do so.

This whole thing doesn't apply to your situation because you don't have adguard pointing to an external DNS. And you don't have it trying to serve local DNS at all, you have your other DNS server serving local DNS.

Adguard is a DNS server, and as such should not require an upstream resolver to resolve addresses it already knows, in this case local ones.
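
The behaviour being asked for here can be summed up in a few lines of Python. This is purely an illustrative model of the expected lookup order (local data first, then cache, then upstream). It is not a description of how AdGuard Home is implemented, it ignores TTLs, and `LocalFirstResolver` and `UpstreamUnavailable` are names made up for this sketch.

```python
class UpstreamUnavailable(Exception):
    """Raised by the upstream callable when no external resolver is reachable."""


class LocalFirstResolver:
    def __init__(self, rewrites, upstream):
        self.rewrites = dict(rewrites)  # name -> address, e.g. DNS rewrites
        self.cache = {}                 # name -> address learned from upstream
        self.upstream = upstream        # callable(name) -> address

    def resolve(self, name):
        name = name.rstrip(".").lower()
        if name in self.rewrites:       # 1. local data never needs the WAN
            return self.rewrites[name]
        if name in self.cache:          # 2. cached answers (TTL ignored here)
            return self.cache[name]
        address = self.upstream(name)   # 3. may raise UpstreamUnavailable
        self.cache[name] = address
        return address
```

With this ordering, an unreachable upstream only affects names that are neither local nor cached, which is the outcome several commenters in this issue expect.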

RedFoxy commented 4 minutes ago

I used to have local dns on ADG but the fact that it doesn't work when the internet goes down was blocking me too much.

I'm sorry to have bothered you