AdguardTeam / AdGuardHome

Network-wide ads & trackers blocking DNS server
https://adguard.com/adguard-home/overview.html
GNU General Public License v3.0

No local DNS resolution & Froze UI when no internet connection #6920

Open TheFou opened 7 months ago

TheFou commented 7 months ago

Prerequisites

Platform (OS and CPU architecture)

Darwin (aka macOS), ARM64

Installation

Docker

Setup

Other (please mention in the description)

AdGuard Home version

0.107.46

Action

Hi,

I had an internet connection failure today, probably some maintenance at my ISP. While this happened, AGH behaved really erratically.

Let me explain: my home mimics an enterprise setup, with an Active Directory and DNS integrated zones. AGH is set up as the main DNS on all clients, and is configured to send all queries related to my local DNS to the domain controllers while handling all internet DNS itself. It has always worked like a charm, for many years. However, while disconnected today, it didn't handle anything anymore, not even the local domains. I kept getting "...i/o timeout" on every line in the log.

At first, it was behaving so erratically that I thought it was the culprit for my connection issues... I tried changing the upstream DNS and adding a failover, and each time I tried to apply, the UI froze completely without updating the settings.

Expected result

At least to forward local domains / PTR to the AD controllers

Actual result

See above. Everything turned back to normal as soon as my connection came back up.

Additional information and/or screenshots

AGH v0.107.46 installed as a Docker container on an Ubuntu 22.04.4 LTS pfSense router. Active Directory with DNS integrated zones (no forwarding), and DHCP handled by the Domain Controllers.

matth0727 commented 7 months ago

I can also confirm this is still an issue... All of my DNS Rewrites in AGH did not resolve when the Internet was out today.

I've reviewed a lot of these posts and it seems like the author has a tough time reproducing the issue... The easiest way I've found is to set the Upstream DNS Servers in AGH to something like 1.2.3.4 (an address that does not answer DNS queries); you can then see that DNS Rewrites and internal lookups start to fail.

It seems like the only lookups that work (somewhat) are the ones that are cached. This is easy to test: add a new DNS Rewrite while the "Internet is out", and the lookup for it against AGH still fails.
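For anyone trying the reproduction above, here is a minimal Python sketch (standard library only) that sends a single A query over UDP and reports whether anything comes back within a timeout. It is only an illustration of the test, not a tool from this thread; `build_query` and `probe` are made-up names, and the packet layout follows the standard DNS message format.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet: one question, recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server: str, name: str, timeout: float = 3.0, port: int = 53) -> str:
    """Send one query over UDP and describe what came back, if anything."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(reply) < 12:
        return "malformed reply"
    # Reply header starts with ID, flags, QDCOUNT, ANCOUNT
    _, flags, _, ancount = struct.unpack("!HHHH", reply[:8])
    return f"rcode={flags & 0x000F} answers={ancount}"
```

Calling `probe("192.168.1.2", "nas.lan")` (both values are placeholders for your AGH address and a rewritten name) before and after pointing the upstream at an unreachable address should make the difference visible: a healthy server returns an rcode and an answer count, while the failure described here would show up as `timeout`.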

While I know this doesn't resolve your issue @TheFou I hope it helps the author to dig a bit further and find a resolution to this for all of us.

TheFou commented 7 months ago

I posted because it caused some chaos today. I rely heavily (maybe too much) on local DNS for the UIs of all my different services, so it made it difficult to even diagnose the issue. I hope I don't have another outage anytime soon.

The thing that really bothered me is that I wasted a lot of time: as some connections were still working (cached ones, probably), my router was still showing WAN as up, and the UI behaved so erratically that I thought AGH was the culprit. And of course I couldn't find anything wrong with it, because there was nothing wrong.

Anyway, thanks for the details you added @matth0727, every little bit counts. I'll try with a dummy address, to check how I could improve things on my network in case of outage.

To anyone in AdguardTeam: please tell us how we can help to resolve the issue, if you need specific logs or anything. I didn't include any because I didn't think to record them at the time, but if I can help with some tests, please tell me. And thanks for your amazing work, I forgot to say so in the first post 😉 Regards.

TheFou commented 4 months ago

Hi, another outage yesterday... and still the same issue. I see there has been no reaction to this issue so far. Does anyone at AdGuard care? I can help with tests if needed, just ask.

maretodoric commented 4 months ago

I can confirm the same behavior on my end. It's quite annoying considering I have many other services running in the LAN that require DNS to work.

A UDP connection to AdGuard on port 53 works when testing via netcat, but when I run dig against it, it times out. I even set the timeout to 60 seconds and still got no reply within that time.
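One possible reading of the netcat result, offered as an assumption rather than a diagnosis: UDP is connectionless, so a "successful" UDP connection only means the local socket accepted a destination, not that anything answered. The Python sketch below illustrates this; `udp_connect_succeeds` is a made-up helper name.

```python
import socket


def udp_connect_succeeds(host: str, port: int) -> bool:
    """Return True if connect() on a UDP socket raises no error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # For UDP, connect() only records the default destination.
            # No packet is sent and no handshake takes place.
            sock.connect((host, port))
        except OSError:
            return False
        return True
```

This returns True even when nothing is listening on the target port, which is why a query that expects a reply (as dig does) is the more meaningful test.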

Taverius commented 3 months ago

Hit this today, wow it's annoying. My whole network ground to a halt.

Taverius commented 3 months ago

Further research shows multiple issues regarding this, and I found 2 solutions in #4317:

  • Disable safe browsing, either:
    • By setting safebrowsing_enabled to false in the YAML
    • By unchecking "Use AdGuard browsing security service" under Settings -> General Settings.
  • Make an allow rule for your LAN domain, for example:
    • @@||*.lan^$important in the custom filtering rules.

The reason is the timeout on the request to the AdGuard service is longer than the DNS timeout, so all non-cached, not explicitly allowed requests get timed out.

I wonder if it would be a good idea to add the allow rule thing to the configuration wiki where it talks about local domains, so we don't send our internal domain requests to the AG service for a useless check.

Zerorigin commented 2 months ago

Further research shows multiple issues regarding this, and I found 2 solutions in #4317:

  • Disable safe browsing, either:
    • By setting safebrowsing_enabled to false in the YAML
    • By unchecking "Use AdGuard browsing security service" under Settings -> General Settings.
  • Make an allow rule for your LAN domain, for example:
    • @@||*.lan^$important in the custom filtering rules.
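
If several local domains need the same treatment, the allow rule from the list above can be generated rather than typed by hand. This is a small Python sketch that only reproduces the rule shape quoted in this thread (`@@||*.<domain>^$important`); the function names and the example domains are hypothetical.

```python
def allow_rule(domain: str) -> str:
    """Return an allow rule for all subdomains of `domain`, in the form quoted above."""
    cleaned = domain.strip().strip(".").lower()
    if not cleaned:
        raise ValueError("domain must not be empty")
    return f"@@||*.{cleaned}^$important"


def allow_rules(domains) -> str:
    """Join one rule per domain, ready to paste into the custom filtering rules."""
    return "\n".join(allow_rule(d) for d in domains)
```

For example, `print(allow_rules(["lan", "corp.example"]))` produces one line per domain that can be pasted into the custom filtering rules box.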

The reason is the timeout on the request to the AdGuard service is longer than the DNS timeout, so all non-cached, not explicitly allowed requests get timed out.

I wonder if it would be a good idea to add the allow rule thing to the configuration wiki where it talks about local domains, so we don't send our internal domain requests to the AG service for a useless check.

In some specific environments, AdGuard's service interfaces are blocked, which can also lead to this issue, so we need some alternate solutions to this problem.