AdguardTeam / AdGuardHome

Network-wide ads & trackers blocking DNS server
https://adguard.com/adguard-home/overview.html
GNU General Public License v3.0

Run 2 Instances Of AdGuard Home For Redundancy #573

Open rashidjehangir opened 5 years ago

rashidjehangir commented 5 years ago

Hi guys, is it possible to run 2 instances of AdGuard Home on different PCs on the network for redundancy? I can use the 2 DNS addresses in the router.

ameshkov commented 5 years ago

Well, usually it is possible to configure multiple DNS servers in the router DHCP settings. However, one of them will be the primary server, and it will be used most of the time anyway.

RCourtenay commented 5 years ago

Think this should be reopened and made a feature request. Routers don't strictly have a primary and secondary DNS in the sense that one is only a backup; that's just poor terminology that gets used around the place, and whether a router favours one DNS server over another is up to the implementation and vendor. Generally, if multiple DNS servers are configured, they will both be used. Even if one is used more frequently than the other, that leaves a percentage of requests going to a different DNS server, and if that's not an AdGuard server, the security measures provided by this product are null and void for those requests.

There are various reasons to want to run multiple instances, the main one being that people simply may need to reboot a system from time to time. Either the router will have one DNS server configured, which is expected to go down occasionally, or there are multiple and traffic will be going to both 24/7. Ideally, rather than the other DNS server being a public one, it could be another AdGuard instance, so that irrespective of which server is used, the traffic is protected.

Right now that's possible, no doubt, but any whitelist, blacklist, etc. will need to be manually applied to two servers. It'd be a great advantage if the web UIs allowed for synchronisation of config, which would make managing multiple instances substantially easier and encourage more secure setups. FWIW, this is the most upvoted unimplemented feature over at Pi-hole, so there is demand, and I suspect implementing it would win many a customer over.

ameshkov commented 5 years ago

@iSmigit well, there's nothing that prevents running multiple instances of AGH as long as they have different configurations / listen ports.

RCourtenay commented 5 years ago

Thanks for the response. The subject title and my assumption is that the OP and I would both like to see identically configured instances running on two systems. If one instance were to go down, the endpoint or router could direct traffic to the other DNS server, both being identically configured.

As you note, this is already doable, but if you have config that's not out of the box, you need to manually apply it to both instances. In particular, adding filters means managing multiple instances, where I think it would be awesome if the application itself could accept the hostname of a second instance of the application and replicate the system config (filter rules) whenever a change is made to one instance.

Unsure if it'd work well with the DHCP service enabled, but for pure DNS filtering I think being able to mirror config automatically would be awesome.

ameshkov commented 5 years ago

I just think that this is a bit too complicated for the current stage of AGH development.

If we're talking about running two different instances of AGH on two different machines, it should be possible to simply rsync the config file. The problem is that the sync operation requires restarting the service:

stop AGH
rsync AdGuardHome.yaml
start AGH

Also, I think we can do it step by step and start with smaller things. For instance, we could start with implementing "reload" and "configtest" operations (like what nginx provides). With these operations it'll be easier to implement settings sync via rsync.

onedr0p commented 5 years ago

I would like this feature as well, any possibility of opening this issue back up?

ameshkov commented 5 years ago

Reopened as a feature request, and issue priority set to "low" for now.

If you upvote this feature request, please also add a comment explaining your use case.

onedr0p commented 5 years ago

The important issue to solve would be allowing multiple instances of AdGuardHome to be deployed and synced with each other, for example a primary and a secondary instance. My use case is making AdGuardHome highly available (HA), thus providing near-zero downtime for DNS requests when my main AdGuardHome instance goes offline. There are many factors that could take AdGuardHome offline, like rebooting for patches or hardware failure.

A stop-gap would be to use nginx, HAProxy, keepalived, or any other TCP/UDP load balancer in combination with rsync / scripts. Having it provided out of the box in AGH would be awesome.

It's the top requested feature of the PiHole project: https://discourse.pi-hole.net/c/feature-requests/l/top?order=votes

Edit: For those finding this issue: I have moved to Blocky. It is completely stateless and can be run HA with minimal configuration.

jschwalbe commented 4 years ago

@ameshkov

If you upvote this feature request, please also add a comment explaining your use case.

Can you tell me how to upvote it? I would really love this as well. When one server goes down, my wife says, "The wifi is down!! What did you do?" If I have two servers which are synchronized, that buys me some WAF points.

onedr0p commented 4 years ago

@jschwalbe and others in this issue: I would check out Blocky. It is completely stateless and can be run HA with minimal configuration.

ameshkov commented 4 years ago

@jschwalbe just add a 👍 reaction to the issue

subdavis commented 4 years ago

Goal

First, let's clearly establish the purpose of this feature. It's not about load balancing -- if you want load balancing, use HAProxy. It should be communicated that load balancing is completely out-of-scope for this project. This issue is about config synchronization, which is a blocker to running multiple instances of AdGuard for High Availability.

I'd like to explore a potential solution which I think would give AdGuard an edge against competitors: webhooks. I'm interested in contributing here because I think AdGuard's code quality is leagues ahead of Pi-hole's. I also want to stress that I don't think this issue has been adequately labeled: based on dozens of attempts by users on reddit and the pihole forums and hundreds of participants in those threads, a lot of people want this. AdGuard could be the first to have it. (Note: Blocky, referenced above, does not have this feature; it just makes it easier to deploy 2 duplicate instances based on a stateless config -- a whole different ballgame.)

My Proposal

This is the simplest and most robust solution I could come up with.

Enumerate a set of hookable events, for example:

const (
    DnsConfig       EventType = "dns_config"
    DnsRewrite      EventType = "dns_rewrite"
    DnsSafeBrowsing EventType = "dns_safe_browsing"
    DnsAccess       EventType = "dns_access"
    DnsParental     EventType = "dns_parental"
    DnsSafeSearch   EventType = "dns_safe_search"
    BlockedServices EventType = "blocked_services"
    Dhcp            EventType = "dhcp"
    Stats           EventType = "stats"
    Private         EventType = "private"
    QueryLog        EventType = "query_log"
    Filter          EventType = "filter"
    FilterRule      EventType = "filter_rule"
    I18N            EventType = "i18n"
    Client          EventType = "client"
    Tls             EventType = "tls"
)

Then, rather than baking the config sync service into AdGuardHome, we could write another small microservice that waits for webhooks and pulls config from the primary, then pushes it to all secondary nodes. The microservice could even be very clever and do bi-directional sync if it could diff the changes. It may actually be better to eventually put this into the core, but webhooks would be a good first step.
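To make the consumer side concrete, here is a rough sketch of how such a microservice might map an incoming event back to the API resource it needs to refetch from the primary. The endpoint paths below are illustrative placeholders, not a claim about AGH's actual API surface:

```go
// Sketch of the webhook-consumer side of the proposal: each event type
// tells the sync service exactly which (hypothetical) API resource to
// refetch, so a single change never forces a full config resync.
package main

import "fmt"

type EventType string

const (
	DnsRewrite EventType = "dns_rewrite"
	FilterRule EventType = "filter_rule"
	Client     EventType = "client"
)

// endpointFor maps a fired event to the illustrative API path the sync
// service would pull from the primary and push to the secondaries.
func endpointFor(e EventType) (string, bool) {
	m := map[EventType]string{
		DnsRewrite: "/control/rewrite/list",
		FilterRule: "/control/filtering/status",
		Client:     "/control/clients",
	}
	p, ok := m[e]
	return p, ok
}

func main() {
	if p, ok := endpointFor(FilterRule); ok {
		fmt.Println("refetch:", p)
	}
}
```

An unknown event simply maps to nothing, so the consumer can ignore event types added in later versions.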

#1649 is a draft PR of my plan of attack. If approved, I can also draft up an example sync service. If we would prefer to put the sync service into this codebase, I have ideas for that too.

ameshkov commented 4 years ago

@subdavis let's discuss the proposal here. Otherwise, the discussion will be fragmented.

First of all, thank you for your insights. For some reason, I didn't realize that this feature is that desired. We should definitely find a way to help users configure it.

Regarding webhooks, I generally like the idea of providing them as a feature by itself, it will definitely help people build projects integrated with AdGuard Home, and it will be a great addition to the API. However, I don't think webhooks should be purposefully limited to config changes. The most popular feature requests in this repo are about query logs and statistics, and I suppose webhooks should support them.

About the implementation draft, I think that synchronizing configuration should not require that many different webhook event categories. I kinda understand why you did it -- I guess it is to use existing API methods. However, I think this complicates things, and there is a simpler way.

Alternatively, there can be a single "config changed" event that is fired every time config file changes. The webhook consumer may react to this event, replace the config file on the secondary server, and simply reload AdGuard Home configuration (config reload was implemented in #1302). To make this all easier to implement, we may add "reload" as a separate API method.

Please let me know what you think.

subdavis commented 4 years ago

@ameshkov thanks for changing the priority of this issue.

However, I don't think webhooks should be purposefully limited to config changes.

Sure, expanding the scope to include metrics reporting makes sense. I haven't looked at those feature requests, but I can.

About the implementation draft, I think that synchronizing configuration should not require that many different webhook event categories.... there can be a single "config changed" event...

I did this mainly because it allowed a very clean mapping between webhook event name and which API endpoint the webhook handler needs to call to fetch and propagate configuration to "followers".

Here's an example:

If all the handler gets is config_changed, it has to fire dozens of API queries against both the primary and secondary servers because it doesn't know what changed. It has to sync filtering rules, DHCP settings, and everything else, even though only one thing changed. IMO, that's unnecessarily heavy and slow.

The webhook consumer may react to this event, replace the config file on the secondary server

There is no API for fetching a server's entire config, but we could add one. Here's the example I think you're suggesting:

This raises a few questions for me:

I don't believe mixing filesystem operations, signals, and webhooks is good practice.

Better to just do the sync purely with REST, I think, since the concepts involved are easier and more accessible to the average user.

ameshkov commented 4 years ago

I'll try to come up with a more detailed description of what I suggest tomorrow.

Meanwhile, a couple of quick thoughts:

  1. Exposing methods that work with the config file via the REST API is not a problem, something like /control/sync/get_config and /control/sync/set_config for instance.
  2. These methods had better be independent of the other strongly-typed structs, maybe even placed in a separate package (sync?). That way we guarantee that we won't need to change them regardless of what changes we make to the config structure.
  3. We need a separate method that fully reloads AdGuard Home using the new configuration. This one is a bit tricky. I thought we had implemented it, but as I see we only have partial reload now. @szolin plz assist.

This would address most of your questions save for one:

What if I don't want to replace the whole file? maybe my secondary servers should have different passwords.

I suppose this can be handled on the sync microservice's side. It's rather easy to exclude specified YAML fields from sync.
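Excluding per-host fields could be as simple as dropping top-level keys (e.g. `users`, so secondaries keep their own passwords) before writing the primary's config to a secondary. A plain map stands in for the parsed YAML in this sketch; a real sync service would operate on the decoded config document.

```go
// Sketch of per-host field exclusion during config sync: copy the
// config map, minus the keys that must stay local to each instance.
package main

import "fmt"

// excludeFields returns a copy of cfg without the listed top-level
// keys, leaving the input untouched.
func excludeFields(cfg map[string]any, skip ...string) map[string]any {
	out := make(map[string]any, len(cfg))
	for k, v := range cfg {
		out[k] = v
	}
	for _, k := range skip {
		delete(out, k)
	}
	return out
}

func main() {
	// Illustrative stand-in for a parsed AdGuardHome.yaml.
	primary := map[string]any{
		"users":     []string{"admin"},
		"bind_port": 3000,
	}
	synced := excludeFields(primary, "users")
	fmt.Println("users kept:", synced["users"] != nil)
	fmt.Println("bind_port:", synced["bind_port"])
}
```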

subdavis commented 4 years ago

Exposing methods that work with config file via REST API is not a problem

Sure, that seems fine. A bit heavy-handed, perhaps, but definitely less complex for maintainers of both systems. It may perform badly for users with large filter lists, so that might be worth evaluating.

These methods are better to be independent of the other strong-typed structs... guarantee that we won't need to change them...

If you do the whole /control/sync/get|set thing, it may make sense to just write the handler in Go and import the Config struct from this package, so you get strong typing for free.

If API consumers use a different language, though, their code is just going to break silently when the schema changes. This is why I really like Openapi-codegen -- the compiler yells at you when the schema changes and breaks your consumer.

Anyway, thanks for the consideration and the discussion. I'm happy to help work on this PR, and I'm planning to write the sync handler. I don't care if that lives in my own personal namespace or this organization.

Cheers.

ameshkov commented 4 years ago

If you do the whole /control/sync/get|set thing, it may make sense to just write the handler in Go and import the Config struct from this package, so you get strong typing for free.

I guess using the configuration struct is okay; we would need to use it in any case.

Openapi-codegen -- the compiler yells at you when the schema changes and breaks your consumer.

Well, this is one of the reasons why we're maintaining the openapi spec. We do that manually, though.

szolin commented 4 years ago

We need a separate method that fully reloads AdGuard Home using the new configuration. This one is a bit tricky. I thought we did implement it, but as I see we only have partial reload now.

We don't have configuration reload now - currently there's no use for it. In general, we still need to finish splitting the TLS, Web, and filter.go modules out from the core (package home). But apart from that, there are no problems with adding a Reload() method to each running module. In fact, the TLS module already supports it.

ameshkov commented 4 years ago

@szolin got it, thx.

yuha0 commented 4 years ago

I kind of achieved this, and my AdGuard Home setup is running in HA mode. Technically (if I had more servers at home), I could run 100 instances and load-balance DNS queries easily, and configuration changes have zero downtime.

However, I did lose some functionalities. I described my setup here.

Btw, having HA was one of the reasons that I made this feature request.

jeremygaither commented 4 years ago

My use case is simple: I'd like to run two instances in parallel on separate hardware, and somehow synchronize the config and logs. If it can forward logs to a third shared data store, that's a big plus.

mattbruman commented 4 years ago

My use case: with some hardware, especially some Android devices, Google or OnePlus has hard-coded a secondary DNS of Google. I have black-holed 8.8.8.8, but I get random lag loading sites because the phone sometimes tries to resolve via Google anyway if my AdGuard DNS lags a bit. The only ways around this are either switching everything to static configuration, which can't be done on some devices, or having a different secondary on my router. Setting the same secondary on UniFi does not solve the issue. So now I have two AGH instances running, but I have a bunch of filters set for the kids that I sometimes have to disable for their school, like YouTube, and having to disable things on two instances of AGH is a hassle. I would rather have changes on one reflected on the other. I do have it set up in Home Assistant so that when I disable one it disables the other, but I can't do that with per-client settings, e.g. having the "disable YouTube" slider do the same on the other instance.

abstractvector commented 3 years ago

@mattbruman I’m not sure what firewall / router you’re using, but rather than black-holing 8.8.8.8 you could instead route it back to AdGuard. I’ve done this using the NAT / Port Forwarding feature in OPNsense - all outbound traffic to port 53 gets rerouted to my AdGuard Home server. So several of my devices think they’re getting DNS resolution from 8.8.8.8 but really it’s from AdGuard Home. I haven’t yet needed to set this up for port 853 (DoT) and I don’t know how I’d handle DoH, but it’s working great so far!

PSSGCSim commented 3 years ago

@mattbruman In my experience, putting the same IP address in the DHCP config twice solves this issue with Android. If that does not work for you, you can simply create a secondary virtual IP address on your router and bind AGH to both.

Of course this is a workaround and not a proper HA setup.

Additionally, I ICMP-reject all requests to common DNS servers (everything including DNS/DoT/DoH).

jeremygaither commented 3 years ago

@abstractvector DoT and DoH are very unlikely to work, unless you can install your own CA on all of the devices using AGH, and set up a false certificate for whatever address the device is trying to use (may or may not be Google DNS). Blocking outbound port 853, except from the AdGuard home instance(s), might force the apps/devices to downgrade to normal DNS over port 53 (which can be redirected as mentioned above depending on your router/firewall).

@mattbruman you may be able to use the above along with a second IP address allocated on the AdGuard Home server, so the devices get two unique addresses, yet they point to the same instance. If you are not concerned about downtime, and are only concerned with intermittent lag issues, then you probably don't need HA. But real HA would solve the problem as well, keeping two instances in sync and both could answer queries separately. Perhaps your devices would fall back to the secondary DNS instead of a hard-coded DNS server. Or if you can repro the issue while running wireshark, you may be able to determine what hard-coded resolvers it is attempting to use.

Using rsync to replicate the config file is an interesting workaround (one I may try), but each instance would hold its own metrics/logs. Merging metrics/logs from all instances is also a desirable HA feature. High availability is probably a high-level epic with multiple tasks, the first being some way for instances to communicate, similar to pfsync from pfSense/OPNsense. Something like ZeroMQ might be useful, or might be too much compared to a REST-based webhook approach.

tomlawesome commented 3 years ago

I would really like the ability to integrate 2+ instances of AdGuard Home for redundancy. It's annoying having to run two manually administered setups, each with separate stats. For the block/allow lists at least, it would be good if they could either sync to, or be pulled from, a centralised resource, be that a 'primary' instance of AdGuard or an arbitrary storage location on a server.

It would be great if the stats could be combined to one pool for analysis (whilst maintaining segregation of data by instance).

mzac commented 3 years ago

If anyone wants to set up AdGuard on a Kubernetes (k8s/k3s) cluster, I got it working on my Raspberry Pi cluster at home. I have 3 nodes in my cluster running AdGuard as a DaemonSet (so they all run one copy). For the incoming traffic, I am using MetalLB as a load balancer to publish IPs on my network (192.168.0.42 & 192.168.0.43). In my config below I have also included a service that could be used if you are using keepalived between nodes, which is what I was doing before I started using MetalLB.

The only gotcha is that all the nodes will run their own AdGuard with their own logs, so until we can get centralized logging to work, you would have to look at each node to see if it is receiving traffic (maybe with a NodePort). Another idea: if AdGuard supports debug logging, Kubernetes could spew the console log out to a central log collector.

In order to use MetalLB with k3s, you need to disable the default load balancer that comes with it. I am using k3sup to provision my cluster with the following config, and then deploy MetalLB and Traefik as my ingress controller:

export K3S_VERSION="v1.19.5+k3s2"
# Install Master Node - vm1
k3sup install \
    --cluster \
    --ip $MASTER_VM1 \
    --user $SSH_USER \
    --k3s-version $K3S_VERSION \
    --k3s-extra-args '--node-taint key=value:NoSchedule --no-deploy traefik --disable servicelb'

If you don't know much about Kubernetes here is a good starting point: https://blog.alexellis.io/test-drive-k3s-on-raspberry-pi/

Hope this can help someone out there! Good luck!

# --------------------------------------------------------------------------------
# Namespace
---
apiVersion: v1
kind: Namespace
metadata:
  name: adguard

# --------------------------------------------------------------------------------
# Startup Script
---
apiVersion: v1
kind: ConfigMap
metadata:
    name: adguard-init
    namespace: adguard
data:
    adguard-init.sh: |
      #!/bin/sh

      mkdir -p /opt/adguardhome/conf
      cp /tmp/AdGuardHome.yaml /opt/adguardhome/conf
      ls -al /opt/adguardhome/conf/AdGuardHome.yaml

# --------------------------------------------------------------------------------
# Config for Adguard
---
apiVersion: v1
kind: ConfigMap
metadata:
    name: adguard-config
    namespace: adguard
data:
    AdGuardHome.yaml: |
      bind_host: 0.0.0.0
      bind_port: 3000
      users:
      - name: admin
        password: <your password goes here>
      http_proxy: ""
      language: ""
      rlimit_nofile: 0
      debug_pprof: false
      web_session_ttl: 720
      dns:
        bind_host: 0.0.0.0
        port: 53
        statistics_interval: 1
        querylog_enabled: true
        querylog_file_enabled: true
        querylog_interval: 7
        querylog_size_memory: 1000
        anonymize_client_ip: false
        protection_enabled: true
        blocking_mode: default
        blocking_ipv4: ""
        blocking_ipv6: ""
        blocked_response_ttl: 10
        parental_block_host: family-block.dns.adguard.com
        safebrowsing_block_host: standard-block.dns.adguard.com
        ratelimit: 20
        ratelimit_whitelist: []
        refuse_any: true
        upstream_dns:
        - '[/google.com/]tls://8.8.8.8'
        - tls://1.1.1.1
        - tls://1.0.0.1
        upstream_dns_file: ""
        bootstrap_dns:
        - 9.9.9.10
        - 149.112.112.10
        - 2620:fe::10
        - 2620:fe::fe:10
        all_servers: false
        fastest_addr: false
        allowed_clients: []
        disallowed_clients: []
        blocked_hosts:
        - version.bind
        - id.server
        - hostname.bind
        cache_size: 4194304
        cache_ttl_min: 0
        cache_ttl_max: 0
        bogus_nxdomain: []
        aaaa_disabled: false
        enable_dnssec: false
        edns_client_subnet: false
        max_goroutines: 300
        ipset: []
        filtering_enabled: true
        filters_update_interval: 24
        parental_enabled: false
        safesearch_enabled: false
        safebrowsing_enabled: false
        safebrowsing_cache_size: 1048576
        safesearch_cache_size: 1048576
        parental_cache_size: 1048576
        cache_time: 30
        rewrites: []
        blocked_services: []
      tls:
        enabled: false
        server_name: ""
        force_https: false
        port_https: 443
        port_dns_over_tls: 853
        port_dns_over_quic: 784
        allow_unencrypted_doh: false
        strict_sni_check: false
        certificate_chain: ""
        private_key: ""
        certificate_path: ""
        private_key_path: ""
      filters:
      - enabled: true
        url: https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt
        name: AdGuard DNS filter
        id: 1
      - enabled: false
        url: https://adaway.org/hosts.txt
        name: AdAway Default Blocklist
        id: 2
      - enabled: false
        url: https://www.malwaredomainlist.com/hostslist/hosts.txt
        name: MalwareDomainList.com Hosts List
        id: 4
      whitelist_filters: []
      user_rules: []
      dhcp:
        enabled: false
        interface_name: ""
        dhcpv4:
          gateway_ip: ""
          subnet_mask: ""
          range_start: ""
          range_end: ""
          lease_duration: 86400
          icmp_timeout_msec: 1000
          options: []
        dhcpv6:
          range_start: ""
          lease_duration: 86400
          ra_slaac_only: false
          ra_allow_slaac: false
      clients: []
      log_compress: false
      log_localtime: false
      log_max_backups: 0
      log_max_size: 100
      log_max_age: 3
      log_file: ""
      verbose: false
      schema_version: 7

# --------------------------------------------------------------------------------
# Adguard
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: adguard
  namespace: adguard
  labels:
    app: adguard
spec:
  selector:
    matchLabels:
      app: adguard
  template:
    metadata:
      labels:
        app: adguard
        name: adguard
    spec:
      initContainers:
      - name: adguard-init-script
        image: alpine
        command: ["sh", "/tmp/adguard-init.sh"]
        volumeMounts:
          - name: adguard-config
            mountPath: /opt/adguardhome/conf
          - name: adguard-configmap
            mountPath: /tmp/AdGuardHome.yaml
            subPath: AdGuardHome.yaml
          - name: adguard-init
            mountPath: /tmp/adguard-init.sh
            subPath: adguard-init.sh
      containers:
      - name: adguard
        image: adguard/adguardhome
        imagePullPolicy: IfNotPresent
        livenessProbe:
          httpGet:
            path: /login.html
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 30
          failureThreshold: 5
          timeoutSeconds: 10
        readinessProbe:
          httpGet:
            path: /login.html
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 10
          failureThreshold: 5
          timeoutSeconds: 10
        env:
        - name: TZ
          value: "America/Montreal"
        volumeMounts:
          - name: adguard-config
            mountPath: /opt/adguardhome/conf
      volumes:
        - name: adguard-config
          emptyDir: {}
        - name: adguard-configmap
          configMap:
              name: adguard-config
        - name: adguard-init
          configMap:
              name: adguard-init

# --------------------------------------------------------------------------------
# Service - Adguard DNS - with keepalived
# ---
# apiVersion: v1
# kind: Service
# metadata:
#   name: adguard-dns
#   namespace: adguard
# spec:
#   selector:
#     app: adguard
#   ports:
#   - port: 53
#     targetPort: 53
#     protocol: TCP
#     name: adguard-dns-tcp
#   - port: 53
#     targetPort: 53
#     protocol: UDP
#     name: adguard-dns-udp
#   externalIPs:
#     - 192.168.0.42
#     - 192.168.0.43

# --------------------------------------------------------------------------------
# Service - Adguard DNS - Load Balancer - 192.168.0.42
---
apiVersion: v1
kind: Service
metadata:
  name: adguard-dns-udp-42
  namespace: adguard
  annotations:
    metallb.universe.tf/address-pool: dns
    metallb.universe.tf/allow-shared-ip: dns
spec:
  selector:
    app: adguard
  ports:
  - port: 53
    targetPort: 53
    protocol: UDP
    name: adguard-dns-udp
  type: LoadBalancer
  loadBalancerIP: 192.168.0.42

# --------------------------------------------------------------------------------
# Service - Adguard DNS - Load Balancer - 192.168.0.43
---
apiVersion: v1
kind: Service
metadata:
  name: adguard-dns-udp-43
  namespace: adguard
  annotations:
    metallb.universe.tf/address-pool: dns
    metallb.universe.tf/allow-shared-ip: dns
spec:
  selector:
    app: adguard
  ports:
  - port: 53
    targetPort: 53
    protocol: UDP
    name: adguard-dns-udp
  type: LoadBalancer
  loadBalancerIP: 192.168.0.43

# --------------------------------------------------------------------------------
# Service - Adguard HTTP
---
apiVersion: v1
kind: Service
metadata:
  name: adguard-http
  namespace: adguard
  annotations:
    traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.name: adguard
spec:
  selector:
    app: adguard
  ports:
  - port: 80
    targetPort: 3000
    protocol: TCP
    name: http

# --------------------------------------------------------------------------------
# Ingress
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: adguard-http-ingress
  namespace: adguard
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
  - host: adguard.lab.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: adguard-http
            port:
              name: http

jeremygaither commented 3 years ago

@mzac Great work, this is similar to what I have wanted to do. Using a DaemonSet is a good idea! Question: are you using the init container because the image doesn't have /opt/adguardhome/{conf,work} yet, or for some other reason? It seems like the ConfigMap could be mapped directly to the path used by the app (either default or via CLI option), unless I'm missing something.

As for the query log, I don't know what data structure it uses internally, but a sidecar container could read query log files and push them into Redis or some other data store. Or something something Grafana, since AGH likely wouldn't be easily compatible with Redis or other data stores. A sidecar could also publish Prometheus endpoints, so the perf/query data could get scraped by some other reporting process.

mzac commented 3 years ago

@jeremygaither It has been a few weeks since I got it up and running, but I think I needed the init container because AdGuard was finicky about how that file ended up in the directory. I think the container was creating the dir, and somehow, even if I mapped the file as a ConfigMap, it just wouldn't work, which is why I went with the init container method to copy it into the correct directory. I know it is not clean and elegant, but it works!

As for the query log, it looks like it is a plain-text JSON file at /opt/adguardhome/work/data/querylog.json, so it would just need to be scraped and shipped off somewhere with a sidecar, as you suggest.

I like the idea of sending the logs off to either Graylog or Elasticsearch directly and then using Grafana to create some nice dashboards.

Do you have any idea if there are containers out there already that would do that? (I'm assuming there are, I just haven't looked yet.)

jeremygaither commented 3 years ago

@mzac You might be able to use Fluentd to read in the JSON and pipe it to Graylog or Elasticsearch. I believe I've done this with Fluentd piping to Elastic and elsewhere for other things (specifically k8s logs and pod logs). I recall existing Helm charts for fluentd-elasticsearch that made it fairly easy.

Otherwise, it wouldn't be hard to write a small sidecar specifically for this.
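A minimal Fluentd config for this could look like the sketch below. The hostnames are placeholders, and the `elasticsearch` output type comes from the fluent-plugin-elasticsearch gem, which has to be installed separately.

```
# fluent.conf (sketch): tail the AGH query log and ship it to Elasticsearch
<source>
  @type tail
  path /opt/adguardhome/work/data/querylog.json
  pos_file /var/log/fluentd/querylog.pos
  tag adguard.querylog
  <parse>
    @type json
  </parse>
</source>

<match adguard.querylog>
  @type elasticsearch
  host elasticsearch.local
  port 9200
  logstash_format true
</match>
```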

onedr0p commented 3 years ago

What about using Grafana's logging tool Loki? It's pretty simple to send logs off to that and have it display in Grafana using a sidecar in your Adguard manifests.
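For reference, the promtail sidecar config for that could be as small as the following sketch; the Loki URL and label names are placeholders.

```yaml
# promtail-config.yaml (sketch): ship querylog.json to Loki
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki.local:3100/loki/api/v1/push
scrape_configs:
  - job_name: adguard
    static_configs:
      - targets:
          - localhost
        labels:
          job: adguard-querylog
          __path__: /opt/adguardhome/work/data/querylog.json
```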

tomlawesome commented 3 years ago

@mzac Thank you for posting - this is potentially really useful for me and a great reason to look at setting up a Pi cluster for the first time. I have wanted to integrate traefik somewhere, so it's great to know you have this working in such a way. Following with interest regarding logs :)

Ability to feed into Grafana is ideal.

abstractvector commented 3 years ago

I'm not running AdGuard Home in k8s, although a clustered / redundant setup is something I'd like in future. However, I have been successfully ingesting the query log using Telegraf and injecting it into InfluxDB to be rendered by Grafana. I'm no Telegraf expert, but here's the config that's working for me:

# Tail the query log and parse each line as JSON
[[inputs.tail]]
  files = ["/opt/AdGuardHome/data/querylog.json"]
  data_format = "json"
  tag_keys = [
    "IP",
    "QH",
    "QT",
    "QC",
    "CP",
    "Upstream",
    "Result_Reason"
  ]

  json_name_key = "query"
  json_time_key = "T"
  json_time_format = "2006-01-02T15:04:05.999999999Z07:00"

# Derive the client's /24 subnet from its IP (useful for per-VLAN stats)
[[processors.regex]]
  [[processors.regex.tags]]
    key = "IP"
    result_key = "IP_24"
    pattern = "^(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)$"
    replacement = "${1}.${2}.${3}.x"

# Extract the last two labels (TLD + domain) of the queried hostname
[[processors.regex]]
  [[processors.regex.tags]]
    key = "QH"
    result_key = "TLD"
    pattern = "^.*?(?P<tld>[^.]+\\.[^.]+)$"
    replacement = "${tld}"

[[outputs.influxdb]]
  urls = ["http://influxdb.local:8086"]
  database = "adguard"
You'll see I've also included a few processors that give me some extra useful stats to chart, including the origin subnet (useful because I use VLANs which map to subnets) and the TLD of the requested domain. It would be easy to run a Telegraf agent in each AdGuard Home pod (mine is actually running in an LXC container on Proxmox rather than k8s) to centralize all your logging.

One thing to note: you'll also want to adjust querylog_size_memory in the AdGuardHome.yaml file. This controls how many log entries are kept in memory before flushing to the query log on disk. I think it defaults to 1,000, but I dropped it to 5 to keep the data flowing smoothly to Telegraf.
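As a sketch, the relevant fragment of AdGuardHome.yaml would then look something like this; only querylog_size_memory is the setting discussed above, so check your own file for the exact keys your version uses.

```yaml
# AdGuardHome.yaml fragment (sketch): flush query-log entries to disk
# frequently so a tail-based pipeline sees them quickly
querylog_enabled: true
querylog_size_memory: 5
```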

The end result looks like this:

(Screenshot: Grafana dashboard built from the AdGuard Home query log, 2021-01-14.)

Hope that helps! @tomlawesome @onedr0p @mzac @jeremygaither

mzac commented 3 years ago

@abstractvector That looks great! I'll give telegraf a try, I have used it a few times. Would you mind sharing the json for your grafana dashboard? (Maybe as a gist?)

abstractvector commented 3 years ago

Sure thing, here you go: https://gist.github.com/abstractvector/7de922c17bf3948792f42f71439959e0

There are a few field aliases in there for my DNS upstreams that you'll want to update, and you'll also need the pie chart plugin on Grafana if you don't already.

I'm no Grafana expert, so there may well be better ways of doing some of the things I've done / presenting the data, so feel free to use it as a starting point for building something better!

mzac commented 3 years ago

Thanks! I got the telegraf sidecar running and pushing into influx and now the Grafana dashboard is working! Very awesome!

Now to explore the data a little more...

filikun commented 3 years ago

Use case is, like most here, to be able to have two instances of AdGuard Home syncing settings and stats between them. Let one be the controller and the other a replica of some sort.

mzac commented 3 years ago

Sounds just about right. They could synchronize in the backend with a database (Mongo, Redis, SQL, etc.). That way you don't really need a 'master' per se. It would be nice if, when changing settings on one, the others would notice it right away without having to reload.

For my setup I use a kubernetes config map so any change I make I need to reload the pods for now but for sure having the dashboards synchronized would be great!

t0mer commented 3 years ago

Well, finally got it working: AdGuard Home, 3-node clustering with configuration replication, a Grafana dashboard, and even an uptime monitor for each node (not included in the video). https://youtu.be/zyz3jQ6_s4A

tomlawesome commented 3 years ago

Awesome.

Can you please share config info? I've set up a new swarm, but am struggling to get 3 instances tied together and would really love to.

t0mer commented 3 years ago

Working on it, hopefully will upload to my Git in a few days

tomlawesome commented 3 years ago

Thank you! Look forward to it, maybe I can resolve the setup for myself :)

lensherm commented 3 years ago

Thank you sir! I got a few Pi4s waiting to be configured as soon as you post your configs.

lensherm commented 3 years ago

@ameshkov , on a somewhat related note, I know this has been brought up a few times before, but is there any chance to bake the synchronization of the pertinent portion of the config into the product? Maybe split this subset of configuration parameters into a separate config doc? Never mind this if it's already on the roadmap. From the end user perspective, it just would be super nice to not have to set up elaborate load balancing, clustering, or virtual IP solutions for a home network.

In my specific case, I'm running AGH as a Home Assistant add-on, inside a VM. Every time I have to perform maintenance requiring a restart of that VM or the host machine, for example if there's an OS update, DNS resolution in the entire house goes down. The reason I'm not serving another DNS server through DHCP is that I'm using AGH to limit what the kids get access to. It would be super nice to have a second instance of AGH running with a similar configuration somewhere in another VM, container, or even an RPi.

As an aside, since I started speaking about parental controls, is it possible to expand on the concept of tags to arrive at a semblance of client groups, OR to be able to filter the client list by using tags AND to perform bulk actions on the result of the filtering? Extra bonus if you could save these filtered lists, for later use. An example of the use case for this would be to turn off access to, let's say YouTube, during school time, not only on child devices, but also on the media players and smart TVs in the house. And a double extra bonus would be if these operations could be performed through an API, so one could automate or schedule actions from external platforms, such as NodeRed, Homebridge, Home Assistant, etc.

Sorry for the long post and for stuffing more than one feature request in a single ticket.

lensherm commented 3 years ago

In the meantime, I'll try to find some time to do some rsync/udp load balance voodoo with something like this here: https://hub.docker.com/r/instantlinux/udp-nginx-proxy
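For anyone going down that road, here is a rough sketch of what such a UDP proxy could look like with stock nginx and its stream module; the addresses are placeholders for two AGH instances.

```nginx
# nginx.conf fragment (sketch): spread DNS queries across two AdGuard Home instances
stream {
    upstream adguard_dns {
        server 192.168.1.2:53;
        server 192.168.1.3:53;
    }
    server {
        listen 53 udp;
        proxy_pass adguard_dns;
        proxy_responses 1;  # a DNS query expects exactly one response datagram
    }
}
```

Note that this moves the single point of failure to the proxy host itself, so it complements rather than replaces running two instances.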

ameshkov commented 3 years ago

@lensherm well, what we should do prior to anything is implement the proper configuration reload that does not require restarting AGH. Only when we have this, implementing sync becomes doable.

lensherm commented 3 years ago

Sounds like a good plan, @ameshkov. One question on this, since the specific portions of the configuration we're discussing here are accessible through the UI and don't seem to require a restart, can't this subset of the configuration be kept in sync by issuing calls to the back ends of each required AGH instance, every time new settings are saved?

ameshkov commented 3 years ago

can't this subset of the configuration be kept in sync by issuing calls to the back ends of each required AGH instance, every time new settings are saved?

Generally, one can use AdGuard Home API and write a sync tool that does that. But this sort of solution would be quite hard to maintain.
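As an illustration of the core of such a sync tool, here is a minimal Python sketch of the reconciliation step: given the filter-list URLs configured on a primary and on a replica, compute what to add and remove on the replica. The function name and data shapes are my own invention; the actual HTTP calls against the AdGuard Home API are deliberately left out.

```python
def diff_filter_lists(primary, replica):
    """Compute which filter-list URLs to add to and remove from a replica
    so that it matches the primary instance. Pure set arithmetic; applying
    the plan via the AdGuard Home API is out of scope for this sketch."""
    primary_urls = set(primary)
    replica_urls = set(replica)
    return {
        "add": sorted(primary_urls - replica_urls),
        "remove": sorted(replica_urls - primary_urls),
    }

# Example: the replica is missing one list and has one stale list.
plan = diff_filter_lists(
    ["https://filters.example/ads.txt", "https://filters.example/trackers.txt"],
    ["https://filters.example/ads.txt", "https://filters.example/old.txt"],
)
print(plan)
```

A real tool would run this periodically (or on a webhook), then translate the plan into API calls against each replica.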

bakito commented 3 years ago

Hi all

I'm currently working on a sync tool using the API as suggested by @ameshkov.

An early draft can be found here: https://github.com/bakito/adguardhome-sync

Gorian commented 3 years ago

Just to contribute to the discussion: adding multiple IPs to your DHCP config or resolv.conf isn't the only way of running multiple DNS servers. For the more technically inclined, I've had good luck using BGP to provide anycast DNS locally, which brings HA to your DNS implementation: if one DNS server on your network goes down, BGP simply stops routing to it. I know it's out of scope for AdGuard to implement, but since people have been commenting on the "how to" of using multiple DNS servers and I didn't see anyone else mention anycast as a possible solution, I thought I'd mention it.
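To give a flavor of the setup, here is a rough FRR config fragment announcing a shared anycast address from one DNS host. The AS number, neighbor address, and anycast IP are all placeholders; you would also assign the anycast address to a loopback interface on each DNS box and run a health check that withdraws the route when AGH is down.

```
! /etc/frr/frr.conf fragment (sketch): announce the anycast DNS address
router bgp 64512
 neighbor 192.168.1.1 remote-as 64512
 !
 address-family ipv4 unicast
  network 10.53.53.53/32
 exit-address-family
```

Clients then use 10.53.53.53 as their only DNS server, and the network routes each query to the nearest live instance.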