0xERR0R / blocky

Fast and lightweight DNS proxy as ad-blocker for local network with many features
https://0xERR0R.github.io/blocky/
Apache License 2.0

blocky process panics if the conditional upstream resolver receives a response without a question section #1113

Closed emlimap closed 9 months ago

emlimap commented 10 months ago

I have been using blocky on OpenWrt on a Raspberry Pi 4 board and have noticed that the process panics when resolving a domain in the conditional resolver list. init.d takes care of restarting the process, but there is a brief DNS outage whenever that happens.

Below is the stack trace. Unfortunately, the panic occurs at random, so I don't have steps to reproduce it.

panic: runtime error: index out of range [0] with length 0

goroutine 142 [running]:
github.com/0xERR0R/blocky/resolver.(*ConditionalUpstreamResolver).internalResolve(0x9a5a60?, {0xcbd510?, 0x40000a5020}, {0x4000136168, 0x10}, {0x4000136171, 0x7},

      /home/runner/work/blocky/blocky/resolver/conditional_upstream_resolver.go:111 +0x368
github.com/0xERR0R/blocky/resolver.(*ConditionalUpstreamResolver).processRequest(0x4000031260, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/conditional_upstream_resolver.go:63 +0x18c
github.com/0xERR0R/blocky/resolver.(*ConditionalUpstreamResolver).Resolve(0x4000031260, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/conditional_upstream_resolver.go:88 +0x70
github.com/0xERR0R/blocky/resolver.(*CachingResolver).Resolve(0x40000718c0, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/caching_resolver.go:189 +0x3b4
github.com/0xERR0R/blocky/resolver.(*BlockingResolver).Resolve(0x40004ac600, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/blocking_resolver.go:399 +0x98
github.com/0xERR0R/blocky/resolver.(*HostsFileResolver).Resolve(0x400009ce70?, 0x400018d7f8?)
      /home/runner/work/blocky/blocky/resolver/hosts_file_resolver.go:106 +0x3f4
github.com/0xERR0R/blocky/resolver.(*CustomDNSResolver).Resolve(0x4000033b40, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/custom_dns_resolver.go:143 +0x174
github.com/0xERR0R/blocky/resolver.(*MetricsResolver).Resolve(0x400008ccd0, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/metrics_resolver.go:29 +0x38
github.com/0xERR0R/blocky/resolver.(*QueryLoggingResolver).Resolve(0x4000033b00, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/query_logging_resolver.go:113 +0x6c
github.com/0xERR0R/blocky/resolver.(*EdeResolver).Resolve(0x400009ccb0?, 0x4000273b18?)
      /home/runner/work/blocky/blocky/resolver/ede_resolver.go:24 +0x88
github.com/0xERR0R/blocky/resolver.(*ClientNamesResolver).Resolve(0x400008cc30, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/client_names_resolver.go:65 +0x150
github.com/0xERR0R/blocky/resolver.(*FqdnOnlyResolver).Resolve(0x40000312c0, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/fqdn_only_resolver.go:36 +0x108
github.com/0xERR0R/blocky/resolver.(*FilteringResolver).Resolve(0x4000031290, 0x400009cd20)
      /home/runner/work/blocky/blocky/resolver/filtering_resolver.go:33 +0x1b0
github.com/0xERR0R/blocky/server.(*Server).OnRequest(0x40002100f0, {0xcc4620, 0x40004ac380}, 0x4000532480)
      /home/runner/work/blocky/blocky/server/server.go:610 +0xc8
github.com/miekg/dns.HandlerFunc.ServeDNS(0x40000a4060?, {0xcc4620?, 0x40004ac380?}, 0x4000530001?)
      /home/runner/go/pkg/mod/github.com/miekg/dns@v1.1.52/server.go:37 +0x38
github.com/miekg/dns.(*ServeMux).ServeDNS(0x400032a000?, {0xcc4620?, 0x40004ac380?}, 0x4000532480?)
      /home/runner/go/pkg/mod/github.com/miekg/dns@v1.1.52/serve_mux.go:103 +0x78
github.com/miekg/dns.(*Server).serveDNS(0x4000096120, {0x400032a000, 0x22, 0xffff}, 0x40004ac380)
      /home/runner/go/pkg/mod/github.com/miekg/dns@v1.1.52/server.go:659 +0x3a0
github.com/miekg/dns.(*Server).serveUDPPacket(0x4000096120, 0x28dc5c?, {0x400032a000, 0x22, 0xffff}, {0xcc2400?, 0x400011a038}, 0x40004c0d20, {0x0?, 0x0})
      /home/runner/go/pkg/mod/github.com/miekg/dns@v1.1.52/server.go:603 +0x1b8
created by github.com/miekg/dns.(*Server).serveUDP
      /home/runner/go/pkg/mod/github.com/miekg/dns@v1.1.52/server.go:533 +0x3e0

Blocky version. Running the arm64 binary available in github releases.

blocky
Version: v0.21
Build time: 20230327-054311
Architecture: undefined

OpenWRT version

NAME="OpenWrt"
VERSION="22.03.5"
ID="openwrt"
ID_LIKE="lede openwrt"
PRETTY_NAME="OpenWrt 22.03.5"
VERSION_ID="22.03.5"
HOME_URL="https://openwrt.org/"
BUG_URL="https://bugs.openwrt.org/"
SUPPORT_URL="https://forum.openwrt.org/"
BUILD_ID="r20134-5f15225c1e"
OPENWRT_BOARD="bcm27xx/bcm2711"
OPENWRT_ARCH="aarch64_cortex-a72"
OPENWRT_TAINTS=""
OPENWRT_DEVICE_MANUFACTURER="OpenWrt"
OPENWRT_DEVICE_MANUFACTURER_URL="https://openwrt.org/"
OPENWRT_DEVICE_PRODUCT="Generic"
OPENWRT_DEVICE_REVISION="v0"
OPENWRT_RELEASE="OpenWrt 22.03.5 r20134-5f15225c1e"

Conditional upstream config is fairly basic.

conditional:
  mapping:
    jio.com: '192.168.10.1, 192.168.11.1, 2405:200:800::3, 49.45.0.3'
    jiokhelo.com: '192.168.10.1, 192.168.11.1, 2405:200:800::3, 49.45.0.3'
    jfdca.net: '192.168.10.1, 192.168.11.1, 2405:200:800::3, 49.45.0.3'
    jiocinema.com: '192.168.10.1, 192.168.11.1, 2405:200:800::3, 49.45.0.3'
    voot.com: '192.168.10.1, 192.168.11.1, 2405:200:800::3, 49.45.0.3'

kwitsch commented 10 months ago

It looks to me like the response from your upstream resolver is broken. The request is resolved (no error is returned), but the question section is empty (Question[0] doesn't exist). My first guess would be a configuration error in your upstream DNS server.

Since there are private addresses in your config: Do you host the upstream servers yourself? What servers do you use? Have you checked the log entries of your upstreams for the timeframe the crashes happen?

kwitsch commented 10 months ago

Either the check at line 111 of the conditional upstream resolver has to be enhanced, or new logic has to be added that copies the question section from the request into the response when it is missing, to prevent crashes on broken upstream responses.
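
The two options above can be sketched roughly as follows. This is only an illustration, not blocky's code: `Question`, `Msg`, `ErrNoQuestion`, `checkResponse` and `restoreQuestion` are hypothetical names, and the types are simplified stand-ins for the message types of a DNS library.

```go
package main

import (
	"errors"
	"fmt"
)

// Question and Msg are simplified, hypothetical stand-ins for a DNS
// library's question and message types; they are not blocky's types.
type Question struct {
	Name  string
	Qtype uint16
}

type Msg struct {
	Rcode    int
	Question []Question
}

// ErrNoQuestion marks an upstream response that has no question section.
var ErrNoQuestion = errors.New("upstream response has no question section")

// checkResponse is option 1: reject a response without a question
// instead of indexing Question[0] unconditionally.
func checkResponse(resp *Msg) error {
	if resp == nil || len(resp.Question) == 0 {
		return ErrNoQuestion
	}
	return nil
}

// restoreQuestion is option 2: copy the request's question section
// into a response that lacks one.
func restoreQuestion(req, resp *Msg) {
	if len(resp.Question) == 0 && len(req.Question) > 0 {
		resp.Question = append([]Question(nil), req.Question...)
	}
}

func main() {
	req := &Msg{Question: []Question{{Name: "jio.com.", Qtype: 1}}}
	refused := &Msg{Rcode: 5} // REFUSED, empty question section

	if err := checkResponse(refused); err != nil {
		fmt.Println("treated as error:", err)
	}

	restoreQuestion(req, refused)
	fmt.Println("restored question:", refused.Question[0].Name)
}
```

Option 1 surfaces the broken upstream to the caller, while option 2 hides it; which one fits better depends on how the rest of the resolver chain uses the question section.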

emlimap commented 10 months ago

@kwitsch Thank you for explaining what is causing the panic.

The upstream servers with Private IPs are stub DNS resolvers running on the ISP provided routers. Unfortunately, they are heavily locked down and don't expose any sort of logs.

My understanding is that they use a MIPS processor and are manufactured by Sercomm.

I decided to remove the router IPs and query the ISP DNS servers directly; those are the public IPv4 and IPv6 addresses in the list above. Let's see if it still crashes, which would help isolate whether the bad responses come from the stub resolver or from the recursive resolver operated by the ISP.

emlimap commented 10 months ago

I found the issue. The ISP DNS server was refusing to resolve the domain because the traffic was being sent over a backup 4G connection from a different provider, due to an issue with policy-based routing.

Below is the response from the ISP DNS server when that happens. It does respond without the question section.

$ dig google.com @49.45.0.3

; <<>> DiG 9.18.16 <<>> google.com @49.45.0.3
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 1024
;; flags: qr rd ad; QUERY: 0, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; Query time: 19 msec
;; SERVER: 49.45.0.3#53(49.45.0.3) (UDP)
;; WHEN: Wed Aug 23 13:44:31 UTC 2023
;; MSG SIZE  rcvd: 12
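
For reference, a 12-byte message is exactly the fixed DNS header with no sections after it. The Go sketch below decodes such a header; the byte slice is hand-built from the dig output above (id 1024, flags qr rd ad, status REFUSED, all counts 0) rather than captured from the wire, and `Header` and `parseHeader` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Header holds a few fields of the fixed 12-byte DNS header
// (RFC 1035, section 4.1.1).
type Header struct {
	ID      uint16
	Rcode   uint8
	QDCount uint16 // number of entries in the question section
	ANCount uint16 // number of entries in the answer section
}

// parseHeader decodes the header from the start of a raw DNS message.
func parseHeader(b []byte) (Header, error) {
	if len(b) < 12 {
		return Header{}, errors.New("message shorter than DNS header")
	}
	flags := binary.BigEndian.Uint16(b[2:4])
	return Header{
		ID:      binary.BigEndian.Uint16(b[0:2]),
		Rcode:   uint8(flags & 0x000F), // low 4 bits of the flags word
		QDCount: binary.BigEndian.Uint16(b[4:6]),
		ANCount: binary.BigEndian.Uint16(b[6:8]),
	}, nil
}

func main() {
	// Reconstructed by hand: id=0x0400, flags=0x8125 (QR, RD, AD, RCODE 5).
	msg := []byte{0x04, 0x00, 0x81, 0x25, 0, 0, 0, 0, 0, 0, 0, 0}

	h, err := parseHeader(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("id=%d rcode=%d qdcount=%d\n", h.ID, h.Rcode, h.QDCount)
	// prints: id=1024 rcode=5 qdcount=0
}
```

A QDCOUNT of 0 means a parsed message has an empty question slice, which is the condition that makes indexing the first question panic.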

kwitsch commented 10 months ago

@0xERR0R based on the results from @emlimap, I would opt for enhancing the line 111 check to require at least one entry in the question section, and treating responses without one as errors.

0xERR0R commented 10 months ago

Yes, that sounds reasonable 👍

ThinkChaos commented 10 months ago

It might also be interesting to check which resolver created a model.Response without filling in Res, and fix that too, since as far as I can remember that should never be the case.

arsfeld commented 6 months ago

Just a quick note about this issue: it popped up lately for me, with Blocky restarting every 20 min or so. I don't know what the root cause is, though.

I'm now running Blocky master and the issue is gone. I was running the latest release before.

ThinkChaos commented 6 months ago

Yeah the fix was done in the PR GitHub linked just above (easy to miss though!): #1148, next release will have it :)