0xERR0R / blocky

Fast and lightweight DNS proxy as ad-blocker for local network with many features
https://0xERR0R.github.io/blocky/
Apache License 2.0
4.59k stars 202 forks source link

Blocky crashes when query returns no result #1072

Closed smileymattj closed 1 year ago

smileymattj commented 1 year ago

If you query for a domain that doesn't return a result, blocky crashes.

I have blocky running on FreeBSD in the example below. But also happens on Debian.

From the queries to Google & Cloudflare, neither return an A-record result for www.shallalist.de.

In the example blocky is running fine and is responding to queries. But once "www.shallalist.de" is queried. Blocky stops responding and is no longer running.

Unrelated, but I just wanted to say blocky is great, keep up the great work!

blockyadmin@dns:~ # blocky version
blocky
Version: 0.21
Build time: 20230625-072248
Architecture: undefined
blockyadmin@dns:~ # service blocky status
blocky is running as pid 17622.
blockyadmin@dns:~ # dig @8.8.8.8 www.shallalist.de

; <<>> DiG 9.18.13 <<>> @8.8.8.8 www.shallalist.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13051
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.shallalist.de.     IN  A

;; Query time: 177 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Sun Jul 16 21:19:49 EDT 2023
;; MSG SIZE  rcvd: 46

blockyadmin@dns:~ # dig @1.1.1.1 www.shallalist.de

; <<>> DiG 9.18.13 <<>> @1.1.1.1 www.shallalist.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31782
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; EDE: 22 (No Reachable Authority): (at delegation shallalist.de.)
;; QUESTION SECTION:
;www.shallalist.de.     IN  A

;; Query time: 248 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Sun Jul 16 21:19:55 EDT 2023
;; MSG SIZE  rcvd: 80

blockyadmin@dns:~ # dig @127.0.0.1 google.com

; <<>> DiG 9.18.13 <<>> @127.0.0.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15529
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 0
; EDE: 13 (Cached Error): (CACHED)
;; QUESTION SECTION:
;google.com.            IN  A

;; ANSWER SECTION:
google.com.     226 IN  A   172.253.122.113
google.com.     226 IN  A   172.253.122.139
google.com.     226 IN  A   172.253.122.100
google.com.     226 IN  A   172.253.122.138
google.com.     226 IN  A   172.253.122.101
google.com.     226 IN  A   172.253.122.102

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Sun Jul 16 21:15:54 EDT 2023
;; MSG SIZE  rcvd: 147

blockyadmin@dns:~ # dig @127.0.0.1 example.com

; <<>> DiG 9.18.13 <<>> @127.0.0.1 example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18469
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: b7439431a8b5702a0100000064b49668c1f396bc3044854c (good)
;; QUESTION SECTION:
;example.com.           IN  A

;; ANSWER SECTION:
example.com.        1800    IN  A   93.184.216.34

;; Query time: 61 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Sun Jul 16 21:16:24 EDT 2023
;; MSG SIZE  rcvd: 84

blockyadmin@dns:~ # dig @127.0.0.1 www.shallalist.de
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: connection refused

; <<>> DiG 9.18.13 <<>> @127.0.0.1 www.shallalist.de
; (1 server found)
;; global options: +cmd
;; no servers could be reached

blockyadmin@dns:~ # dig @127.0.0.1 google.com
;; communications error to 127.0.0.1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused
;; communications error to 127.0.0.1#53: connection refused

; <<>> DiG 9.18.13 <<>> @127.0.0.1 google.com
; (1 server found)
;; global options: +cmd
;; no servers could be reached

blockyadmin@dns:~ # service blocky status
blocky is not running.
ThinkChaos commented 1 year ago

Thanks for the report! I tested querying that domain using dig with and Blocky v0.21 but couldn't reproduce.

Could you please post the relevant part of Blocky's log? Ideally with trace level (feel free to censor IPs and whatnot).

smileymattj commented 1 year ago

Is trace the same thing as debug? This is what the log showed when I changed the log level to debug. It seemed like all these entries were entered at startup. I don't think any came after the query, but I might be mistaken.

goroutine 492 [running]:
github.com/0xERR0R/blocky/resolver.addExtraReasoning(0x854214370?)
    github.com/0xERR0R/blocky/resolver/ede_resolver.go:42 +0x18
github.com/0xERR0R/blocky/resolver.(*EdeResolver).Resolve(0x854374588, 0x855210b48?)
    github.com/0xERR0R/blocky/resolver/ede_resolver.go:24 +0x48
github.com/0xERR0R/blocky/resolver.(*ClientNamesResolver).Resolve(0x854214320, 0x854469b20)
    github.com/0xERR0R/blocky/resolver/client_names_resolver.go:79 +0x18c
github.com/0xERR0R/blocky/resolver.(*FqdnOnlyResolver).Resolve(0xdde600?, 0x854458c90?)
    github.com/0xERR0R/blocky/resolver/fqdn_only_resolver.go:34 +0xf4
github.com/0xERR0R/blocky/resolver.(*FilteringResolver).Resolve(0x854374558, 0x854469b20)
    github.com/0xERR0R/blocky/resolver/filtering_resolver.go:29 +0x1bc
github.com/0xERR0R/blocky/server.(*Server).OnRequest(0x8543180f0, {0x102cd58, 0x85440a480}, 0x854339290)
    github.com/0xERR0R/blocky/server/server.go:617 +0xee
github.com/miekg/dns.HandlerFunc.ServeDNS(0x854478880?, {0x102cd58?, 0x85440a480?}, 0x854330001?)
    github.com/miekg/dns@v1.1.50/server.go:37 +0x2f
github.com/miekg/dns.(*ServeMux).ServeDNS(0x854570000?, {0x102cd58?, 0x85440a480?}, 0x854339290?)
    github.com/miekg/dns@v1.1.50/serve_mux.go:103 +0x72
github.com/miekg/dns.(*Server).serveDNS(0x85443e900, {0x854570000, 0x3a, 0xffff}, 0x85440a480)
    github.com/miekg/dns@v1.1.50/server.go:659 +0x462
github.com/miekg/dns.(*Server).serveUDPPacket(0x85443e900, 0x874eb8300?, {0x854570000, 0x3a, 0xffff}, {0x102ab70?, 0x85428c010}, 0x854208560, {0x0, 0x0})
    github.com/miekg/dns@v1.1.50/server.go:603 +0x1dc
created by github.com/miekg/dns.(*Server).serveUDP
    github.com/miekg/dns@v1.1.50/server.go:533 +0x485

The last entry in the query log is the google entry that was successful. I reran it just now since it would have a lot of traffic from yesterday. Then the log ends with no entry added when querying: www.shallalist.de.

2023-07-17 18:34:40 127.0.0.1 127.0.0.1 46 RESOLVED (tcp-tls:dns11.quad9.net) google.com. A (142.250.105.102), A (142.250.105.139), A (142.250.105.100), A (142.250.105.101), A (142.250.105.138), A (142.250.105.113) NOERROR RESOLVED A dns

ThinkChaos commented 1 year ago

Trace generates even more logs than debug, it's useful to know what internal resolvers are used during the query. The stack trace should be enough though. It points to the "extended_error_code" resolver. Though something's weird: in version 0.21, line 42 is just a return (src).

There was a bug fixed in v0.21 that matches the description and stack trace perfectly! See #777.
Given the code doesn't match with v0.21 and v0.20 is a perfect match, my guess is you're somehow using blocky v0.20.
Can you look into if that's possible? What install method do you use?

smileymattj commented 1 year ago

That error in the log might have been from when I was running 0.20. Because I just realized 0.21 was available in pkg. So I upgraded just before opening the ticket to make sure I was up to date. It was the latest entry in the log. So the last time something was logged was when it was on 0.20. But I remember startup messages used to be logged. Does 0.21 not log startup messages anymore? Seems like nothing is getting logged anymore. I've been on 0.21 for 2-3 days and log file hasn't had any new entries.

Is this the correct way to enable trace logging: logLevel: trace

I wasn't sure if that was correct place to do it since the configuration page doesn't list that option: https://0xerr0r.github.io/blocky/v0.21/configuration/ log.level | enum (debug, info, warn, error)

ThinkChaos commented 1 year ago

Does 0.21 not log startup messages anymore?

It does, but it was changed to use logging instead of just printing to stdout, so it now depends on your log level.
Most logs are "info", so if you set it to "warn" or "error", you won't see it anymore.

I guess we could increase the messages like "listening on " but it's always weird to have that be a warning.

Is this the correct way to enable trace logging

Yeah that works. Though starting from v0.21 logLevel is deprecated in favor of:

log:
  level: trace

Sorry for the confusion about trace, I'll update the docs to add it!

Are you able to reproduce the crash with v0.21?

smileymattj commented 1 year ago

Thanks, I got the log level set to trace using the new format.

log:
  level: info
  format: text
  timestamp: true
  privacy: false

Did the log file location change? It was:/var/log/blocky.log Nothing was added to the log file recently. I moved it to .old and a new log file hasn't been created.

Yes it's still happening on v0.21.

ThinkChaos commented 1 year ago

I don't think blocky ever logs to the file directly, at least I don't use that and don't see anything related in the code. I could be wrong though.
Can you check the service manager you use for that?

If you can repro with v0.21, please post a config that I can use to repro on my side.

smileymattj commented 1 year ago

Sorry for the late reply. Thanks.

That reminds me the pre-compiled blocky that's in FreeBSD's pkg repository. The included rc.d daemon didn't specify a log file. And I had added -o switch to the command arguments to get it to log. So since I had updated it from pkg, that replaced the edit I had done with the one from the repo that doesn't have logging.

Originally it was:

pidfile="/var/run/${name}.pid"
blocky_path="/usr/local/sbin/blocky"

command="/usr/sbin/daemon"
procname="/usr/local/sbin/blocky"
command_args="-c -f -p ${pidfile} ${blocky_path} \
    -c ${blocky_config} ${blocky_flags}"

I had added:

logfile="/var/log/${name}.log"
pidfile="/var/run/${name}.pid"
blocky_path="/usr/local/sbin/blocky"

command="/usr/sbin/daemon"
procname="/usr/local/sbin/blocky"
command_args="-c -f -o ${logfile} -p ${pidfile} ${blocky_path} \
    -c ${blocky_config} ${blocky_flags}"

Here's my blocky config.yml

upstream:
  default:
      - tcp-tls:dns11.quad9.net:853

upstreamTimeout: 2s
startVerifyUpstream: true
minTlsServeVersion: 1.3
connectIPVersion: v4

blocking:
  blackLists:
    ads:
      - https://blocklistproject.github.io/Lists/ads.txt
      - https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews/hosts
      - https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt
      - https://adaway.org/hosts.txt
      - https://www.github.developerdan.com/hosts/lists/ads-and-tracking-extended.txt
      - https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts_without_controversies.txt      
      - https://v.firebog.net/hosts/Easylist.txt    
      - https://v.firebog.net/hosts/static/w3kbl.txt
      - https://v.firebog.net/hosts/AdguardDNS.txt
      - https://v.firebog.net/hosts/Admiral.txt
      - https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext
      - https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt
      - https://raw.githubusercontent.com/bigdargon/hostsVN/master/hosts
      - https://raw.githubusercontent.com/Firestorrrm/Minimal-Hosts-Blocker/master/iosadlist.txt    
      - https://raw.githubusercontent.com/FadeMind/hosts.extras/master/UncheckyAds/hosts    
      - https://raw.githubusercontent.com/kboghdady/youTube_ads_4_pi-hole/master/youtubelist.txt
    tracking:
      - https://blocklistproject.github.io/Lists/tracking.txt
      - https://v.firebog.net/hosts/Easyprivacy.txt
      - https://v.firebog.net/hosts/Prigent-Ads.txt
      - https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt
      - https://raw.githubusercontent.com/crazy-max/WindowsSpyBlocker/master/data/hosts/spy.txt
      - https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts
      - https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt    
    malware:
      - https://blocklistproject.github.io/Lists/scam.txt
      - https://blocklistproject.github.io/Lists/fraud.txt
      - https://blocklistproject.github.io/Lists/malware.txt
      - https://blocklistproject.github.io/Lists/phishing.txt
      - https://v.firebog.net/hosts/RPiList-Phishing.txt
      - https://v.firebog.net/hosts/RPiList-Malware.txt
      - https://v.firebog.net/hosts/Prigent-Crypto.txt
      - https://urlhaus.abuse.ch/downloads/hostfile/
      - https://phishing.army/download/phishing_army_blocklist_extended.txt
      - https://raw.githubusercontent.com/AssoEchap/stalkerware-indicators/master/generated/hosts
      - https://bitbucket.org/ethanr/dns-blacklists/raw/8575c9f96e5b4a1308f2f12394abd86d0927a4a0/bad_lists/Mandiant_APT1_Report_Appendix_D.txt
      - https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt
      - https://s3.amazonaws.com/lists.disconnect.me/simple_malvertising.txt    
      - https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate%20versions%20Anti-Malware%20List/AntiMalwareHosts.txt
      - https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts
      - https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts
      - https://raw.githubusercontent.com/Spam404/lists/master/main-blacklist.txt
      - https://gitlab.com/quidsup/notrack-blocklists/raw/master/notrack-malware.txt    
    special:
      - /usr/local/etc/blocky/blacklist.txt

  whiteLists:
    ads:
      - https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt
    special:
      - /usr/local/etc/blocky/whitelist.txt

  clientGroupsBlock:
    default:
      - ads
      - tracking
      - malware
      - special

  blockType: nxDomain
  blockTTL: 1m
  refreshPeriod: 4h
  downloadTimeout: 4m
  downloadAttempts: 5
  downloadCooldown: 10s
  startStrategy: blocking

caching:
  minTime: 5m
  maxTime: 30m
  maxItemsCount: 0
  prefetching: true
  prefetchExpires: 8h
  prefetchThreshold: 2
  prefetchMaxItemsCount: 0
  cacheTimeNegative: 30m

prometheus:
  enable: false

queryLog:
  type: csv
  target: /var/log/blocky
  logRetentionDays: 7

bootstrapDns:
  upstream: tcp-tls:dns11.quad9.net:853
  ips:
    - 9.9.9.11
    - 149.112.112.11

filtering:
  queryTypes:
    - AAAA

hostsFile:
  filePath: /etc/hosts
  hostsTTL: 60m
  refreshPeriod: 30m
  filterLoopback: false

ports:
  dns: 53
  tls: 853
  http: 4000
  https: 443

log:
  level: trace
  format: text
  timestamp: true
  privacy: false

ede:
  enable: true
smileymattj commented 1 year ago

After I got logfile corrected. Sorry about that, I should have made better notes.

Here's the results for the query with level set to trace:

2023-07-28 22:26:37] DEBUG server: new request
[2023-07-28 22:26:37] TRACE custom_dns_resolver: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 question=A (www.shallalist.de.) resolver=hosts_file
[2023-07-28 22:26:37] TRACE hosts_file: go to next resolver resolver=blocking
[2023-07-28 22:26:37] DEBUG blacklist_resolver: checking groups for request client_ip=127.0.0.1 client_names=127.0.0.1 groupsToCheck=ads; malware; special; tracking question=A (www.shallalist.de.)
[2023-07-28 22:26:37] DEBUG caching_resolver: domain 'www.shallalist.de' was requested 1 times, total cache size: 22 client_ip=127.0.0.1 client_names=127.0.0.1 domain=www.shallalist.de question=A (www.shallalist.de.)
[2023-07-28 22:26:37] DEBUG caching_resolver: not in cache: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 domain=www.shallalist.de next_resolver=conditional_upstream question=A (www.shallalist.de.)
[2023-07-28 22:26:37] TRACE conditional_resolver: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 next_resolver=special_use_domains question=A (www.shallalist.de.)
[2023-07-28 22:26:37] DEBUG parallel_best: delegating to resolver client_ip=127.0.0.1 client_names=127.0.0.1 question=A (www.shallalist.de.) resolver=upstream 'tcp-tls:dns11.quad9.net'
[2023-07-28 22:26:37] DEBUG bootstrap.caching_resolver: domain 'dns11.quad9.net' was requested 20 times, total cache size: 14 domain=dns11.quad9.net
[2023-07-28 22:26:37] DEBUG bootstrap.caching_resolver: domain is cached domain=dns11.quad9.net
[2023-07-28 22:26:39] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:53427->9.9.9.11:853: i/o timeout, retrying... attempt=1/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=9.9.9.11
[2023-07-28 22:26:41] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (149.112.112.11:853): read tcp 192.168.14.3:35615->149.112.112.11:853: i/o timeout, retrying... attempt=2/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=149.112.112.11
[2023-07-28 22:26:42] DEBUG server: new request
[2023-07-28 22:26:42] TRACE custom_dns_resolver: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 question=A (www.shallalist.de.) resolver=hosts_file
[2023-07-28 22:26:42] TRACE hosts_file: go to next resolver resolver=blocking
[2023-07-28 22:26:42] DEBUG blacklist_resolver: checking groups for request client_ip=127.0.0.1 client_names=127.0.0.1 groupsToCheck=ads; malware; special; tracking question=A (www.shallalist.de.)
[2023-07-28 22:26:42] DEBUG caching_resolver: domain 'www.shallalist.de' was requested 2 times, total cache size: 22 client_ip=127.0.0.1 client_names=127.0.0.1 domain=www.shallalist.de question=A (www.shallalist.de.)
[2023-07-28 22:26:42] DEBUG caching_resolver: not in cache: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 domain=www.shallalist.de next_resolver=conditional_upstream question=A (www.shallalist.de.)
[2023-07-28 22:26:42] TRACE conditional_resolver: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 next_resolver=special_use_domains question=A (www.shallalist.de.)
[2023-07-28 22:26:42] DEBUG parallel_best: delegating to resolver client_ip=127.0.0.1 client_names=127.0.0.1 question=A (www.shallalist.de.) resolver=upstream 'tcp-tls:dns11.quad9.net'
[2023-07-28 22:26:42] DEBUG bootstrap.caching_resolver: domain 'dns11.quad9.net' was requested 21 times, total cache size: 14 domain=dns11.quad9.net
[2023-07-28 22:26:42] DEBUG bootstrap.caching_resolver: domain is cached domain=dns11.quad9.net
[2023-07-28 22:26:43] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:48268->9.9.9.11:853: i/o timeout, retrying... attempt=3/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=9.9.9.11
[2023-07-28 22:26:43] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:48268->9.9.9.11:853: i/o timeout, retrying... attempt=1/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=9.9.9.11
[2023-07-28 22:26:43] ERROR server: error on processing request:can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:48268->9.9.9.11:853: i/o timeout
[2023-07-28 22:26:46] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (149.112.112.11:853): read tcp 192.168.14.3:52371->149.112.112.11:853: i/o timeout, retrying... attempt=2/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=149.112.112.11
[2023-07-28 22:26:47] DEBUG server: new request
[2023-07-28 22:26:47] TRACE custom_dns_resolver: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 question=A (www.shallalist.de.) resolver=hosts_file
[2023-07-28 22:26:47] TRACE hosts_file: go to next resolver resolver=blocking
[2023-07-28 22:26:47] DEBUG blacklist_resolver: checking groups for request client_ip=127.0.0.1 client_names=127.0.0.1 groupsToCheck=ads; malware; special; tracking question=A (www.shallalist.de.)
[2023-07-28 22:26:47] DEBUG caching_resolver: domain 'www.shallalist.de' was requested 3 times, total cache size: 22 client_ip=127.0.0.1 client_names=127.0.0.1 domain=www.shallalist.de question=A (www.shallalist.de.)
[2023-07-28 22:26:47] DEBUG caching_resolver: not in cache: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 domain=www.shallalist.de next_resolver=conditional_upstream question=A (www.shallalist.de.)
[2023-07-28 22:26:47] TRACE conditional_resolver: go to next resolver client_ip=127.0.0.1 client_names=127.0.0.1 next_resolver=special_use_domains question=A (www.shallalist.de.)
[2023-07-28 22:26:47] DEBUG parallel_best: delegating to resolver client_ip=127.0.0.1 client_names=127.0.0.1 question=A (www.shallalist.de.) resolver=upstream 'tcp-tls:dns11.quad9.net'
[2023-07-28 22:26:47] DEBUG bootstrap.caching_resolver: domain 'dns11.quad9.net' was requested 22 times, total cache size: 14 domain=dns11.quad9.net
[2023-07-28 22:26:47] DEBUG bootstrap.caching_resolver: domain is cached domain=dns11.quad9.net
[2023-07-28 22:26:48] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:58760->9.9.9.11:853: i/o timeout, retrying... attempt=3/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=9.9.9.11
[2023-07-28 22:26:48] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:58760->9.9.9.11:853: i/o timeout, retrying... attempt=1/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=9.9.9.11
[2023-07-28 22:26:48] ERROR server: error on processing request:can't resolve request via upstream server tcp-tls:dns11.quad9.net (9.9.9.11:853): read tcp 192.168.14.3:58760->9.9.9.11:853: i/o timeout
[2023-07-28 22:26:50] DEBUG upstream: can't resolve request via upstream server tcp-tls:dns11.quad9.net (149.112.112.11:853): read tcp 192.168.14.3:25579->149.112.112.11:853: i/o timeout, retrying... attempt=2/3 question=A (www.shallalist.de.) upstream=tcp-tls:dns11.quad9.net upstream_ip=149.112.112.11
[2023-07-28 22:26:50] DEBUG upstream: received response from upstream answer= net=tcp-tls protocol=UDP response_time_ms=14 return_code=SERVFAIL upstream=tcp-tls:dns11.quad9.net upstream_ip=9.9.9.11
ThinkChaos commented 1 year ago

Thanks for the details.

I tried using your config and querying www.shallalist.de but could not reproduce the crash.
Are you still having it with v0.21? Can you include an up to date stack trace?

smileymattj commented 1 year ago

I was on v0.21 when running that. But for some reason the issue seems to have went away. I've tried for 3 days and no crashes. No upgrades, or config changes. I was thinking it might have been an ECS issue with quad9. I tried once last week without ECS quad9 servers and it seemed to not crash. But I wanted to test a little more when I had more time to see if that was true. But now it's not crashing so I'm not sure where the issue was coming from. You not able to reproduce also makes me think it was something with quad9. Maybe since most people don't use ECS it went unnoticed, and me testing it so much lately made it stick out in the error logs and they corrected it. Hopeful thinking. Thanks for all the help.