emlimap closed this issue 9 months ago
It looks to me like the response from your upstream resolver is broken. The request is resolved (no error is thrown), but the question section is somehow empty (Question[0] doesn't exist). My first guess would be that there is a configuration error in your upstream DNS server.
Since there are private addresses in your config: Do you host the upstream servers yourself? What servers do you use? Have you checked the log entries of your upstreams for the time frame in which the crashes happen?
Either the check in line 111 of the conditional upstream resolver has to be enhanced, or new logic has to be implemented that adds the question section of the request when it is missing from the response, to prevent crashes on broken upstream responses.
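The second option mentioned above (copying the question section from the request into a response that lacks one) could look roughly like the following. This is only a sketch under stated assumptions: `Msg`, `Question` and `ensureQuestion` are hypothetical, simplified stand-ins and not Blocky's actual types or functions.

```go
package main

import "fmt"

// Question and Msg are hypothetical, simplified stand-ins for the message
// types of a DNS library. They are NOT Blocky's real types.
type Question struct {
	Name  string
	Qtype uint16
}

type Msg struct {
	Rcode    int
	Question []Question
}

// ensureQuestion copies the question section of req into resp when resp
// arrived without one, so that later code can safely read resp.Question[0].
// It reports whether a repair was made.
func ensureQuestion(req, resp *Msg) bool {
	if req == nil || resp == nil {
		return false
	}
	if len(resp.Question) > 0 || len(req.Question) == 0 {
		return false // nothing to repair, or nothing to repair it with
	}
	// Copy the slice so the response does not share storage with the request.
	resp.Question = append([]Question(nil), req.Question...)
	return true
}

func main() {
	req := &Msg{Question: []Question{{Name: "google.com.", Qtype: 1}}}
	resp := &Msg{Rcode: 5} // REFUSED, with an empty question section

	repaired := ensureQuestion(req, resp)
	fmt.Println(repaired, resp.Question[0].Name)
}
```

The trade-off of this approach is that it hides a malformed upstream answer from the rest of the resolver chain, whereas rejecting the response makes the problem visible.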
@kwitsch Thank you for explaining what is causing the panic.
The upstream servers with private IPs are stub DNS resolvers running on the ISP-provided routers. Unfortunately, they are heavily locked down and don't expose any sort of logs.
My understanding is that they use a MIPS processor and are manufactured by Sercomm.
I decided to remove the router IPs and query the ISP DNS servers directly; those are the public IPv4 and IPv6 addresses in the above list. Let's see if Blocky still crashes, as that would help isolate whether the bad responses come from the stub resolver or from the recursive resolver operated by the ISP.
I found the issue. The ISP DNS server was refusing to resolve the domain because the traffic was being sent over a backup 4G connection from a different provider, due to an issue with policy-based routing.
Below is the response from the ISP DNS server when that happens. It does respond, but without the question section.
$ dig google.com @49.45.0.3
; <<>> DiG 9.18.16 <<>> google.com @49.45.0.3
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 1024
;; flags: qr rd ad; QUERY: 0, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; Query time: 19 msec
;; SERVER: 49.45.0.3#53(49.45.0.3) (UDP)
;; WHEN: Wed Aug 23 13:44:31 UTC 2023
;; MSG SIZE rcvd: 12
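A 12-byte message is exactly the size of a DNS header, which matches `QUERY: 0` above: there is no room for a question section. As a rough illustration, the sketch below decodes such a header in Go. The raw bytes are an assumption, rebuilt from the fields dig printed (id 1024, flags qr rd ad, status REFUSED, all counts zero) rather than taken from a packet capture, and `parseHeader` is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// header holds the few DNS header fields relevant to this issue.
type header struct {
	ID      uint16
	Rcode   uint8
	QDCount uint16 // number of entries in the question section
}

// parseHeader decodes the fixed 12-byte DNS header (RFC 1035, section 4.1.1).
func parseHeader(b []byte) (header, error) {
	if len(b) < 12 {
		return header{}, errors.New("DNS message shorter than the 12-byte header")
	}
	flags := binary.BigEndian.Uint16(b[2:4])
	return header{
		ID:      binary.BigEndian.Uint16(b[0:2]),
		Rcode:   uint8(flags & 0x000F), // low four bits of the flags word
		QDCount: binary.BigEndian.Uint16(b[4:6]),
	}, nil
}

func main() {
	// Assumed bytes: id 0x0400 (1024); flags 0x8125 = QR | RD | AD | rcode 5
	// (REFUSED); QDCOUNT, ANCOUNT, NSCOUNT and ARCOUNT all zero.
	raw := []byte{0x04, 0x00, 0x81, 0x25, 0, 0, 0, 0, 0, 0, 0, 0}

	h, err := parseHeader(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%d rcode=%d qdcount=%d\n", h.ID, h.Rcode, h.QDCount)
	// prints "id=1024 rcode=5 qdcount=0"
}
```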
@0xERR0R based on the results of @emlimap, I would opt for enhancing the line 111 check to require at least one entry in the question section and to treat responses without one as errors.
Yes, that sounds reasonable 👍
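For readers following along, the kind of check agreed on here could be sketched as below. It is not the code from the actual fix: `dnsMsg`, `dnsQuestion`, `checkResponse` and `errNoQuestion` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// dnsQuestion and dnsMsg are hypothetical, simplified stand-ins for a DNS
// library's message types. They are NOT Blocky's real types.
type dnsQuestion struct {
	Name string
}

type dnsMsg struct {
	Question []dnsQuestion
}

// errNoQuestion marks an upstream response that has no question section.
var errNoQuestion = errors.New("upstream response has no question section")

// checkResponse rejects responses that later code could not index safely
// with resp.Question[0].
func checkResponse(resp *dnsMsg) error {
	if resp == nil {
		return errors.New("upstream returned no response")
	}
	if len(resp.Question) == 0 {
		return errNoQuestion
	}
	return nil
}

func main() {
	broken := &dnsMsg{} // e.g. a REFUSED answer that consists of a header only
	if err := checkResponse(broken); err != nil {
		fmt.Println("rejected:", err)
	}

	valid := &dnsMsg{Question: []dnsQuestion{{Name: "google.com."}}}
	fmt.Println("valid response accepted:", checkResponse(valid) == nil)
}
```

Returning an error instead of indexing blindly means a malformed upstream answer is handled like any other failed lookup rather than causing a panic.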
It might also be interesting to check which resolver created a model.Response without filling in Res, and to fix that too, since as far as I can remember that should never be the case.
Just a quick note about this issue, it popped up lately for me, with Blocky restarting every 20 min or so. I don't know what's the root cause though.
I'm now running Blocky master and the issue is gone. I was running the latest release before.
Yeah the fix was done in the PR GitHub linked just above (easy to miss though!): #1148, next release will have it :)
I have been using Blocky on OpenWrt running on a Raspberry Pi 4 board and have noticed that the process panics when resolving a domain in the conditional resolver list. init.d takes care of restarting the process, but there is a brief DNS outage when that happens.
Below is the stack trace. Unfortunately, the panic occurs at random, so I don't have steps to reproduce the error.
Blocky version. Running the arm64 binary available in the GitHub releases.
OpenWrt version
Conditional upstream config is fairly basic.