PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0

Make recursor reply to queries with OPCODE=2 #10624

Open ghost opened 3 years ago

ghost commented 3 years ago

Short description

Make recursor reply to queries with OPCODE=2

Usecase

Found this while debugging a frequent increase in the drop counters in dnsdist. We have quite a lot of clients sending google.com queries with OPCODE=2 set. What they are and why they do it is out of scope here.

pdns-recursor does not answer these queries, so they time out in dnsdist, which makes the drop counters increase. Another side effect is that other metrics in dnsdist look bad or odd.

Description

For example in topSlow it looks like google.com is the slowest domain.

topSlow()
1 google.com. 26 23.2%
2 iiofouooxxav.partek-forest.parnet.net. 4 3.6%
3 lwksdtrx.partek-forest.parnet.net. 3 2.7%
4 aprint08.elevad.umea.se. 3 2.7%
5 gnujbanho.partek-forest.parnet.net. 3 2.7%
6 135.189.191.39.in-addr.arpa. 2 1.8%
7 rm.api.weibo.com. 2 1.8%
8 se.c1068648879.ip4-59a0409a.saasprotection.com. 2 1.8%
9 plus.maths.org. 2 1.8%
10 192.168.1.15.in-addr.arpa. 2 1.8%
11 Rest 63 56.3%

In grepq it looks like the queries are bypassing the local dnsdist cache and the backend is failing to respond to google.com queries.

grepq("3000ms")
Time Client Server ID Name Type Lat. TC RD AA Rcode
-60.5 76.x.x.x:42909 pdns-rec01:53 32765 google.com. A T.O No Error. 0 answers
-59.5 76.x.x.x:36074 pdns-rec02:53 65286 google.com. A T.O No Error. 0 answers
-56.5 76.x.x.x:36074 pdns-rec01:53 65286 google.com. A T.O No Error. 0 answers
-55.4 76.x.x.x:52782 pdns-rec02:53 1067 google.com. A T.O No Error. 0 answers
-55.4 76.x.x.x:39205 pdns-rec02:53 22750 google.com. A T.O No Error. 0 answers
-55.4 76.x.x.x:39205 pdns-rec01:53 22750 google.com. A T.O No Error. 0 answers
-53.4 76.x.x.x:52782 pdns-rec01:53 1067 google.com. A T.O No Error. 0 answers
-52.4 76.x.x.x:39205 pdns-rec02:53 22750 google.com. A T.O No Error. 0 answers
-51.4 76.x.x.x:52782 pdns-rec01:53 1067 google.com. A T.O No Error. 0 answers
-51.4 76.x.x.x:39205 pdns-rec02:53 22750 google.com. A T.O No Error. 0 answers
-51.4 76.x.x.x:52782 pdns-rec02:53 1067 google.com. A T.O No Error. 0 answers
-50.4 76.x.x.x:18466 pdns-rec01:53 26399 135.189.191.39.in-addr.arpa. PTR T.O No Error. 0 answers
-40.3 76.x.x.x:30335 pdns-rec01:53 5139 google.com. A T.O No Error. 0 answers
-40.3 76.x.x.x:60562 pdns-rec02:53 7484 192.168.1.15.in-addr.arpa. PTR T.O No Error. 0 answers
-38.3 76.x.x.x:30335 pdns-rec02:53 5139 google.com. A T.O No Error. 0 answers
-37.2 76.x.x.x:60562 pdns-rec01:53 7484 192.168.1.15.in-addr.arpa. PTR T.O No Error. 0 answers
-36.5 76.x.x.x:2899 pdns-rec01:53 62928 pull-flv-l11.ixigua.com. A 3635.8 RD No Error. 17 answers
-36.2 76.x.x.x:58396 pdns-rec01:53 50458 google.com. A T.O No Error. 0 answers
-33.2 76.x.x.x:58396 pdns-rec02:53 50458 google.com. A T.O No Error. 0 answers
-32.2 76.x.x.x:34263 pdns-rec01:53 29176 _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.elevad.umea.se. SRV T.O No Error. 0 answers
-31.2 76.x.x.x:16919 pdns-rec01:53 28188 google.com. A T.O No Error. 0 answers
-31.2 76.x.x.x:34263 pdns-rec02:53 29176 _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.elevad.umea.se. SRV T.O No Error. 0 answers
-29.4 76.x.x.x:34263 pdns-rec01:53 29176 _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.elevad.umea.se. SRV 3010.1 RD Server Failure
-29.2 76.x.x.x:16919 pdns-rec02:53 28188 google.com. A T.O No Error. 0 answers
-28.1 76.x.x.x:34263 pdns-rec01:53 29176 _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.elevad.umea.se. SRV T.O No Error. 0 answers
-28.1 76.x.x.x:64390 pdns-rec02:53 43173 _ldap._tcp.ddc.mydorma.com. SRV T.O No Error. 0 answers
-27.1 76.x.x.x:64390 pdns-rec02:53 43173 _ldap._tcp.ddc.mydorma.com. SRV T.O No Error. 0 answers
-26.1 76.x.x.x:13457 pdns-rec01:53 3372 google.com. A T.O No Error. 0 answers
-23.1 76.x.x.x:13457 pdns-rec02:53 3372 google.com. A T.O No Error. 0 answers
-17.0 76.x.x.x:35604 pdns-rec02:53 12783 _ldap._tcp.Default-First-Site-Name._sites.elevad.umea.se. SRV T.O No Error. 0 answers
-4.4 76.x.x.x:26754 pdns-rec01:53 54265 wpad.partek-forest.parnet.net. A 3001.7 RD Server Failure
-3.9 76.x.x.x:52042 pdns-rec01:53 27375 google.com. A T.O No Error. 0 answers
-3.9 76.x.x.x:26754 pdns-rec02:53 54265 wpad.partek-forest.parnet.net. A T.O No Error. 0 answers
-3.9 76.x.x.x:57645 pdns-rec02:53 60448 google.com. A T.O No Error. 0 answers
-2.9 76.x.x.x:26754 pdns-rec01:53 54265 wpad.partek-forest.parnet.net. A T.O No Error. 0 answers
-1.9 76.x.x.x:61643 pdns-rec01:53 38372 google.com. A T.O No Error. 0 answers
-1.9 76.x.x.x:2141 pdns-rec02:53 31016 google.com. A T.O No Error. 0 answers
-1.9 76.x.x.x:57645 pdns-rec02:53 60448 google.com. A T.O No Error. 0 answers
-1.9 76.x.x.x:54111 pdns-rec02:53 59107 p5-ipv6.douyinpic.com. A T.O No Error. 0 answers
-1.9 76.x.x.x:52042 pdns-rec01:53 27375 google.com. A T.O No Error. 0 answers

Among public DNS servers, Google (8.8.8.8) responds to OPCODE=2 queries with NOTIMP:

~$ dig google.com +opcode=2 @8.8.8.8

; <<>> DiG 9.17.13-2+ubuntu20.10.1+isc+1-Ubuntu <<>> google.com +opcode=2 @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: STATUS, status: NOTIMP, id: 17866
;; flags: qr rd; QUERY: 0, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; WARNING: EDNS query returned status NOTIMP - retry with '+noedns'

;; Query time: 8 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Mon Aug 02 11:29:57 CEST 2021
;; MSG SIZE rcvd: 12
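The 12-byte reply in the dig output above is a bare DNS header: the query ID and opcode are echoed back, QR is set, RCODE is NOTIMP (4), and all section counts are zero. The following Python sketch shows how such a reply can be derived from an OPCODE=2 query using only the RFC 1035 header layout. The function names (`build_query`, `parse_header`, `notimp_reply`) are made up for this illustration; this is not pdns-recursor or dnsdist code, and it makes no claim about how 8.8.8.8 is implemented.

```python
import struct

RCODE_NOTIMP = 4  # "Not Implemented" in RFC 1035


def build_query(qid, qname, opcode=0, qtype=1):
    """Build a minimal DNS query (hypothetical helper, no EDNS)."""
    flags = (opcode << 11) | 0x0100  # opcode in bits 11-14, RD bit set
    header = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    return header + question


def parse_header(packet):
    """Extract the header fields that dig prints on its HEADER line."""
    qid, flags = struct.unpack("!HH", packet[:4])
    return {
        "id": qid,
        "qr": bool(flags & 0x8000),
        "opcode": (flags >> 11) & 0xF,
        "rd": bool(flags & 0x0100),
        "rcode": flags & 0xF,
    }


def notimp_reply(query):
    """Return a header-only NOTIMP reply that echoes ID, opcode and RD."""
    qid, flags = struct.unpack("!HH", query[:4])
    reply_flags = 0x8000 | (flags & 0x7800) | (flags & 0x0100) | RCODE_NOTIMP
    return struct.pack("!HHHHHH", qid, reply_flags, 0, 0, 0, 0)


query = build_query(17866, "google.com", opcode=2)
reply = notimp_reply(query)
print(len(reply), parse_header(reply))
```

Answering this way is cheap because the server never has to parse the question section, and the client gets an immediate response instead of waiting for a timeout.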

OpenDNS does the same and replies with NOTIMP.

Cloudflare ignores the OPCODE=2 and treats the query as OPCODE=0:

~$ dig google.com +opcode=2 @1.1.1.1

; <<>> DiG 9.10.3-P4-Debian <<>> google.com +opcode=2 @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23062
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;google.com. IN A

;; ANSWER SECTION:
google.com. 129 IN A 172.217.16.78

;; Query time: 2 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Mon Aug 02 11:32:38 CEST 2021
;; MSG SIZE rcvd: 55

phonedph1 commented 3 years ago

Pretty sure we see this a lot too, but with IQUERY. We actually just drop all non-QUERY in dnsdist now - which I think would only affect the global counters and not the per-server ones which you might be more concerned about, so that could be an option.
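For illustration, the "drop all non-QUERY" policy described above can be modelled in a few lines of Python. This is only a sketch of the decision logic, assuming raw DNS messages are available as bytes; it is not dnsdist configuration, and the names `opcode_of`, `split_by_opcode` and `header` are hypothetical helpers. Counting what gets dropped per opcode also gives a quick view of what clients are actually sending.

```python
import struct
from collections import Counter

OPCODE_QUERY = 0


def opcode_of(packet):
    """Return the 4-bit opcode from a raw DNS message."""
    if len(packet) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", packet[2:4])
    return (flags >> 11) & 0xF


def split_by_opcode(packets):
    """Keep standard queries; count everything else per opcode."""
    forwarded = []
    dropped = Counter()
    for packet in packets:
        opcode = opcode_of(packet)
        if opcode == OPCODE_QUERY:
            forwarded.append(packet)
        else:
            dropped[opcode] += 1
    return forwarded, dropped


def header(qid, opcode):
    """Build a header-only test message (hypothetical helper)."""
    return struct.pack("!HHHHHH", qid, (opcode << 11) | 0x0100, 1, 0, 0, 0)


packets = [header(1, 0), header(2, 2), header(3, 1), header(4, 2)]
forwarded, dropped = split_by_opcode(packets)
print(len(forwarded), dict(dropped))
```

Here opcode 1 is IQUERY and opcode 2 is STATUS, the two cases mentioned in this thread.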

laurretang commented 2 years ago

Following Cloudflare's approach of ignoring the OPCODE=2 and defaulting to OPCODE=0: is there any way to rewrite the OPCODE in the recursor and then send a "normal" query to the auth servers? Similarly, is there any way to set EDNS options in the recursor and then send the modified query to the auth servers? Maybe we could add a lua_ffi function to achieve this.

Habbie commented 2 years ago


Hello! You are asking several questions in a comment on a mostly unrelated ticket. Our GitHub issue tracking is not a support system.

If you have individual feature requests or bug reports, tickets are welcome.

Otherwise:

our GitHub issue tracker is for bug reports and feature requests. Your question looks like a support question. Support questions are handled in our other online communities: IRC and our mailing lists. Please see https://www.powerdns.com/opensource.html for information about those.

laurretang commented 2 years ago

OK, thanks for the reminder. I will follow your suggestion.