chrisss404 / powerdns

PowerDNS dnsdist, recursor, authoritative, and admin interface. Supports DNSCrypt, DoH, and DoT.
https://hub.docker.com/r/chrisss404/powerdns
MIT License

Recursor status is down #15

Closed Appendme closed 1 year ago

Appendme commented 1 year ago

Using the Private Authoritative Server example, I get the down status of recursor in webui dnsdist, and the recursor web UI also lists a.root-servers.net/A in its Servfail domains table.

If I do what is described here https://github.com/chrisss404/powerdns/issues/10#issuecomment-813121431 then the recursor starts working.

My goal: I have two Windows Server DNS servers, DC1 and DC2, and I want to add forwarding to pdns to receive static records added through admin webui

chrisss404 commented 1 year ago

I get the down status of recursor in webui dnsdist

I cannot reproduce this behaviour using the referenced example. Here is what I would check:

The recursor's IP should be fixed by the following section in docker-compose.yml:

recursor:
    ipv4_address: 172.31.117.117

The default dnsdist config can be found here: https://github.com/chrisss404/powerdns/blob/master/dnsdist/conf/conf.d/servers.conf

Query example.com

$ dig @127.0.0.1 -p 1053 example.com

; <<>> DiG 9.18.17 <<>> @127.0.0.1 -p 1053 example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38676
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;example.com.           IN  A

;; ANSWER SECTION:
example.com.        86400   IN  A   93.184.216.34

;; Query time: 147 msec
;; SERVER: 127.0.0.1#1053(127.0.0.1) (UDP)
;; WHEN: Sun Sep 24 18:08:37 CEST 2023
;; MSG SIZE  rcvd: 56

Query test.sys configured via pdns

$ dig @127.0.0.1 -p 1053 test.sys

; <<>> DiG 9.18.17 <<>> @127.0.0.1 -p 1053 test.sys
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37631
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;test.sys.          IN  A

;; ANSWER SECTION:
test.sys.       60  IN  A   10.0.0.1

;; Query time: 6 msec
;; SERVER: 127.0.0.1#1053(127.0.0.1) (UDP)
;; WHEN: Sun Sep 24 18:08:34 CEST 2023
;; MSG SIZE  rcvd: 53

I want to add forwarding to pdns to receive static records added through admin webui

The recursor is configured to perform DNSSEC validation in this example. You can either turn validation off by setting the environment variable RECURSOR_DNSSEC to off, or enable DNSSEC for your TLD and configure the matching trust anchor in the recursor using the environment variable RECURSOR_TRUST_ANCHORS.

HTH & BR Christian


Appendme commented 1 year ago

Thanks for the answer. I have now checked it on another computer and it works there. Perhaps the problem is the old Docker version on the machine where this occurs. I'll test that guess a little later.

Appendme commented 1 year ago

Updating Docker didn't help. I tried running it on another server and got the same error as on the first one. The computer from my previous reply, where it works, runs Debian; the machines where it fails run Ubuntu. As far as I can tell, that is the only difference between them.

Here is my config: https://gist.github.com/Appendme/f690b6b82320978bcfb2e57481a43681

dnsdist can access to recursor:

$ docker compose exec dnsdist sh
/ # apk add bind-tools
/ # dig @172.31.117.117 example.com
;; communications error to 172.31.117.117#53: timed out
;; communications error to 172.31.117.117#53: timed out

; <<>> DiG 9.18.19 <<>> @172.31.117.117 example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 14459
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;example.com.                   IN      A

;; Query time: 0 msec
;; SERVER: 172.31.117.117#53(172.31.117.117) (UDP)
;; WHEN: Mon Sep 25 09:17:28 UTC 2023
;; MSG SIZE  rcvd: 40

/ # ping 172.31.117.117
PING 172.31.117.117 (172.31.117.117): 56 data bytes
64 bytes from 172.31.117.117: seq=0 ttl=64 time=0.121 ms
64 bytes from 172.31.117.117: seq=1 ttl=64 time=0.086 ms
64 bytes from 172.31.117.117: seq=2 ttl=64 time=0.049 ms
64 bytes from 172.31.117.117: seq=3 ttl=64 time=0.105 ms
^C
--- 172.31.117.117 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.049/0.090/0.121 ms

Attached: image, recursor logs

chrisss404 commented 1 year ago

dnsdist can access to recursor:

Not on port 53. There is a communication error, and the query status is SERVFAIL instead of NOERROR:

;; communications error to 172.31.117.117#53: timed out
;; communications error to 172.31.117.117#53: timed out

This is what it looks like in my dnsdist container:

$ docker-compose -f private-authoritative.yml exec dnsdist sh
/ # apk add bind-tools
/ # dig @172.31.117.117 example.com

; <<>> DiG 9.18.19 <<>> @172.31.117.117 example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26566
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;example.com.           IN  A

;; ANSWER SECTION:
example.com.        86353   IN  A   93.184.216.34

;; Query time: 0 msec
;; SERVER: 172.31.117.117#53(172.31.117.117) (UDP)
;; WHEN: Mon Sep 25 15:06:04 UTC 2023
;; MSG SIZE  rcvd: 56

From what you describe, the only thing that comes to mind is that it might be related to firewall (iptables) rules. I would compare the iptables rules on your Debian and Ubuntu hosts using: /sbin/iptables -L -n
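One way to make that comparison easier to read is to dump the rules to a file on each host and diff the two files. This is only a sketch: the file names and the `compare_rules` helper are made up for illustration, and it assumes you can copy the dumps to one machine.

```shell
#!/bin/sh
# Step 1 (run as root on each host): dump the rules one per line.
# `iptables -S` prints the filter table, including the chain policies;
# add `-t nat` to dump the NAT table, where Docker also installs rules.
#   /sbin/iptables -S > "iptables-$(hostname).rules"

# Step 2 (run anywhere once both dumps are in the same directory).
# compare_rules is a hypothetical helper, not part of iptables.
compare_rules() {
    # diff exits 0 when the dumps are identical and 1 when they differ
    diff -u "$1" "$2"
}

# Example: compare_rules iptables-debian.rules iptables-ubuntu.rules
```

Lines prefixed with - or + in the output are rules present on only one of the two hosts.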

HTH & Good luck

Appendme commented 1 year ago

On the working server the FORWARD policy is ACCEPT, but that did not help.

Appendme commented 1 year ago

I think the problem is Bad file descriptor errors on recursor: logs

chrisss404 commented 1 year ago

I think the problem is Bad file descriptor errors on recursor: logs

This could also be the reason; those errors definitely shouldn't be there. See my logs: recursor.log

You can also try using one of the release versions instead of latest, e.g.:

-image: chrisss404/powerdns:latest-recursor
+image: chrisss404/powerdns:4.9.1-recursor

Appendme commented 1 year ago

There may be a problem with access to the root DNS servers. Later I'll try changing the health check in dnsdist.

Appendme commented 1 year ago

Yes, the problem was the health check, which failed because of problems reaching the root DNS servers. Thanks for answering.

chrisss404 commented 1 year ago

Great that you resolved your issue. Can you share how you identified the root cause of a.root-servers.net/A not resolving on your host?

In case someone else runs into a similar issue, this is how you can adapt the dnsdist healthcheck:
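A minimal sketch of the idea, not the exact configuration shipped with this image: dnsdist's `newServer` accepts `checkName` and `checkType` options, which replace the default health-check query (an A query for a.root-servers.net.). The output file name, the choice of `test.sys.` as the check name, and the way the file is mounted into the dnsdist container are all assumptions. Merge it with the default servers.conf linked earlier rather than treating it as complete.

```shell
#!/bin/sh
# Write a dnsdist backend definition whose health check queries a name the
# recursor can answer without reaching the root servers.
# Assumptions: the file name servers.conf, the check name test.sys. (the
# zone from the example above), and the recursor address taken from
# docker-compose.yml.
cat > servers.conf <<'EOF'
-- Backend with a custom health-check query (sketch, see assumptions above)
newServer({
  address = "172.31.117.117:53",
  checkName = "test.sys.",
  checkType = "A"
})
EOF
```

With this in place dnsdist marks the recursor as up as long as it answers the chosen name, even when a.root-servers.net/A ends in SERVFAIL.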

Appendme commented 1 year ago

In the recursor container, I noticed that requests to powerdns.com were being dropped. Running dig with the +trace option showed something strange: at first the queries to the TLD and gTLD servers seem to go through, but after that they seem to get lost... In the end I changed the health check host in dnsdist, and that was it: it worked.
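For anyone retracing this diagnosis, the checks can be sketched roughly as follows. The `dig_status` helper is a hypothetical convenience, not part of dig, and the commented commands assume bind-tools is installed in the container, as in the earlier replies.

```shell
#!/bin/sh
# dig_status: hypothetical helper that reads dig output on stdin and prints
# the response code from the ->>HEADER<<- line (e.g. NOERROR or SERVFAIL).
dig_status() {
    sed -n 's/.*->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Ask the recursor for the name and type that the default dnsdist health
# check uses:
#   dig @172.31.117.117 a.root-servers.net A | dig_status
# Follow the delegation chain from the root to see where resolution stops:
#   dig +trace powerdns.com
```

If the first command prints SERVFAIL, dnsdist will report the recursor as down, even though the container itself is reachable.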

chrisss404 commented 1 year ago

Thanks for your answer.

It seems that your recursor is not working properly, as it is unable to fulfill one of its main purposes, namely resolving top-level domains. If you don't want to resolve domains other than the ones configured in your authoritative server, then you might not need a recursor at all.