PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0

pdns_server with GeoIP Backend does not answer #11533

Closed lvasiliev closed 8 months ago

lvasiliev commented 2 years ago

Short description

We run PowerDNS Authoritative with the GeoIP backend. Every minute we copy zones.yml to the server over ssh and then run pdns_control reload. This works well most of the time, but occasionally pdns_server stops responding to DNS queries.

Environment

Other information

We captured this backtrace while the server was unresponsive:

Executable module set to "/usr/local/sbin/pdns_server".
Architecture set to: x86_64--freebsd12.3.
(lldb) bt
* thread #1, name = 'pdns_server'
  * frame #0: 0x0000000801a998cc libthr.so.3`___lldb_unnamed_symbol192$$libthr.so.3 + 92
    frame #1: 0x0000000801a96e6b libthr.so.3`___lldb_unnamed_symbol161$$libthr.so.3 + 491
    frame #2: 0x0000000801a0f8a2 libc++.so.1`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
    frame #3: 0x00000008019cf8eb libc++.so.1`std::__1::__shared_mutex_base::lock_shared() + 91
    frame #4: 0x0000000802b14a35 libgeoipbackend.so`GeoIPBackend::lookup(QType const&, DNSName const&, int, DNSPacket*) [inlined] std::__1::shared_mutex::lock_shared() at shared_mutex:195:70
    frame #5: 0x0000000802b14a29 libgeoipbackend.so`GeoIPBackend::lookup(QType const&, DNSName const&, int, DNSPacket*) [inlined] std::__1::shared_lock<std::__1::shared_mutex>::shared_lock(this=<unavailable>) at shared_mutex:329
    frame #6: 0x0000000802b14a29 libgeoipbackend.so`GeoIPBackend::lookup(QType const&, DNSName const&, int, DNSPacket*) [inlined] ReadLock::ReadLock(this=<unavailable>) at lock.hh:118
    frame #7: 0x0000000802b14a29 libgeoipbackend.so`GeoIPBackend::lookup(QType const&, DNSName const&, int, DNSPacket*) [inlined] ReadLock::ReadLock(this=<unavailable>, lock=<unavailable>) at lock.hh:107
    frame #8: 0x0000000802b14a29 libgeoipbackend.so`GeoIPBackend::lookup(this=0x00000008023d39c0, qtype=0x00007fffffffe3e0, qdomain=0x0000000808abc098, zoneId=-1, pkt_p=0x0000000000000000) at geoipbackend.cc:431
    frame #9: 0x000000000127988b pdns_server`DNSBackend::getSOA(this=0x00000008023d39c0, domain=0x0000000808abc098, sd=0x00007fffffffe4b0) at dnsbackend.cc:245:9
    frame #10: 0x0000000802b165bf libgeoipbackend.so`GeoIPBackend::getAllDomains(this=0x00000008023d39c0, domains=<unavailable>, getSerial=<unavailable>, include_disabled=<unavailable>) at geoipbackend.cc:860:11
    frame #11: 0x000000000147bea2 pdns_server`UeberBackend::updateZoneCache(this=0x00007fffffffe670) at ueberbackend.cc:287:11
    frame #12: 0x000000000125aa0b pdns_server`mainthread() at common_startup.cc:747:11
    frame #13: 0x0000000001423c3c pdns_server`main(argc=<unavailable>, argv=0x00007fffffffeae8) at receiver.cc:677:5
    frame #14: 0x0000000001204ed2 pdns_server`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1.c:76:7
(lldb) detach
Process 59314 detached
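
Frames #2 and #3 show the main thread parked inside lock_shared(). With a std::shared_mutex, that typically means another thread holds, or is queued for, the exclusive (write) side of the lock. The following is a minimal sketch of that blocking condition using only the standard library. The function names are hypothetical and this is not the GeoIP backend's code. Which thread owned the write side in our case is not visible in this backtrace.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <shared_mutex>

// Returns true if a reader could take the shared lock right now.
// The attempt runs on a separate thread, because trying to lock a
// std::shared_mutex from a thread that already owns it is undefined
// behaviour.
bool reader_can_enter(std::shared_mutex& m) {
    return std::async(std::launch::async, [&m] {
        std::shared_lock<std::shared_mutex> reader(m, std::try_to_lock);
        return reader.owns_lock();
    }).get();
}

// Hypothetical stand-in for a reload: holds the exclusive lock while
// checking whether a lookup-style reader could get in.
bool reader_can_enter_during_write(std::shared_mutex& m) {
    std::unique_lock<std::shared_mutex> writer(m);
    return reader_can_enter(m);
}
```

Calling reader_can_enter_during_write on a fresh std::shared_mutex returns false: the reader cannot get in while the writer holds the lock. A blocking lock_shared() in the same situation would wait, which is the state shown in the backtrace above.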

After detaching, we see this in /var/log/messages:

Apr 13 16:08:00 geons14 pdns[59314]: 5001 questions waiting for database/backend attention. Limit is 5000, respawning

mind04 commented 2 years ago

Disabling the zone cache is most likely a workaround for your issue.

Related #11416

lvasiliev commented 2 years ago

Thanks! We have set the option zone-cache-refresh-interval=0 and are testing again.
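
For anyone following along, a sketch of the change as it would appear in pdns.conf. The file's location depends on the install and is not shown here, and the comment line is only an annotation:

```ini
# Disable the zone cache; a value of 0 turns it off.
zone-cache-refresh-interval=0
```

The server needs to pick up the new configuration before the setting takes effect.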

esacs2004 commented 2 years ago

When I use zone-cache-refresh-interval=0 I'm suddenly unable to query multiple records at all. Debug logs are not helpful. Does anyone know why PDNS would not properly respond to queries after a reload?

Habbie commented 8 months ago

> Thanks! We have set the option zone-cache-refresh-interval=0 and are testing again.

Given the age of this ticket, I'm going to assume this resolved it, and I will close the ticket.

> When I use zone-cache-refresh-interval=0 I'm suddenly unable to query multiple records at all. Debug logs are not helpful. Does anyone know why PDNS would not properly respond to queries after a reload?

If you still have this problem, please open a thread in Discussions, thanks!