PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0

dnsdist segfault (DoH) #7810

Closed by appliedprivacy 5 years ago

appliedprivacy commented 5 years ago

Short description

We run a public DoH service and switched to dnsdist today. Minutes after we pointed real traffic at it, it crashed and was restarted by systemd.

Environment

Steps to reproduce

We have no reproducer.

Expected behaviour

dnsdist should keep running without crashing.

Actual behaviour

May 12 21:12:08  kernel: dnsdist/doh[11118]: segfault at 7fe0440af5a8 ip 0000560e7c0f1d98 sp 00007fe0405fccd8 error 7 in dnsdist[560e7bdeb000+327000]
May 12 21:12:08  kernel: Code: de 97 cf ff e9 a1 00 d7 ff 48 89 c5 e9 af 00 d7 ff 48 89 c5 e9 b7 00 d7 ff 48 89 c5 e9 bf 00 d7 ff 90 48 8b 07 48 85 c0 74 08 <48> c7 40 58 00 00 00 00 c3 90 66 66 2e 0f 1f 84 00 00 00 00 00 0f
May 12 21:12:08  systemd[1]: dnsdist.service: Main process exited, code=killed, status=11/SEGV
May 12 21:12:08  systemd[1]: dnsdist.service: Failed with result 'signal'.
May 12 21:12:10  systemd[1]: dnsdist.service: Service RestartSec=2s expired, scheduling restart.
May 12 21:12:10  systemd[1]: dnsdist.service: Scheduled restart job, restart counter is at 1.
May 12 21:12:10  systemd[1]: Stopped DNS Loadbalancer.
May 12 21:12:10  systemd[1]: Starting DNS Loadbalancer...
May 12 21:12:10  dnsdist[18854]: Configuration '/etc/dnsdist/dnsdist.conf' OK!
[...]
May 12 21:51:25  kernel: dnsdist/doh[18861]: segfault at 6d6f6303cc ip 0000561d9bb12d98 sp 00007f8a420dbcd8 error 6 in dnsdist[561d9b80c000+327000]
May 12 21:51:25  kernel: Code: de 97 cf ff e9 a1 00 d7 ff 48 89 c5 e9 af 00 d7 ff 48 89 c5 e9 b7 00 d7 ff 48 89 c5 e9 bf 00 d7 ff 90 48 8b 07 48 85 c0 74 08 <48> c7 40 58 00 00 00 00 c3 90 66 66 2e 0f 1f 84 00 00 00 00 00 0f
May 12 21:51:25  systemd[1]: dnsdist.service: Main process exited, code=killed, status=11/SEGV
May 12 21:51:25  systemd[1]: dnsdist.service: Failed with result 'signal'.
May 12 21:51:27  systemd[1]: dnsdist.service: Service RestartSec=2s expired, scheduling restart.
May 12 21:51:27  systemd[1]: dnsdist.service: Scheduled restart job, restart counter is at 2.
May 12 21:51:27  systemd[1]: Stopped DNS Loadbalancer.
May 12 21:51:27  systemd[1]: Starting DNS Loadbalancer...
May 12 21:51:27  dnsdist[31378]: Configuration '/etc/dnsdist/dnsdist.conf' OK!
[...]
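
Both kernel lines above can be compared mechanically. The following is a minimal Python sketch (not part of the original report) that parses the two `segfault at ...` lines, subtracts the mapping base from the instruction pointer, and decodes the low three bits of the x86 page-fault error code. The function name `parse_segfault` and the dictionary keys are made up for illustration, and the computed offset is relative to the start of the mapping the kernel printed, which is not necessarily a file offset into the binary.

```python
import re

# The two kernel lines from the log above (timestamp and host fields dropped).
LOG_LINES = [
    "kernel: dnsdist/doh[11118]: segfault at 7fe0440af5a8 ip 0000560e7c0f1d98 "
    "sp 00007fe0405fccd8 error 7 in dnsdist[560e7bdeb000+327000]",
    "kernel: dnsdist/doh[18861]: segfault at 6d6f6303cc ip 0000561d9bb12d98 "
    "sp 00007f8a420dbcd8 error 6 in dnsdist[561d9b80c000+327000]",
]

# All numeric fields, including the error code, are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing one kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        # x86 page-fault error code, low bits:
        # bit 0 set = protection violation (page present), clear = page not present
        # bit 1 set = write access, clear = read access
        # bit 2 set = fault happened in user mode
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


crashes = [parse_segfault(line) for line in LOG_LINES]

for crash in crashes:
    print(
        f"{crash['object']}+{crash['offset']:#x} "
        f"fault_address={crash['fault_address']:#x} "
        f"write={crash['write']} page_present={crash['page_present']}"
    )
```

For both crashes the offset comes out as 0x306d98, so the two segfaults happened at the same instruction, which matches the identical `Code:` bytes in the log. Both were user-mode writes; the first hit a page that was mapped but not writable (error 7), the second an unmapped address (error 6).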

Other information

config:

newServer({address="127.0.0.1"})
addDOHLocal("37.252.185.229", "/etc/letsencrypt/live/doh.appliedprivacy.net/fullchain.pem", "/etc/letsencrypt/live/doh.appliedprivacy.net/privkey.pem", "/query", {ciphers='ECDHE-RSA-CHACHA20-POLY1305:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'})
addDOHLocal("2a00:63c1:a:229::2", "/etc/letsencrypt/live/doh.appliedprivacy.net/fullchain.pem", "/etc/letsencrypt/live/doh.appliedprivacy.net/privkey.pem", "/query", {ciphers='ECDHE-RSA-CHACHA20-POLY1305:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'})
setACL({'0.0.0.0/0', '::/0'})
controlSocket('127.0.0.1:5199')
setConsoleACL('127.0.0.1/8')
setKey("-removed-")

rgacogne commented 5 years ago

Thanks for reporting this! Would you be able to provide a core dump, a backtrace or perhaps a network capture of the incoming queries right before the crash?

appliedprivacy commented 5 years ago

Thread 5 "dnsdist/doh" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff3abd700 (LWP 8310)]
on_generator_dispose (_self=0x7fffec02c4a0) at doh.cc:275
275     doh.cc: No such file or directory.
(gdb) backtrace
#0  on_generator_dispose (_self=0x7fffec02c4a0) at doh.cc:275
#1  0x00007ffff6ffb94c in h2o_mem_clear_pool () from /usr/lib/x86_64-linux-gnu/libh2o-evloop.so.0.13
#2  0x00007ffff702f14b in h2o_http2_stream_close () from /usr/lib/x86_64-linux-gnu/libh2o-evloop.so.0.13
#3  0x00007ffff702890f in ?? () from /usr/lib/x86_64-linux-gnu/libh2o-evloop.so.0.13
#4  0x00007ffff6ffe5d7 in ?? () from /usr/lib/x86_64-linux-gnu/libh2o-evloop.so.0.13
#5  0x00007ffff6ffe862 in ?? () from /usr/lib/x86_64-linux-gnu/libh2o-evloop.so.0.13
#6  0x00007ffff700087d in h2o_evloop_run () from /usr/lib/x86_64-linux-gnu/libh2o-evloop.so.0.13
#7  0x0000555555956484 in dohThread (cs=0x555555b31540) at /usr/include/c++/8/bits/shared_ptr_base.h:1018
#8  0x00007ffff6c05b2f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007ffff6cddfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#10 0x00007ffff68e34cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

rgacogne commented 5 years ago

Thank you for the backtrace, that was very helpful! We fixed a bug that looks a lot like this one in #7814, and although we can't be 100% sure it is the same bug, it seems quite likely. The PR has been merged and updated packages are being built automatically right now. Would you be able to test the updated version once it is available? Thanks again!

appliedprivacy commented 5 years ago

Using the buster-dnsdist-master repo:

apt install dnsdist
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 dnsdist : Depends: libprotobuf17 but it is not going to be installed
           Depends: libre2-5 (>= 20131024+dfsg) but it is not going to be installed
           Depends: libsnmp30 (>= 5.7.3+dfsg) but it is not going to be installed
           Depends: libstdc++6 (>= 6) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

@zeha pointed out on the dnsdist mailing list that this issue is known and tracked as #7781.