Open appliedprivacy opened 5 years ago
Thanks for reporting. Did you compile getdns/stubby yourself? Would it be possible for you to compile latest (so version 1.5.2) of getdns/stubby with debugging symbols?
Did you compile getdns/stubby yourself?
stubby has been installed using apt https://packages.debian.org/buster/stubby
Would it be possible for you to compile latest (so version 1.5.2) of getdns/stubby with debugging symbols?
Do you sign releases? I didn't find *.asc files. https://github.com/getdnsapi/stubby/releases
Releases are announced and published here: https://getdnsapi.net/releases/getdns-1-5-2/
I also synchronize this with the GitHub releases here: https://github.com/getdnsapi/getdns/releases
The .asc file is below the star button ;)
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7d5f980 in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
(gdb) backtrace
#0 0x00007ffff7d5f980 in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#1 0x00007ffff7d7ed56 in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#2 0x00007ffff7d4a6cc in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#3 0x00007ffff7d4b66b in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#4 0x00007ffff7d4b89a in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#5 0x00007ffff7d4beb2 in getdns_general () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#6 0x0000555555557fce in incoming_request_handler (context=0x555555560260, callback_type=GETDNS_CALLBACK_COMPLETE, request=0x5555555c94d0, userarg=0x0, request_id=93824992630912)
at ./../stubby/src/stubby.c:684
#7 0x00007ffff7d54c7c in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#8 0x00007ffff7d70c83 in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#9 0x00007ffff7d70e1d in ?? () from /usr/lib/x86_64-linux-gnu/libgetdns.so.10
#10 0x0000555555558e6b in main (argc=4, argv=0x7fffffffe568) at ./../stubby/src/stubby.c:1021
Hmmm... it doesn't look like getdns was compiled with debugging symbols. Did you configure it like this?
$ ./configure CFLAGS=-g --with-stubby
Also, could you run
$ ls -l /usr/lib/x86_64-linux-gnu/libgetdns.so.10
and post the output? For getdns-1.5.2, it should point to libgetdns.so.10.1.2.
Sorry, that was my fault (LD_LIBRARY_PATH was not set to include the correct folder).
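For anyone reproducing this, the two checks discussed here (where the libgetdns symlink points, and which libgetdns the stubby binary actually loads) can be sketched as below. This is only a sketch: `lib_target` is a hypothetical helper, and `/usr/local/lib` is an assumed install location for a self-built getdns, not something stated in the thread.

```shell
#!/bin/sh
# Sketch: show which file the libgetdns symlink and the stubby binary resolve to.
lib_target() {
    # readlink -f follows the whole symlink chain to the final file
    readlink -f "$1"
}

# Path taken from the thread (Debian package location).
LIB=${LIB:-/usr/lib/x86_64-linux-gnu/libgetdns.so.10}
if [ -e "$LIB" ]; then
    echo "$LIB -> $(lib_target "$LIB")"
fi

# Assumption: a self-built getdns was installed under /usr/local/lib;
# adjust BUILD_LIB to the actual install location.
BUILD_LIB=${BUILD_LIB:-/usr/local/lib}
if command -v stubby >/dev/null 2>&1 && command -v ldd >/dev/null 2>&1; then
    # With LD_LIBRARY_PATH set, the loader searches that directory
    # before the system default library directories.
    LD_LIBRARY_PATH="$BUILD_LIB" ldd "$(command -v stubby)" | grep getdns || true
fi
```

If the `ldd` line still shows the packaged library under /usr/lib, gdb will load the build without debugging symbols, which matches the first backtrace above.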
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7d7a393 in _getdns_rbtree_first (rbtree=0x555555560bd8) at ./util/rbtree.c:553
553 for (node = rbtree->root; node->left != RBTREE_NULL; node = node->left);
(gdb) backtrace
#0 0x00007ffff7d7a393 in _getdns_rbtree_first (rbtree=0x555555560bd8) at ./util/rbtree.c:553
#1 0x00007ffff7d3e4ad in _getdns_netreq_change_state (netreq=0x5555555d8e40, new_state=NET_REQ_FINISHED) at ./general.c:380
#2 0x00007ffff7d4efa7 in upstream_read_cb (userarg=0x55555558d358) at ./stub.c:1561
#3 0x00007ffff7d6c3d3 in poll_read_cb (fd=5, event=0x55555558d438) at ./extension/poll_eventloop.c:295
#4 0x00007ffff7d6cb66 in poll_eventloop_run_once (loop=0x555555560c60, blocking=1) at ./extension/poll_eventloop.c:445
#5 0x00007ffff7d6ce80 in poll_eventloop_run (loop=0x555555560c60) at ./extension/poll_eventloop.c:499
#6 0x00007ffff7d5f758 in getdns_context_run (context=0x555555560260) at ./context.c:3742
#7 0x0000555555558e6b in main (argc=4, argv=0x7fffffffe528) at ./../stubby/src/stubby.c:1021
Thanks. Does it crash at the same point every time?
I didn't collect many backtraces, but here is another:
Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) backtrace
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7b79535 in __GI_abort () at abort.c:79
#2 0x00007ffff7bd0508 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7cdb28d "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3 0x00007ffff7bd6c1a in malloc_printerr (str=str@entry=0x7ffff7cdd018 "double free or corruption (!prev)") at malloc.c:5341
#4 0x00007ffff7bd873c in _int_free (av=0x7ffff7d12c40 <main_arena>, p=0x5555555c02c0, have_lock=<optimized out>) at malloc.c:4309
#5 0x00007ffff7d4370b in _getdns_dns_req_free (req=0x5555555c02d0) at ./request-internal.c:648
#6 0x00007ffff7d5e9d8 in _getdns_context_cancel_request (dnsreq=0x5555555c02d0) at ./context.c:3255
#7 0x00007ffff7d5eb76 in _getdns_context_request_timed_out (dnsreq=0x5555555c02d0) at ./context.c:3305
#8 0x00007ffff7d3d8fc in _getdns_check_dns_req_complete (dns_req=0x5555555c02d0) at ./general.c:113
#9 0x00007ffff7d3e25f in _getdns_check_expired_pending_netreqs (context=0x555555560260, now_ms=0x7fffffffdea8) at ./general.c:324
#10 0x00007ffff7d5a920 in _getdns_check_expired_pending_netreqs_cb (arg=0x555555560260) at ./context.c:1305
#11 0x00007ffff7d6c444 in poll_timeout_cb (event=0x555555560bf8) at ./extension/poll_eventloop.c:314
#12 0x00007ffff7d6ce28 in poll_eventloop_run_once (loop=0x555555560c60, blocking=1) at ./extension/poll_eventloop.c:485
#13 0x00007ffff7d6ce80 in poll_eventloop_run (loop=0x555555560c60) at ./extension/poll_eventloop.c:499
#14 0x00007ffff7d5f758 in getdns_context_run (context=0x555555560260) at ./context.c:3742
#15 0x0000555555558e6b in main (argc=4, argv=0x7fffffffe528) at ./../stubby/src/stubby.c:1021
Ok thanks... stubby is talking to unbound-1.9.2?
yes, stubby is talking to unbound 1.9.2 (on FreeBSD)
Could you also try disabling DNSSEC validation within stubby?
It did run for 40 minutes without a crash with DNSSEC validation disabled. It crashed within a minute after enabling DNSSEC validation again.
OK, so... since you control and also authenticate the upstream DoT-speaking unbound, perhaps for the moment it is okay to let that unbound do the DNSSEC validation and let stubby just trust the answers, while I try to reproduce and solve this issue.
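A quick way to confirm that validation is really switched off in stubby is to look for uncommented DNSSEC settings in its configuration. The sketch below assumes the Debian default path /etc/stubby/stubby.yml and that validation was enabled through the getdns extension keys `dnssec` or `dnssec_return_status`; the reporter's actual configuration is not shown in this thread, and `active_dnssec` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: list DNSSEC validation settings that are active (not commented out).
active_dnssec() {
    # Matches top-level keys such as "dnssec:" or "dnssec_return_status:".
    # Commented lines start with "#" and therefore do not match;
    # "dnssec_trust_anchors:" is deliberately not matched.
    grep -E '^dnssec(_return[a-z_]*)?:' "$1"
}

# Assumption: Debian package default location of the stubby config.
CONF=${CONF:-/etc/stubby/stubby.yml}
if [ -r "$CONF" ] && active_dnssec "$CONF"; then
    echo "DNSSEC validation settings are active in $CONF"
else
    echo "no active dnssec setting found in (or cannot read) $CONF"
fi
```

After commenting out any line it reports, stubby needs a restart for the change to take effect.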
Is there any update on this issue?
What approx. amount of funding would you require to provide a fix for this issue and release a new version including the fix?
Sorry for not responding earlier. This code path is also subject to change with a planned refactor of upstream scheduling (adding DoH upstreams to the mix). I'm also willing to take another look to see if I can find a quick workaround of some sort, but that will only be possible after the 19th of August.
Thanks for the heads-up. We would still be interested in the required funding, since we are applying for funding ourselves and it would help us make more accurate estimates for future fixes like this.
Maybe we can discuss coordination of fixes (and possible funding) in person? Are you going to IETF 107?
Setup
Short intro about our setup:
We run a public DoH service and use dnsdist. Today we added stubby to our chain, since dnsdist does not yet support DoT for communicating with backends.
nginx -> dnsdist -> stubby -> unbound
(unbound does not run on the same server, so we encrypt the connection using stubby/DoT)
Stubby runs on 127.0.0.1:5353 and dnsdist detected frequent issues:
By default stubby apparently does not log anything, so we edited the systemd service file to start stubby with
-v 2
to get some logging:
Log events showing crashes
They do not necessarily stem from a single bug; there may be multiple distinct bugs.
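The logging change mentioned under Setup (starting stubby with -v 2) can also be made with a systemd drop-in instead of editing the packaged unit file, which is a different technique from the one we used. A minimal sketch, assuming the unit is named stubby.service and that the packaged command is plain /usr/bin/stubby; check the real ExecStart with `systemctl cat stubby` and keep any options it carries. `write_dropin` and the file name verbose.conf are made up for this example.

```shell
#!/bin/sh
# Sketch: build a systemd drop-in that starts stubby with "-v 2".
write_dropin() {
    # $1 = directory to write the drop-in file into
    mkdir -p "$1" || return 1
    cat > "$1/verbose.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the packaged command before a new one is set.
ExecStart=
ExecStart=/usr/bin/stubby -v 2
EOF
}

# Write to a scratch directory first so the result can be reviewed.
OUT=${OUT:-$(mktemp -d)/stubby.service.d}
write_dropin "$OUT" && cat "$OUT/verbose.conf"

# To apply it (as root), copy the directory to /etc/systemd/system/, then:
#   systemctl daemon-reload && systemctl restart stubby
#   journalctl -u stubby -f
```

A drop-in survives package upgrades, whereas an edited copy of the packaged unit file may be overwritten.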
gdb backtrace
There are no debug symbols in the installed version of stubby, but maybe it is useful nonetheless:
OS: Debian Buster
Configuration