NLnetLabs / unbound

Unbound is a validating, recursive, and caching DNS resolver.
https://nlnetlabs.nl/unbound
BSD 3-Clause "New" or "Revised" License

Unbound 1.20 crashes in less than one hour, in libevent #1068

Open bortzmeyer opened 4 months ago

bortzmeyer commented 4 months ago

Describe the bug

I upgraded a small resolver to Unbound 1.20. Since then, it crashes in less than one hour.

To reproduce

Steps to reproduce the behavior:

  1. Run unbound
  2. Wait

Expected behavior

No crash

System:

Configure line: --prefix=/usr --sysconfdir=/etc --localstatedir=/var --sbindir=/usr/bin --disable-rpath --enable-dnscrypt --enable-dnstap --enable-pie --enable-relro-now --enable-subnet --enable-systemd --enable-tfo-client --enable-tfo-server --enable-cachedb --with-libhiredis --with-conf-file=/etc/unbound/unbound.conf --with-pidfile=/run/unbound.pid --with-rootkey-file=/etc/trusted-key.key --with-libevent --with-libnghttp2 --with-pyunbound
Linked libs: libevent 2.1.12-stable (it uses epoll), OpenSSL 3.3.0 9 Apr 2024
Linked modules: dns64 cachedb subnetcache respip validator iterator
DNSCrypt feature available
TCP Fastopen feature available

BSD licensed, see LICENSE in source package for details. Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues


**Additional information**

Downgrading to 1.19.3 apparently solved the problem. I also tried a 1.20 without `--with-libevent` but it also crashed.

A core dump is produced:

% gdb /usr/sbin/unbound.ORIG unbound-core
GNU gdb (GDB) 14.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/unbound.ORIG...

This GDB supports auto-downloading debuginfo from the following URLs:
https://debuginfod.archlinux.org
Enable debuginfod for this session? (y or [n])
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
(No debugging symbols found in /usr/sbin/unbound.ORIG)
[New LWP 35395]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/usr/bin/unbound -d -p'.
Program terminated with signal SIGSEGV, Segmentation fault.

#0  0x00007404a8942976 in ?? () from /usr/lib/libevent-2.1.so.7
(gdb) where
#0  0x00007404a8942976 in ?? () from /usr/lib/libevent-2.1.so.7
#1  0x00007404a894a344 in event_base_loop () from /usr/lib/libevent-2.1.so.7
#2  0x0000597888a39a06 in ?? ()
#3  0x0000597888967dd2 in ?? ()
#4  0x000059788897eb8e in ?? ()
#5  0x00005978889602ea in ?? ()
#6  0x00007404a8039c88 in ?? () from /usr/lib/libc.so.6
#7  0x00007404a8039d4c in __libc_start_main () from /usr/lib/libc.so.6
#8  0x0000597888960375 in ?? ()
(gdb) quit

bortzmeyer commented 4 months ago

unbound-core.gz unbound.ORIG.gz

gthess commented 4 months ago

Hi Stéphane, could you rebuild with "CFLAGS=-g -O0" added to your configure line and then run make install again? This includes debug symbols in the binary. You can then run Unbound under gdb and, when it crashes, use `bt full` and paste the output here.

I suppose that when it is not built with libevent it still crashes but not in libevent, right?
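The rebuild requested above could be scripted roughly as follows. This is only a sketch: the `run` and `debug_build` helpers are hypothetical names introduced here, the arguments passed to `debug_build` stand in for whatever configure options you already use, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of a debug rebuild. Run from the top of the Unbound source tree.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to run them.

run() {
    # Echo the command, then execute it only when DRY_RUN=0.
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

debug_build() {
    # "$@" stands for your existing configure options (placeholder).
    # CFLAGS is passed as one argument, so no debug info is optimized away.
    run ./configure "CFLAGS=-g -O0" "$@" || return 1
    run make || return 1
    run sudo make install || return 1
}

# Example: debug_build --prefix=/usr --with-libevent
# Afterwards, start Unbound under gdb and collect the backtrace by hand:
#   sudo gdb --args /usr/sbin/unbound -d -p
#   (gdb) run
#   (gdb) bt full        # once gdb reports the SIGSEGV
```

Running under gdb from the start avoids depending on a core file being written at crash time.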

bortzmeyer commented 4 months ago

With these compilation options, either unbound crashes without a core dump:

% sudo /usr/sbin/unbound -d -p
[About one hour of use]
zsh: segmentation fault  sudo /usr/sbin/unbound -d -p

Or it hangs, using 100% CPU without answering queries. I attached gdb to the hung process and created a core file. gdb says:

(gdb) where
#0  0x00007046f51a7b27 in ?? () from /usr/lib/libevent-2.1.so.7
#1  0x00007046f51aa31f in event_base_loop () from /usr/lib/libevent-2.1.so.7
#2  0x0000625b2e517e80 in ub_event_base_dispatch (base=0x625b30207f70) at util/ub_event.c:280
#3  0x0000625b2e4f9970 in comm_base_dispatch (b=0x625b30207eb0) at util/netevent.c:282
#4  0x0000625b2e41d3a1 in worker_work (worker=0x625b301fba40) at daemon/worker.c:2325
#5  0x0000625b2e404624 in daemon_fork (daemon=0x625b301142d0) at daemon/daemon.c:809
#6  0x0000625b2e415506 in run_daemon (cfgfile=0x625b2e52fdd6 "/unbound.conf", cmdline_verbose=0, debug_mode=1, need_pidfile=0) at daemon/unbound.c:731
#7  0x0000625b2e41576b in main (argc=0, argv=0x7ffe31354618) at daemon/unbound.c:837
(gdb) generate-core-file c

unbound.gz unbound-core.gz
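One possible reason for a crash that leaves no core file is the core size limit or the kernel's core pattern. The sketch below is an assumption about the setup, not something confirmed in this thread: it is Linux-specific, and `core_dump_status` is a hypothetical helper name. It prints the two settings so they can be checked before the next run.

```shell
#!/bin/sh
# Sketch: report the settings that decide whether a crash leaves a core file.
# Linux-specific; /proc/sys/kernel/core_pattern may be absent elsewhere.

core_dump_status() {
    # With a plain file pattern, a limit of 0 suppresses the core file.
    printf 'core size limit: %s\n' "$(ulimit -c)"
    if [ -r /proc/sys/kernel/core_pattern ]; then
        # A pattern starting with "|" pipes the core to a helper program
        # instead of writing it to the working directory.
        printf 'core_pattern: %s\n' "$(cat /proc/sys/kernel/core_pattern)"
    fi
}

core_dump_status

# To allow core files in the current shell before starting Unbound:
#   ulimit -c unlimited
# For a process that hangs instead of crashing, attach and dump it by hand
# (<pid> is a placeholder for the running unbound process ID):
#   sudo gdb -p <pid>
#   (gdb) bt full
#   (gdb) generate-core-file
```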

gthess commented 3 months ago

I still can't pinpoint anything at the moment. Could you:

  1. make clean
  2. Also add --enable-fully-static to the configure line
  3. make install