Closed dryya closed 1 year ago
There seems to be a loop in the service file: Wants= pulls in the same targets that the unit is ordered against, both network-online.target and nss-lookup.target. Perhaps the sensible approach is to spell out the intended ordering: unbound starts once network.target is reached, and its startup completes before network-online.target is reached. It is also ordered before nss-lookup.target, so that unbound is up before anything relying on nss-lookup starts making queries.
This depends somewhat on the meaning of the targets and on the rest of the systemd setup. Perhaps this change could be good?
diff --git a/contrib/unbound.service.in b/contrib/unbound.service.in
index ada5fac9..5a05c525 100644
--- a/contrib/unbound.service.in
+++ b/contrib/unbound.service.in
@@ -42,9 +42,8 @@
[Unit]
Description=Validating, recursive, and caching DNS resolver
Documentation=man:unbound(8)
-After=network-online.target
-Before=nss-lookup.target
-Wants=network-online.target nss-lookup.target
+After=network.target
+Before=network-online.target nss-lookup.target
[Install]
WantedBy=multi-user.target
I can confirm that this works for me on two machines (one using systemd-networkd and one with no network manager, just iwd): unbound is up and running in three seconds! (I attempted something similar on my own, but I realize now it failed because the standard systemctl edit command won't remove previous Before entries; it adds to them instead.) Thanks for the quick response!
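The systemctl edit behaviour mentioned above matches systemd.unit(5): dependency settings such as Before=, After= and Wants= can only be added to in a drop-in and cannot be reset to an empty list. A minimal sketch of such a drop-in, assuming the default path that systemctl edit unbound.service creates:

```ini
# /etc/systemd/system/unbound.service.d/override.conf
# (assumed default drop-in path created by `systemctl edit unbound.service`)
[Unit]
# Merged with the packaged unit: the effective list is the packaged
# Before= entries plus these ones, not a replacement of them.
Before=network-online.target nss-lookup.target
```

To drop an existing entry, the whole unit has to be overridden instead, for example with systemctl edit --full unbound.service, which places a complete copy of the unit in /etc/systemd/system/.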
That fixed it for me as well!
The fix is committed to the repo. That should improve the systemd integration scripts for Unbound!
This may be related to this commit: when I reboot the server, the unbound service fails because the IPv6 network has not come up yet.
unbound[364]: [1673554420] unbound[364:0] error: can't bind socket: Cannot assign requested address for 2001:db8:0:2::2 port 53
unbound[364]: [1673554420] unbound[364:0] fatal error: could not open ports
systemd[1]: unbound.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: unbound.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Validating, recursive, and caching DNS resolver.
I need to restart the service to bring it up: systemctl restart unbound.service
/etc/systemd/network/ens18.network
[Match]
Name=ens18

[Address]
Address=192.168.0.2/24

[Address]
Address=2001:db8:0:2::2/64

[Network]
Gateway=192.168.0.1
Gateway=2001:db8:0:2::1
DHCP=no
ConfigureWithoutCarrier=Yes
Hi @wcawijngaards,
I've encountered an issue where the Unbound service fails to restart on boot, which may be related to the issue you've addressed.
TL;DR: After=network.target doesn't guarantee that interfaces are ready when Unbound attempts to bind to them. Changing the configuration to After=network-online.target appears to be the correct fix.
Details: I have a custom dummy interface with IP 10.1.1.1, and Unbound cannot bind to it during boot time because the interface isn't ready yet. I fixed this issue by modifying the unit file (I'm using Unbound 1.rocky8 and Unbound 1.16.2) to this:
[Unit]
Description=Unbound recursive Domain Name Server
After=network.target
# This is the line I added
After=network-online.target
Before=nss-lookup.target
Before I changed the unit file (After=network.target), unbound could not start at boot time:
Jul 08 15:07:19 sre-pdns-primary systemd[1]: Starting Unbound recursive Domain Name Server...
Jul 08 15:07:19 sre-pdns-primary unbound-checkconf[844]: unbound-checkconf: no errors in /etc/unbound/unbound.conf
Jul 08 15:07:19 sre-pdns-primary systemd[1]: Started Unbound recursive Domain Name Server.
Jul 08 15:07:19 sre-pdns-primary unbound[855]: [1720415239] unbound[855:0] error: can't bind socket: Cannot assign requested address for 10.1.1.1 port 53
Jul 08 15:07:19 sre-pdns-primary unbound[855]: [1720415239] unbound[855:0] fatal error: could not open ports
Jul 08 15:07:19 sre-pdns-primary systemd[1]: unbound.service: Main process exited, code=exited, status=1/FAILURE
Jul 08 15:07:19 sre-pdns-primary systemd[1]: unbound.service: Failed with result 'exit-code'.
After I changed the unit file (After=network-online.target):
Jul 08 15:10:11 sre-pdns-primary systemd[1]: Starting Unbound recursive Domain Name Server...
Jul 08 15:10:11 sre-pdns-primary unbound-checkconf[2484]: unbound-checkconf: no errors in /etc/unbound/unbound.conf
Jul 08 15:10:11 sre-pdns-primary systemd[1]: Started Unbound recursive Domain Name Server.
Jul 08 15:10:11 sre-pdns-primary unbound[2489]: [1720415411] unbound[2489:0] debug: chdir to /etc/unbound
Jul 08 15:10:11 sre-pdns-primary unbound[2489]: [1720415411] unbound[2489:0] debug: drop user privileges, run as unbound
Jul 08 15:10:11 sre-pdns-primary unbound[2489]: [1720415411] unbound[2489:0] debug: switching log to /var/log/unbound/unbound.log
According to RHEL's documentation, network.target means that the service for setting up the network has started but doesn't guarantee that it's ready. In contrast, network-online.target is only reached after the network is connected, which seems to be the appropriate option for this use case.
In most cases, the current setting works because the interfaces come up before Unbound tries to bind to them. However, if an interface is slow to come up, Unbound fails to start at boot time. Many users modify their own systemd unit file to fix this (it's more likely to happen with custom interfaces). Changing After=network.target to After=network-online.target may address the root cause of this issue.
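Because dependencies can be added in a drop-in, the change proposed above can be tried without editing the packaged unit. A minimal sketch, assuming the default drop-in path; the Wants= line is not part of the comment above and is added here because systemd's documentation recommends pulling in network-online.target as well as ordering after it:

```ini
# /etc/systemd/system/unbound.service.d/network-online.conf
# (hypothetical file name; any *.conf in this directory is read)
[Unit]
# Start unbound only after the network is reported as online.
After=network-online.target
# Assumption: pull the target in explicitly so the ordering has an effect
# even if nothing else requests network-online.target.
Wants=network-online.target
```

After adding the file, run systemctl daemon-reload. On setups like the one in the original bug report, where reaching network-online.target itself waits on DNS, this ordering was associated with a boot delay, so it may not suit every system.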
I'm not facing the problem for IPv4. For IPv6, the root cause seems to be DAD (Duplicate Address Detection): the address is not usable until DAD completes. A workaround is to disable it with: net.ipv6.conf.xxx.accept_dad = 0
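For systemd-networkd setups such as the ens18.network file posted earlier in the thread, a different mechanism from the sysctl above may achieve a similar effect for a single address. This is a sketch only, assuming a systemd-networkd version whose [Address] section accepts DuplicateAddressDetection=none:

```ini
# /etc/systemd/network/ens18.network (excerpt, based on the file posted earlier)
[Address]
Address=2001:db8:0:2::2/64
# Assumption: the installed systemd-networkd accepts "none" here.
# Skips IPv6 Duplicate Address Detection for this address only.
DuplicateAddressDetection=none
```

With DAD disabled, an address conflict on the link is no longer detected, so this is best limited to statically assigned addresses that are known to be unique.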
The commit https://github.com/NLnetLabs/unbound/commit/d43760a8cd7d01f59fd73bf7edbf983903d8a142 adds network-online.target to the contrib/unbound.service.in and contrib/unbound_portable.service.in unit files. By the way, another workaround for avoiding the problem could be to set ip-freebind: yes, which allows using interfaces that are down, or ip-transparent: yes.
Describe the bug
As described in this arch linux bug report, "unbound waits for the network to be on (as stipulated in its service file) and systemd waits for the DNS resolver to be up before declaring that the network is on. The cycle only breaks when systemd network initialization times out and finally the unbound service file is allowed to start." The behavior started to occur with commit afbc7bb4fec5026f6a1a1487e643b94b2ba1d694 . Unbound and the network still work perfectly fine afterwards, it's just that DNS resolution doesn't come up until after the timeout period for systemd's network target.
To reproduce
On arch linux enable the systemd-networkd and unbound systemd services. Systemd-resolved is disabled. I don't believe it's relevant but I included a minimal resolvconf config file too.
Some more information on what's happening via systemd logs:
Output from systemctl status systemd-networkd-wait-online.service:
And you can see via journalctl --boot that unbound only begins afterwards:
System:
OS: Linux arch 6.0.5-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 26 Oct 2022 15:25:45 +0000 x86_64 GNU/Linux
unbound -V output:
Configure line: --prefix=/usr --sysconfdir=/etc --localstatedir=/var --sbindir=/usr/bin --disable-rpath --enable-dnscrypt --enable-dnstap --enable-pie --enable-relro-now --enable-subnet --enable-systemd --enable-tfo-client --enable-tfo-server --enable-cachedb --with-libhiredis --with-conf-file=/etc/unbound/unbound.conf --with-pidfile=/run/unbound.pid --with-rootkey-file=/etc/trusted-key.key --with-libevent --with-libnghttp2 --with-pyunbound
Linked libs: libevent 2.1.12-stable (it uses epoll), OpenSSL 1.1.1q 5 Jul 2022
Linked modules: dns64 cachedb subnetcache respip validator iterator
DNSCrypt feature available
TCP Fastopen feature available
BSD licensed, see LICENSE in source package for details. Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues