openwrt / openwrt

This repository is a mirror of https://git.openwrt.org/openwrt/openwrt.git It is for reference only and is not active for check-ins. We will continue to accept Pull Requests here. They will be merged via staging trees then into openwrt.git.
Other
20.34k stars 10.51k forks source link

FS#4189 - DHCP server detection in dnsmasq init script is unreliable #9169

Open openwrt-bot opened 2 years ago

openwrt-bot commented 2 years ago

Habbie:

Supply the following if possible:

============ Context

dnsmasq.init, as shipped in OpenWrt (any version I can find including master), tries to detect active DHCP servers on the network before configuring dnsmasq as a DHCP server.

============ Short

The 3 second udhcpc timeout in dnsmasq.init is too short and is likely to not notice another dnsmasq running on the same LAN.

============ Long This happens with the following line in dnsmasq.init:

    udhcpc -n -q -s /bin/true -t 1 -i "$ifname" >&- && rv=1 || rv=0

with these options, udhcpc gives up after 3 seconds if no response is received.

However, dnsmasq, with default settings (not passing --5/--no-ping), will ICMP ping probe an address before it hands it out. The timeout for that is "two to three seconds" - see "Q: Does the dnsmasq DHCP server probe addresses before allocating them, as recommended in RFC2131?" at https://thekelleys.org.uk/dnsmasq/docs/FAQ . This text also mentions that dnsmasq can only probe one address at a time, and thus only handle one DHCP request at a time. This means that it is very easy for the udhcpc call to falsely timeout because dnsmasq was not responding, or not responding quickly enough.

I noticed this in practice on a network with two dnsmasqs already running. The third one would randomly end up with DHCP enabled or disabled, depending on whether udhcpc timed out, or got a lease in just under 3 seconds.

============ What to do

Increase the udhcpc timeout?

Remove the whole check, as it is unreliable, and thus might surprise a user two weeks from now?

openwrt-bot commented 2 years ago

paulfertser:

Another idea to consider would be using DHCPINFORM packet instead, like dhcping -i does: https://salsa.debian.org/debian/dhcping/-/blob/debian/master/dhcping.c . However, it mentions that "DHCPINFORM packets can only be used on subnets the server is authoritative for. If the monitoring script runs on a subnet the server isn't authoritative for, it should use the DHCPREQUEST packets." I'd still prefer fast DHCPINFORM test to not having any test at all, and it should cover the majority of usecases reliably I guess.