pi-hole / FTL

The Pi-hole FTL engine
https://pi-hole.net

Dynamic-Host on dnsmasq is broken #1531

Closed eSoares closed 1 year ago

eSoares commented 1 year ago

Versions

The latest working version was:

Versions after that one (in particular the one currently tagged "latest" in Docker) have this issue.

Platform

Expected behavior

When a custom dynamic-host is configured in pihole-config-folder/dnsmasq/99-dynamic-hosts.conf, queries should be answered with the IP configured there. Example config:

dynamic-host=mydomain.com,100.64.0.1,tailscale0
dynamic-host=mydomain.com,192.168.1.1,eth0

This should cause any query for mydomain.com arriving on tailscale0 to be answered with 100.64.0.1. That is the behavior in the version mentioned above. This was tested from an external device connected via the tailscale0 interface, which made the request:

➜  ~ nslookup domain.com 100.64.0.1
Server:     100.64.0.1
Address:    100.64.0.1#53

Name:   domain.com
Address: 100.64.0.1

Actual behavior / bug

In the latest version, the query for the configured dynamic-host is answered with:

➜  ~ nslookup domain.com 100.64.0.1
Server:     100.64.0.1
Address:    100.64.0.1#53

Name:   domain.com
Address: 192.168.1.1

PromoFaux commented 1 year ago

I'm not intimate enough with FTL/dnsmasq's inner workings to know what has changed here, but just for information - if you are saying this last worked in 2022.10, then the change would have happened from 2022.11 onwards.

@DL6ER - that included https://github.com/pi-hole/FTL/pull/1469, not sure if that is a helpful place to start looking

DL6ER commented 1 year ago

Here are my current results on the most recent FTL release (which should be broken):

Configuration in /etc/dnsmasq.d/001-test.conf

dynamic-host=mydomain.com,127.0.1.123,lo
dynamic-host=mydomain.com,192.168.2.123,enp2s0
dynamic-host=mydomain.com,192.168.4.123,wg0

With this, I get locally on lo (which is 127.0.0.0/8):

pi-hole $ dig mydomain.com @127.0.0.1 +short
127.0.1.123

With this, I get locally on enp2s0 (which is 192.168.2.0/24):

pi-hole $ dig mydomain.com @192.168.2.11 +short
192.168.2.123

With this, I get remotely (from a different country, even) on wg0 (which is 192.168.4.0/24):

laptop $ dig mydomain.com @192.168.4.1 +short
192.168.4.123

This looks perfectly fine to me, so this might rather be a Docker-related networking issue (kicking the ball kind of back to @PromoFaux). I'm not currently in a position to test Docker-based Pi-hole, but let me ask if you have any logs (thinking about /var/log/pihole/pihole.log in particular) for us. Ideal would be a log from the old version (where it is working) and one from a recent version (where it does not work), but even recent logs alone would already be very helpful. I'm still on a business trip so my debugging capabilities are severely limited; however, my last test above (querying remotely through my Wireguard tunnel) should actually come pretty close to your test scenario.

Two final remarks:

eSoares commented 1 year ago

Thanks for the fast response!

  1. Yes, I have localise-queries enabled in 01-pihole.conf.
  2. The issue is indeed that I get the warning no addresses found for interface ...! The interface in question has the following output from ip a s:
    4: tailscale0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1280 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 100.64.0..../32 scope global tailscale0
       valid_lft forever preferred_lft forever
    inet6 fd7a:....:...::3/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::....:....:...:..../64 scope link stable-privacy
       valid_lft forever preferred_lft forever

The Pi-hole container is running in host network mode. How can I solve this issue?

Edit: The top of my /var/log/pihole/pihole.log in the latest version:

Feb 14 15:13:05 dnsmasq[262]: started, version pi-hole-v2.89 cachesize 10000
Feb 14 15:13:05 dnsmasq[262]: compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n IDN DHCP DHCPv6 Lua TFTP no-conntrack ipset no-nftset auth cryptohash DNSSEC loop-detect inotify dumpfile
Feb 14 15:13:05 dnsmasq[262]: warning: no addresses found for interface tailscale0
Feb 14 15:13:05 dnsmasq[262]: warning: no addresses found for interface tailscale0
Feb 14 15:13:05 dnsmasq[262]: warning: no addresses found for interface tailscale0
Feb 14 15:13:05 dnsmasq[262]: warning: no addresses found for interface tailscale0
Feb 14 15:13:05 dnsmasq[262]: warning: no addresses found for interface tailscale0
Feb 14 15:13:05 dnsmasq[262]: warning: no addresses found for interface tailscale0
Feb 14 15:13:05 dnsmasq-dhcp[262]: DHCP, IP range 192.168.1.2 -- 192.168.1.250, lease time 1d
Feb 14 15:13:05 dnsmasq[262]: using nameserver 8.8.8.8#53
Feb 14 15:13:05 dnsmasq[262]: using nameserver 8.8.4.4#53
Feb 14 15:13:05 dnsmasq[262]: using nameserver 208.67.222.222#53
Feb 14 15:13:05 dnsmasq[262]: using nameserver 208.67.220.220#53
Feb 14 15:13:05 dnsmasq[262]: using nameserver 1.1.1.1#53
Feb 14 15:13:05 dnsmasq[262]: using nameserver 1.0.0.1#53
Feb 14 15:13:05 dnsmasq[262]: using only locally-known addresses for onion
Feb 14 15:13:05 dnsmasq[262]: using only locally-known addresses for bind
Feb 14 15:13:05 dnsmasq[262]: using only locally-known addresses for invalid
Feb 14 15:13:05 dnsmasq[262]: using only locally-known addresses for localhost
Feb 14 15:13:05 dnsmasq[262]: using only locally-known addresses for test
Feb 14 15:13:05 dnsmasq[262]: read /etc/hosts - 7 names
Feb 14 15:13:05 dnsmasq[262]: read /etc/pihole/custom.list - 3 names
Feb 14 15:13:05 dnsmasq[262]: read /etc/pihole/local.list - 0 names

DL6ER commented 1 year ago

The issue is the interface configuration: Mind the /32 in

inet 100.64.0..../32

saying: This network contains exactly one address.

Compare to the other interfaces on your system, e.g. 127.0.0.1/8 (saying the network is 127.0.0.1 - 127.255.255.255) or 192.168.0.1/24 (which is 192.168.0.1 - 192.168.0.255).
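To make the prefix lengths concrete, here is a small Python sketch using the standard ipaddress module. The /32 address is only a placeholder (the real tailscale0 address is elided above) and the helper name is made up for illustration:

```python
import ipaddress

def network_size(interface_cidr: str) -> int:
    """Return how many addresses the interface's network contains."""
    return ipaddress.ip_interface(interface_cidr).network.num_addresses

# 100.64.0.1/32 is a placeholder, not the reporter's actual address.
for cidr in ("100.64.0.1/32", "127.0.0.1/8", "192.168.0.1/24"):
    network = ipaddress.ip_interface(cidr).network
    print(f"{cidr}: network {network}, {network_size(cidr)} address(es)")
```

A /32 network has no room for any address other than the interface's own, which is what separates it from the /8 and /24 cases.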

This was indeed changed four months ago, and FTL v5.19 is the first release affected by it. The prior behavior (accepting networks which contain only a single address) was fixed by Simon Kelley, the original creator and maintainer of the dnsmasq server we embed into Pi-hole.

How can I solve this issue?

The interface configuration of your tailscale0 interface is incorrect. The bug was that dnsmasq previously handled "empty" networks incorrectly.

eSoares commented 1 year ago

Indeed, it makes sense to skip interfaces that say this device is the only one in the network, with nothing else behind that interface.

Still, in the example that I have, the IPv6 address fe80.... is a /64, shouldn't dnsmasq still listen on that address range at least?

DL6ER commented 1 year ago

in the example that I have, the IPv6 address fe80.... is a /64, shouldn't dnsmasq still listen on that address range at least?

Why? Your config specifies only an IPv4 address (= A RR), why should dnsmasq care about the IPv6 subnet in this case?

eSoares commented 1 year ago

Found the cause of the issue! https://github.com/tailscale/tailscale/issues/7340#issuecomment-1440840917

From the tailscale side, they indicate dnsmasq is in the wrong, since a single-host address should still be expected to route traffic to other hosts.

Can this dnsmasq behavior be reverted, or can an option (some flag) be added to disable it?

dschaper commented 1 year ago

That would be addressed with Simon et al at https://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss

DL6ER commented 1 year ago

From the tailscale side, they indicate dnsmasq is in the wrong, since a single-host address should still be expected to route traffic to other hosts.

They say that tailscale does not configure a "traditional" network with well-defined subnet(s) but instead is a peer-to-peer network with arbitrary peers solely defined by routes. For the operating system, the network interface is on its own (say at 192.168.46.33) but it may define routes for completely arbitrary networks (say "10.34.43.12 is reachable through 192.168.46.33"). I disagree with their statement "it's common and normal to assign singleton IP addresses to interfaces", as the contrary is true in my long experience with both open source (Wireguard, OpenVPN, strongSwan) and all the enterprise ($$$) solutions I know from businesses. But our horizons may simply not overlap and they may be referring to other products that do this.

They say dnsmasq is wrong but this isn't the case. Looking again at the man page:

Add A, AAAA and PTR records to the DNS in the same subnet as the specified interface. The address is derived from the network part of each address associated with the interface, and the host part from the specified address.

we can clearly see that dnsmasq is behaving exactly the way it should. This interface simply doesn't have any real subnet. While the way tailscale sets up their interfaces may be functionally okay, it is just incompatible with this option. Why do you even need it and cannot resort to using "normal" host-records?
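As a rough illustration of that man page sentence (not dnsmasq's actual code), the derivation can be sketched in Python for IPv4. The function name and the 100.64.0.5/32 interface address are assumptions for the example; the other values are taken from the test configuration earlier in this thread:

```python
import ipaddress

def derive_dynamic_host(interface_cidr: str, configured: str) -> str:
    """Combine the network part of the interface address with the
    host part of the configured address (IPv4 only in this sketch)."""
    iface = ipaddress.IPv4Interface(interface_cidr)
    host = ipaddress.IPv4Address(configured)
    netmask = int(iface.network.netmask)
    hostmask = int(iface.network.hostmask)
    derived = (int(iface.ip) & netmask) | (int(host) & hostmask)
    return str(ipaddress.IPv4Address(derived))

# enp2s0 from the test setup above: 192.168.2.11 in 192.168.2.0/24
print(derive_dynamic_host("192.168.2.11/24", "192.168.2.123"))
# Hypothetical /32 interface address: the netmask covers every bit
print(derive_dynamic_host("100.64.0.5/32", "100.64.0.1"))
```

In this sketch the /24 case keeps the configured host part (.123), while in the /32 case the network part is the entire interface address, so nothing of the configured host part survives.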

eSoares commented 1 year ago

we can clearly see that dnsmasq is behaving exactly the way it should.

It is behaving the way it is documented, not necessarily the way it should. If a DNS query comes from a given interface, the subnet of that interface should not be dnsmasq's concern (from an OSI layering perspective).

Why do you even need it and cannot resort to using "normal" host-records?

Is there any way to define host-records that resolve to a different IP depending on the interface where the query is received? From what I could see, that was the reason for the dynamic-host option.

DL6ER commented 1 year ago

Is there any way to define host-records that resolve to a different IP depending on the interface where the query is received?

Pi-hole localizes replies by default (dnsmasq option localise-queries).

Assume you have four devices on your Pi-hole:

and you specify in /etc/hosts on your Pi-hole (or local DNS records or anywhere else):

127.0.0.123 mydevice
192.168.0.45 mydevice
10.145.2.1 mydevice

Then your Pi-hole will always respond with the most appropriate address whenever this is possible and dig mydevice will return

I'm aware that - while this is the answer to your particular question - it is still not exactly what you want as your/their interpretation of "depending on the interface a query is received" is not like the generally accepted "within the interface's subnet" but rather "any address the interface has a route to". Don't get me wrong, I can see a justification for the latter but it is just not how things typically work and a lot harder to code. Take, for instance, iptables as a popular networking software. It also works based on subnets not by inspecting the routes attached to an interface.


I do see that doing this not by looking at the subnet but at all the routes attached to an interface could be equivalent, but it is much more effort to code it like this and a lot more work at run-time as "does this match any route defined for any interface" is more than "does it fit the subnet bitmask for this interface". Routes either have to be queried at run-time and then updated frequently as peers come and go or we risk missing peers.
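A minimal sketch of the subnet-based selection described above, assuming the behavior documented for localise-queries (return only the addresses on the receiving interface's subnet if there are any, otherwise all of them). The function name and the interface addresses are hypothetical; the records are the /etc/hosts entries from the example:

```python
import ipaddress

def localise(candidates: list[str], interface_cidr: str) -> list[str]:
    """Pick the candidate addresses that fall inside the subnet of the
    interface the query arrived on; fall back to all candidates."""
    network = ipaddress.ip_interface(interface_cidr).network
    matches = [a for a in candidates if ipaddress.ip_address(a) in network]
    return matches or list(candidates)

records = ["127.0.0.123", "192.168.0.45", "10.145.2.1"]

# Hypothetical interface addresses, chosen only for illustration.
print(localise(records, "192.168.0.1/24"))  # one subnet match
print(localise(records, "100.64.0.5/32"))   # no match in a /32 network
```

The check is a single bitmask comparison per candidate. A route-based variant would additionally have to look up and keep refreshing the routing table, which is the extra effort mentioned above.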

TL;DR: Sorry for the long text but I want to make it very clear why I say dnsmasq behaves exactly as documented and - compared to other networking software out there - also how it should instead of bluntly saying things like: "I am right, you are wrong!". What you are requesting is an altogether new feature as dnsmasq simply has no way right now to know that IP 152.1.55.45 is reachable via an interface with 10.155.1.2/32 (even when a route exists that says so).

DL6ER commented 1 year ago

There is actually a discussion about this going on on the dnsmasq mailing list right now. You may want to chime in and present your view, too.

DL6ER commented 1 year ago

A change has been made upstream in the dnsmasq project to prevent /32 subnets from being ignored. I understand Simon Kelley's reasoning more in terms of "because it broke earlier behavior" rather than "because this is how it should be" but this doesn't count too much in the end.

You can change to the bleeding-edge dnsmasq branch to see if this fixes the issue also on your Pi-hole using

sudo pihole checkout ftl update/dnsmasq

Please make sure to go back to master using

sudo pihole checkout ftl master

after the next release to ensure you are back in sync with the releases. You could also stay on the branch, but please be aware that things may break here as it follows dnsmasq development closely, so regressions may only be detected later during the testing phase. However, development is typically stable and the quality is typically very high, so the risk of something breaking is low in reality.

If you are running Pi-hole in a docker container, you need to run something like

docker container exec -it <container_name> pihole checkout ftl update/dnsmasq

instead.

DL6ER commented 1 year ago

The next version of FTL has been released. Please update and run

pihole checkout master

to get back on-track. The fix/feature branch you switched to will not receive any further updates.

Thanks for helping us to make Pi-hole better for us all!

If you have any issues, please either reopen this ticket or (preferably) create a new ticket describing the issues in further detail and only reference this ticket. This will help us to help you best.