Closed DL6ER closed 5 years ago
This user reports to be affected by an FTL bug we fixed over half a year ago which is caused by using debug
mode in the docker containers. Debug mode is (solely) built to ease debugging and, as such, disables forking of the DNS resolver. This is, however, necessary for TCP sessions, initiated, e.g., by Netflix apps on devices like Playstation as these apps do never close the open session, i.e., conquer the DNS server until the timeout is eventually hit.
This issues is not present when running in normal (read this as anything but not debug mode) as, here, we allow the process to fork in order to spawn independent children who can deal with such unterminated TCP sessions as mandated by the corresponding RFCs.
Chiming in so I can follow here from pi-hole discourse in RE debug mode causing the log to stream to stdout. Also, I can anecdotally note that Netflix and at least a few other apps completely crater the pi-hole causing all kinds of issues. I thought at first there might be some blocking on the log (writing that much data to the SD card can bring the whole raspberrypi to the ground, but after reading the above comment, may be simply DNS being overwhelmed to the point of timeouts.
One other observation - it appears in this version that some DNS packets are simply getting dropped or never replied to. I thought this was related to my Cloudflared but this might also be a simple coincidence due to the above issue. Anyways, just a few additional data points!
may be simply DNS being overwhelmed to the point of timeouts
No, it is not an overwhelming, but it is the problem that an app can really conquer the DNS server for itself. This occupation does automatically drop after a given timeout, but this is a fallback mechanism for the case that the client app died without releasing the connection.
it appears in this version that some DNS packets are simply getting dropped or never replied to
This will inevitably happen when an app occupies the DNS server over the TCP connection in debug mode as FTL cannot respond to new (and old!) requests that are outside of this single connection - regardless if through UDP or another TCP connection.
I will add that I was chasing down the exact same issue as in the Medium article last week. I am running in Kubernetes with pihole and cloudflared running together in a pod. Prior to seeing the problem I had migrated from the diginc/pi-hole docker image to pihole/pihole using latest (4.1 at the time).
I also noticed a correlation with api-global.netflix.com and eventually saw that FTL was running in debug mode which led me to this GitHub issue.
I reverted to tag 4.0.0-1_amd64 of pihole/pihole and have been stable since the switch. I confirmed that FTL is not running in debug mode in this version.
Will be fixed by https://github.com/pi-hole/docker-pi-hole/pull/380
Just to add the root cause I mentioned in the other ticket, FTL seemed to crash out when ran in non-debug mode due to not having the right linux permissions / capabilities.
Having docker pass setcap
power of NET_ADMIN seemed to fix the containerized tests: --cap-add=NET_ADMIN
, all the docker tests were bombing out simply by changing debug to no-daemon until I added NET_ADMIN.
I realized this because I went to run an strace inside the container to see why it would suddenly just stop in no-daemon mode and that required passing more privileges, I used all privs with docker run --privileged
and as soon as I did that it stopped reproducing the problem.
I wasn't aware of that pihole-FTL
crashes when permissions are only partially applied. I'll mark this as an FTL bug in this case and will provide a fix when I can reproduce the crash.
Here is a way to reproduce, first the working version:
$ docker run -it --rm --entrypoint='pihole-FTL' --cap-add=NET_ADMIN pihole/pihole:4.1.1 no-daemon
[2019-01-02 14:40:58.116] Using log file /var/log/pihole-FTL.log
[2019-01-02 14:40:58.116] ########## FTL started! ##########
[2019-01-02 14:40:58.117] FTL branch: master
[2019-01-02 14:40:58.117] FTL version: v4.1.2
[2019-01-02 14:40:58.117] FTL commit: b06eedf
[2019-01-02 14:40:58.117] FTL date: 2018-12-21 14:43:34 -0600
[2019-01-02 14:40:58.117] FTL user: root
[2019-01-02 14:40:58.117] WARNING: Starting pihole-FTL as user root is not recommended
[2019-01-02 14:40:58.117] Starting config file parsing (/etc/pihole/pihole-FTL.conf)
[2019-01-02 14:40:58.117] SOCKET_LISTENING: only local
[2019-01-02 14:40:58.117] AAAA_QUERY_ANALYSIS: Show AAAA queries
[2019-01-02 14:40:58.117] MAXDBDAYS: max age for stored queries is 365 days
[2019-01-02 14:40:58.118] RESOLVE_IPV6: Resolve IPv6 addresses
[2019-01-02 14:40:58.118] RESOLVE_IPV4: Resolve IPv4 addresses
[2019-01-02 14:40:58.118] DBINTERVAL: saving to DB file every minute
[2019-01-02 14:40:58.118] DBFILE: Using /etc/pihole/pihole-FTL.db
[2019-01-02 14:40:58.118] MAXLOGAGE: Importing up to 24.0 hours of log data
[2019-01-02 14:40:58.118] PRIVACYLEVEL: Set to 0
[2019-01-02 14:40:58.118] IGNORE_LOCALHOST: Show queries from localhost
[2019-01-02 14:40:58.118] BLOCKINGMODE: Null IPs for blocked domains
[2019-01-02 14:40:58.118] REGEX_DEBUGMODE: Inactive
[2019-01-02 14:40:58.118] ANALYZE_ONLY_A_AND_AAAA: Disabled. Analyzing all queries
[2019-01-02 14:40:58.118] DBIMPORT: Importing history from database
[2019-01-02 14:40:58.118] PIDFILE: Using /var/run/pihole-FTL.pid
[2019-01-02 14:40:58.118] PORTFILE: Using /var/run/pihole-FTL.port
[2019-01-02 14:40:58.118] SOCKETFILE: Using /var/run/pihole/FTL.sock
[2019-01-02 14:40:58.118] WHITELISTFILE: Using /etc/pihole/whitelist.txt
[2019-01-02 14:40:58.118] BLACKLISTFILE: Using /etc/pihole/black.list
[2019-01-02 14:40:58.118] GRAVITYFILE: Using /etc/pihole/gravity.list
[2019-01-02 14:40:58.118] REGEXLISTFILE: Using /etc/pihole/regex.list
[2019-01-02 14:40:58.118] SETUPVARSFILE: Using /etc/pihole/setupVars.conf
[2019-01-02 14:40:58.118] AUDITLISTFILE: Using /etc/pihole/auditlog.list
[2019-01-02 14:40:58.118] Finished config file parsing
[2019-01-02 14:40:58.118] INFO: No whitelist file found
[2019-01-02 14:40:58.118] Compiled 0 Regex filters and 0 whitelisted domains in 0.0 msec (0 errors)
[2019-01-02 14:40:58.118] db_init() - Cannot open database (14): unable to open database file
[2019-01-02 14:40:58.118] Creating new (empty) database
[2019-01-02 14:40:58.154] Database successfully initialized
[2019-01-02 14:40:58.154] Imported 0 queries from the long-term database
[2019-01-02 14:40:58.154] -> Total DNS queries: 0
[2019-01-02 14:40:58.154] -> Cached DNS queries: 0
[2019-01-02 14:40:58.154] -> Forwarded DNS queries: 0
[2019-01-02 14:40:58.154] -> Exactly blocked DNS queries: 0
[2019-01-02 14:40:58.154] -> Unknown DNS queries: 0
[2019-01-02 14:40:58.154] -> Unique domains: 0
[2019-01-02 14:40:58.154] -> Unique clients: 0
[2019-01-02 14:40:58.155] -> Known forward destinations: 0
[2019-01-02 14:40:58.155] Successfully accessed setupVars.conf
[2019-01-02 14:40:58.269] PID of FTL process: 1
[2019-01-02 14:40:58.269] Listening on port 4711 for incoming IPv4 telnet connections
[2019-01-02 14:40:58.269] Listening on Unix socket
Due to the ctrl+c not getting passed correctly you may need to kill the container from another terminal when it works correctly.
The same docker run without --cap-add=NET_ADMIN
has the exact same output but abruptly ends the process after Listening on Unix socket
The same can be tested interactively with: docker run -it --rm --entrypoint='bash' pihole/pihole:4.1.1
but as I mentioned when I tried running with strace I had to add privs and that fixed the issue, so maybe we need to just pass strace privileges (if they don't overlap with the fix) or run strace from outside docker on the docker run and have it follow the process forks (possible I think?)
The tail of sudo strace docker run -it --rm --entrypoint='pihole-FTL' pihole/pihole:4.1.1 no-daemon
:
[2019-01-02 15:02:54.438] Creating new (empty) database
) = 0
futex(0x25380c8, FUTEX_WAIT, 0, NULL) = 0
futex(0x25380c8, FUTEX_WAIT, 0, NULL[2019-01-02 15:02:54.468] Database successfully initialized
[2019-01-02 15:02:54.469] Imported 0 queries from the long-term database
[2019-01-02 15:02:54.469] -> Total DNS queries: 0
[2019-01-02 15:02:54.469] -> Cached DNS queries: 0
[2019-01-02 15:02:54.469] -> Forwarded DNS queries: 0
[2019-01-02 15:02:54.469] -> Exactly blocked DNS queries: 0
[2019-01-02 15:02:54.469] -> Unknown DNS queries: 0
[2019-01-02 15:02:54.469] -> Unique domains: 0
[2019-01-02 15:02:54.469] -> Unique clients: 0
[2019-01-02 15:02:54.469] -> Known forward destinations: 0
[2019-01-02 15:02:54.470] Successfully accessed setupVars.conf
[2019-01-02 15:02:54.589] PID of FTL process: 1
[2019-01-02 15:02:54.589] Listening on port 4711 for incoming IPv4 telnet connections
[2019-01-02 15:02:54.589] Listening on Unix socket
) = 0
futex(0x25380c8, FUTEX_WAIT, 0, NULL <unfinished ...>
+++ exited with 5 +++
Pi-hole FTL v4.3 (currently still in development
) will ship the capabilities check. We don't fail hard if we detect missing capabilities as we've seen systems (operating systems as well as file systems) with lack of support for capabilities in the past. Here, the fallback solution of running pihole-FTL
as root
until we have bound to the required ports and only then falling back to an unprivileged user works, so I don't want to destroy support for those platforms.
I don't believe CAP_NET_ADMIN
should be required, especially if the DHCP server is not in use. Even with the DHCP server enabled, injection in the arp table is not required if using the dhcp-broadcast
option. Pings and CAP_NET_RAW
can be skipped with the no-ping
option.
The only really required capability is CAP_NET_BIND_SERVICE
to bind to ports lower than 1024 (if binding after root has been dropped).
I personally have been running a bare dnsmasq docker instance by doing setuid and changing capabilities manually before launching. But the best would be to enhance the checks in FTL or upstream dnsmasq.
The capabilities code in dnsmasq is here: https://github.com/pi-hole/FTL/blob/d42bcb7ab914a1d64fb8121d862c40043cc25197/dnsmasq/dnsmasq.c#L616-L630
I think the check should be gated on the features and options in use, the same way CAP_NET_BIND_SERVICE
already is. No DHCP server (or dhcp-broadcast
/ no-ping
), don't try to keep CAP_NET_ADMIN
and/or CAP_NET_RAW
.
Here is the Dockerfile and related files I use for my dnsmasq container: https://gist.github.com/mhofman/cdd85a6baa4b9206830b254d0ab9bb89
By changing the user before launching dnsmasq, all the checks are skipped the same way they are when running in debug mode.
If dnsmasq itself was to selectively retain capabilities, CAP_NET_BIND_SERVICE
wouldn't be needed to bind to the DNS port, since that's done before dropping root.
Thanks you very much for your detailed post. We do not modify the embedded dnsmasq
so your request should be directed towards Simon on the dnsmasq-discuss mailing list if you are willing to write such a mail.
I perfectly agree with you that the capabilities should be reduced as much as possible, but, e.g., no-ping
might now always be a safe option - think about an already running network where you just enabled dnsmasq
as your DHCP server. It might hand out duplicated IP addresses with the no-ping
option enabled until all devices have renewed their leases at least once on this server.
Here is the email I wrote to the mailinglist: http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2019q1/012840.html
I agree that the no-ping
option is unsafe. However this would be something users can manually configure if needed. But the NET_RAW
capability is not really an issue for most people as it's included in the default docker capabilities.
The problematic one is NET_ADMIN
which isn't there by default, and is in my opinion too powerful of a capability to grant to most containers.
Since changes would have to come from the upstream dnsmasq, in the mean time maybe @diginc can include a workaround similar to mine in the pihole docker image? The capabilities check in the current development branch could be used to warn the user before enabling the DHCP server? And optionally set the dhcp-broadcast
option if the user starts the image with the DHCP server enabled and without the NET_ADMIN
capability?
The problem has been fixed. When running in debug
mode (as opposed to no-daemon
mode), we prevent dropping from root
to a less privileged user. FTL v4.2.1 will set the default user to drop to to root
to prevent this from happening on systems that lack proper Linux capabilities support.
@dl6er I'm not sure I followed what is changing. Are you using dnsmasq built-in user change or are you dropping root in FTL before hitting dnsmasq's code?
If the former, dnsmasq will try and keep the net_admin and net_raw capability even if not actually required.
If the later, you could drop root while keeping only the required & available capabilities, and not have dnsmasq fail in setpcap calls.
@mhofman The problem is a different one - you could even understand it as being entirely separated from dnsmasq
(although it isn't). I will explain this:
When we start pihole-FTL
, it will start as root
on those systems that don't fully support Linux capabilities (as in we cannot use the setcap
command). On startup, FTL will create shared memory objects to store its data in. When dnsmasq
starts, it sees UID==0
and decides to drop to an unprivileged user while keeping the capabilities as you mentioned. However, this unprivileged user - the entire pihole-FTL
process now runs as - will now not be able to access the shared memory objects we created earlier and hence the process will die the next time it want to access its own memory (only root
can read/write it).
The change we made is to ensure that dnsmasq
will "drop" to user root
to ensure that the shared memory objects stay read- and writable, see https://github.com/pi-hole/FTL/pull/497/commits/1404ea167945fcbffa80198505352a799b0d4557.
Right, sorry I originally failed to see how this was related to running the docker image.
From what I gather, docker image is still running FTL as root, correct?
Is there any reason why the docker image doesn't call setcap
with CAP_NET_BIND_SERVICE,CAP_NET_RAW,CAP_NET_ADMIN+ei
as a "filesystem fix" on first launch? That should allow it to change users.
Please note that I recommend not setting the p
flag since that would cause an launch failure for containers not having the extra capabilities available. Any capabilities still in the bounding set will be inherited when switching users, and will then become permitted thanks to the file capabilities.
As I wrote before a "safer" approach would be to check the config files and make sure no dnsmasq feature is in use that relies on those capabilities, but that can be a future improvement. In the mean-time you can put a warning that not all capabilities are available and this is "unsupported".
Here is a summary of what I'm proposing:
Now:
setcap CAP_NET_BIND_SERVICE,CAP_NET_RAW,CAP_NET_ADMIN+ei usr/bin/pihole-FTL
on first container launch.pihole-FTL
process. The permitted capabilities of the process will be whatever the intersection is between the container's capabilities and the 3 capabilities above. By default this will be CAP_NET_BIND_SERVICE,CAP_NET_RAW
unless the user launches the container with cap_add: net_admin
.In the future:
Closing this issue as v4.2.1 has been released. It addresses the issue raised here. If the discussion needs to be continued, it may be done here in the closed issue ticket or rather in the surroundings of our docker repository.
Internal discussion - I'm aware of a violation of the issue template, however, this ticket mostly serves to ensure reproducibility rather than directly ask anyone to fix it (I will most likely fix it myself during the discussion).
This discussion starts with https://github.com/pi-hole/docker-pi-hole/pull/370#pullrequestreview-183866786 where @diginc mentioned that he found it necessary to run
pihole-FTL debug
in order to keeppihole-FTL
in foreground.As result of this mentioning, @Mcat12 proposed a solution for FTL and mentioned that he found it necessary to run the
dnsmasq
code indebug
mode. This should be avoided for daily use cases as it has several drawbacks, most relevant for Netflix users as we already know from previous investigations that TCP requests are important for these clients. However, indebug
mode,dnsmasq
does not fork at all for new TCP workers and deadlocks (or at least considerable delays of up to one minute or so) can easily arise.I tried to reproduce the issue locally on the latest build in the
master
branch, i.e., the code we released as v4.1, but I wasn't able to reproduce it:Terminal 1:
(
pihole-FTL
stays in foreground and blocks the terminal from here on)Terminal 2: