devedse closed this issue 3 years ago
Unfortunately it happens here too. Definitely something is wrong with the latest release.
Running on Synology Docker in DSM 6.2
Can confirm for amd64 and arm64 platforms.
Same loop of:

```
Starting pihole-FTL (no-daemon) as root
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
```
"Me too" "Same Here" and 👍🏻 comments don't help at all so please just use the icons for the original post.
Can confirm. I researched how to load an old image in Docker and couldn't find any help.

If anyone else needs a helping hand here (as a workaround): run `sudo docker image ls` and look for your old pihole image where the tag is unset. Note down the ID and insert it into your `docker run` command or `docker-compose.yml`.
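A minimal sketch of that lookup, assuming the usual `docker image ls` column layout (the helper name and the sample output below are made up for illustration):

```shell
# Hypothetical helper: print the IMAGE ID of the untagged (<none>) entry for a
# repository from `docker image ls` output -- the "note down the id" step.
untagged_image_id() {
    awk -v repo="$1" '$1 == repo && $2 == "<none>" { print $3 }'
}

# Example with made-up output (pipe real `docker image ls` output instead):
untagged_image_id pihole/pihole <<'EOF'
REPOSITORY      TAG      IMAGE ID       CREATED       SIZE
pihole/pihole   latest   1111aaaa2222   2 hours ago   300MB
pihole/pihole   <none>   3333bbbb4444   3 weeks ago   295MB
EOF
# prints: 3333bbbb4444
```

The printed ID can then go straight into `docker run <id>` or the `image:` field of `docker-compose.yml`.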
As @jgeusebroek mentioned, you could use `pihole/pihole:v5.6` instead. Since my Pi uses pihole itself for DNS, I was unable to "download" the target image.
Sorry if this isn't helpful for the devs
Can confirm, same here. The Docker container was updated by Watchtower and didn't start afterwards.
```
docker exec -it pihole2 bash
root@d289e0149195:/# /usr/bin/pihole-FTL
bash: /usr/bin/pihole-FTL: Operation not permitted
```
pihole/pihole:v5.6 is the previous build. FYI.
> pihole/pihole:v5.6 is the previous build. FYI.

Yeah, this works perfectly - will keep using this until this issue is fixed
Could this be caused by https://github.com/pi-hole/FTL/commit/49ba60e9e0fb4439d8c8eb419daf71cc6d2c7d2b? Could we be running multiple instance during our init?
This workaround worked for me (assuming your docker container name is `pihole`):

```
docker exec -it pihole /bin/bash
rm /dev/shm/FTL*
```
@PromoFaux The workaround by @gergnz suggests that `/dev/shm` isn't clean in the docker image. Could you verify this?
I believe that, as of that commit (https://github.com/pi-hole/FTL/commit/49ba60e9e0fb4439d8c8eb419daf71cc6d2c7d2b), FTL will crash when the files mentioned by @gergnz (`/dev/shm/FTL*`) are present.
Also, we're using `kill -9 <pid>` to shut down FTL, so it won't have the opportunity to clean up these files normally.
The main culprit is `/etc/cont-init.d/20-start.sh`, because it always runs before the service tries to start.

TL;DR: we never did "clean" shutdowns, but since FTL v5.7 it has become necessary to do so.
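A toy illustration of why the hard kill matters (plain shell, not FTL itself): a `TERM` trap gets the chance to remove a lock file, while `kill -9` (SIGKILL) bypasses the trap entirely and leaves the stale file behind:

```shell
#!/bin/sh
# Toy demo: a background job creates a lock file and removes it from a TERM
# trap. A graceful `kill` lets the trap run; `kill -9` would skip the trap
# and leave the stale lock behind -- the failure mode described above.
lockfile="/tmp/demo-FTL-lock.$$"

(
    trap 'rm -f "$lockfile"; exit 0' TERM
    : > "$lockfile"
    while :; do sleep 1; done
) &
pid=$!
sleep 1                     # give the job time to create the lock

kill "$pid"                 # graceful shutdown: the trap cleans up
wait "$pid" 2>/dev/null
if [ -e "$lockfile" ]; then
    echo "stale lock left behind"
else
    echo "lock cleaned up"
fi
# prints: lock cleaned up
```

Swap `kill "$pid"` for `kill -9 "$pid"` and the lock file survives, which is exactly what the init scripts were doing to FTL's `/dev/shm` objects.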
@DL6ER I've just checked: the container image itself doesn't contain `/dev/shm/FTL*`. It gets populated because the `20-start.sh` init script doesn't shut FTL down cleanly.
Thanks. I'm not at all versed in how this container works, so I was just guessing at what may be going wrong. Thanks for the research!
Whatever the solution will be, `kill -9` shouldn't be part of it (anywhere!).
To confirm, the image doesn't contain lock files:

```
docker run --rm -ti --entrypoint="" pihole/pihole:v5.7 ls -la /dev/shm
total 0
drwxrwxrwt 2 root root  40 Feb 16 22:15 .
drwxr-xr-x 5 root root 360 Feb 16 22:15 ..
```
But if we look at this dir at the end of the init script, it's apparently unclean:

```
# Dockerfile
FROM pihole/pihole:v5.7
RUN echo "ls -la /dev/shm" >> /etc/cont-init.d/20-start.sh
```

```
docker build -t pihole-locking .
docker run --rm -ti pihole-locking exit 0
```

```
...
[✓] Flushing DNS cache
[✓] Pi-hole Enabled
Pi-hole version is v5.2.4 (Latest: v5.2.4)
AdminLTE version is v5.4 (Latest: v5.4)
FTL version is v5.7 (Latest: v5.7)
total 768
drwxrwxrwt 2 root root    260 Feb 16 22:13 .
drwxr-xr-x 5 root root    360 Feb 16 22:13 ..
-rw------- 1 root root 356352 Feb 16 22:13 FTL-clients
-rw------- 1 root root    152 Feb 16 22:13 FTL-counters
-rw------- 1 root root   4096 Feb 16 22:13 FTL-dns-cache
-rw------- 1 root root  98304 Feb 16 22:13 FTL-domains
-rw------- 1 root root     48 Feb 16 22:13 FTL-lock
-rw------- 1 root root  24576 Feb 16 22:13 FTL-overTime
-rw------- 1 root root   4096 Feb 16 22:13 FTL-per-client-regex
-rw------- 1 root root 262144 Feb 16 22:13 FTL-queries
-rw------- 1 root root     12 Feb 16 22:13 FTL-settings
-rw------- 1 root root   4096 Feb 16 22:13 FTL-strings
-rw------- 1 root root  20480 Feb 16 22:13 FTL-upstreams
[cont-init.d] 20-start.sh: exited 0.
```
I should add: cleaning up behind you is generally good practice, and hard-`kill`ing a process (without even giving it the chance to clean up properly behind itself) never is.
FTL tries to open the shared memory objects (in exclusive mode) and fails with `File exists` when this is not possible. FTL refuses to start because someone/something else may be using these files, which would lead to serious issues if we started using them as well. This is actually a good thing; it's just the logic in the docker container that has to be improved. I'm not familiar with it at all, but maybe the logic can be changed to only start FTL when we actually want it.
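A loose shell analogy for that exclusive-mode behaviour (this uses `set -C`/noclobber on ordinary files, not the real `shm_open(O_CREAT|O_EXCL)` call FTL makes):

```shell
#!/bin/sh
# Analogy only: exclusive creation fails when the object already exists,
# just as FTL's exclusive shm_open() fails with "File exists" when a stale
# /dev/shm/FTL-* object survived a hard kill.
set -C                       # noclobber: '>' refuses to overwrite a file
lock="/tmp/demo-FTL-lock.$$"

: > "$lock"                  # first exclusive create succeeds
if : > "$lock" 2>/dev/null; then
    echo "second create succeeded (unexpected)"
else
    echo "File exists: refusing to start"   # FTL's behaviour, as intended
fi
rm -f "$lock"
# prints: File exists: refusing to start
```

The point is the same as DL6ER's: refusing to reuse an existing object is correct behaviour; the container just has to make sure no stale object is left lying around.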
OK, what appears to work (as a quickfix here) is adding `rm /dev/shm/FTL*` before the `kill -9` call in both `finish` and `20-start.sh`.
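Sketched as a small helper (the function name and the `pgrep` lookup are my own; the actual quickfix just inserts the `rm` line before the existing `kill -9` in `finish` and `20-start.sh`):

```shell
#!/bin/sh
# Sketch of the quickfix: drop stale shared-memory files, then hard-kill.
# stop_ftl_hard is a hypothetical helper; the real fix inlines these lines
# into the s6 `finish` script and `20-start.sh`.
stop_ftl_hard() {
    name=${1:-pihole-FTL}
    rm -f /dev/shm/FTL*                  # clear stale locks first
    pid=$(pgrep -x "$name") || true      # empty if nothing is running
    if [ -n "$pid" ]; then
        kill -9 $pid
    fi
    return 0
}
```

Removing the files before the kill means the next FTL start finds a clean `/dev/shm`, even though the process never got to unlink them itself.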
> This workaround worked for me (assuming your docker container name is `pihole`):
>
> ```
> docker exec -it pihole /bin/bash
> rm /dev/shm/FTL*
> ```
This works for me
I've been trying out a non-`kill -9` approach, but it appears to be flaky.
```
# Kill dnsmasq because s6 won't like it if it's running when s6 services start
FTLPID=$(pgrep pihole-FTL)
kill ${FTLPID} || true
while kill -0 ${FTLPID}; do
    echo "FTL ${FTLPID} still running..."
    sleep 1
done
while [ -e "/dev/shm/FTL-lock" ]; do
    echo "Lock file still exists..."
    sleep 1
done
pihole -v
```
Usually this works as expected:

```
[i] Pi-hole blocking will be enabled
[i] Enabling blocking
[✓] Flushing DNS cache
[✓] Pi-hole Enabled
FTL 348 still running...
/var/run/s6/etc/cont-init.d/20-start.sh: line 27: kill: (348) - No such process
Pi-hole version is v5.2.4 (Latest: v5.2.4)
AdminLTE version is v5.4 (Latest: v5.4)
FTL version is v5.7 (Latest: v5.7)
```
But occasionally it will not clean up:

```
[i] Pi-hole blocking will be enabled
[i] Enabling blocking
[✓] Flushing DNS cache
[✓] Pi-hole Enabled
FTL 348 still running...
/var/run/s6/etc/cont-init.d/20-start.sh: line 27: kill: (348) - No such process
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
Lock file still exists...
```
So possibly a normal terminate signal still leaks the files?
Also, adding `rm /dev/shm/FTL*` before `s6-setuidgid` in the `run` script works.
Edit: in fact, that's more or less what happens on a bare metal instance:
https://github.com/pi-hole/pi-hole/blob/master/advanced/Templates/pihole-FTL.service#L25-L41
> So possibly a normal terminate signal still leaks the files?

Shouldn't:

https://github.com/pi-hole/FTL/blob/2999e2b57c62b4455187ee9b77840d49df0a8e2e/src/main.c#L127
https://github.com/pi-hole/FTL/blob/2999e2b57c62b4455187ee9b77840d49df0a8e2e/src/shmem.c#L515-L531
https://github.com/pi-hole/FTL/blob/2999e2b57c62b4455187ee9b77840d49df0a8e2e/src/shmem.c#L743-L755

And `man shm_unlink` says:

> The operation of shm_unlink() is analogous to unlink(2): it removes a shared memory object name, and, once all processes have unmapped the object, de-allocates and destroys the contents of the associated memory region. After a successful shm_unlink(), attempts to shm_open() an object with the same name fail (unless O_CREAT was specified, in which case a new, distinct object is created).
This issue has been mentioned on Pi-hole Userspace. There might be relevant details there:
https://discourse.pi-hole.net/t/docker-update-to-v5-7-causes-ftl-to-crash-at-launch/44464/3
Nope @DL6ER, I'm pretty sure it does, but it seems unrelated to the `kill -9` problem.
When I do hit the situation where the normal kill still leaves files behind, this is in `/var/log/pihole-FTL.log`:

```
[2021-02-16 23:07:34.161 349M] Received signal: Segmentation fault
[2021-02-16 23:07:34.161 349M]   at address: 0x7f8831f0c9d0
[2021-02-16 23:07:34.161 349M]   with code: SEGV_MAPERR (Address not mapped to object)
```
Some additional information which may or may not be relevant to this issue but is occurring also with the latest version:

```
docker: Error response from daemon: image with reference pihole/pihole was found but does not match the specified platform: wanted linux/arm/v7, actual: linux/amd64.
```

This is on a Raspberry Pi 4B running Raspbian, with both `:latest` and `:v5.7`.
> docker: Error response from daemon: image with reference pihole/pihole was found but does not match the specified platform: wanted linux/arm/v7, actual: linux/amd64.

You probably pulled the x86/64 version of the image, as the latest one on my Pi 4 runs fine (minus the FTL issue).
> Some additional information which may or may not be relevant to this issue but is occurring also with the latest version:

I'm going to hide these as off-topic, see the issue I noted.
I can confirm that #797 fixes this issue.
Thanks @thedanbob, I've just pulled `:v5.7` locally and it's also working.

Everyone here should now be able to re-pull `:latest` or `:v5.7` and be up and going. Apologies all for making things fall over!

(This is also why I don't personally let anything auto update - us dev types can cause havoc sometimes! 😉)
My watchtower just updated PiHole to the latest version and I can confirm this issue is now resolved :smile:
This issue has been mentioned on Pi-hole Userspace. There might be relevant details there:
https://discourse.pi-hole.net/t/docker-update-to-v5-7-causes-ftl-to-crash-at-launch/44464/4
> My watchtower just updated PiHole to the latest version and I can confirm this issue is now resolved 😄
I can confirm that: manual update of one instance (the problematic one) worked, and automatic update of the other via Watchtower worked as well (after resuming Watchtower from pause).
This issue has been mentioned on Pi-hole Userspace. There might be relevant details there:
https://discourse.pi-hole.net/t/pi-hole-wont-start-after-docker-update/44454/8
Thanks for fixing the problem. Works now in my Docker with latest version!
Hi guys,

I have the same issue with the latest release. Can't find how to solve it. Please help.

Kind regards
I am new to pihole and was installing it on an Odroid with docker-compose. I also had this issue with pihole-FTL:

```
[services.d] done.
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Starting pihole-FTL (no-daemon) as pihole
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
```

My problem was that I had enabled the log volume (`- './var-log/pihole.log:/var/log/pihole.log'`) but had forgotten to create the `var-log` directory and touch `./var-log/pihole.log`.

So if anyone made the same mistake:

```
sudo mkdir ./var-log
sudo touch ./var-log/pihole.log
```

then start the container with docker-compose and it works fine :) I also moved all mounts to `/opt/pihole`...
> My problem was that I had enabled the log volume `- './var-log/pihole.log:/var/log/pihole.log'` but had forgotten to create the `var-log` directory and touch `./var-log/pihole.log`

I've actually removed this from the example file - because it's probably an unnecessary mount, and people often miss the part about creating it first.
### Versions

```
Pi-hole version is v5.2.4 (Latest: v5.2.4)
AdminLTE version is v5.4 (Latest: v5.4)
FTL version is v5.7 (Latest: v5.7)
```

My watchtower just automatically updated my PiHole running on a raspberry pi:

```
time="2021-02-16T20:41:57Z" level=info msg="Found new pihole/pihole:latest image (sha256:a2eef2ddff91c7117eacfcfb6927ea56d5dd51291c2282e66d1fca4d7b2ba5ce)"
time="2021-02-16T20:42:40Z" level=info msg="Stopping /Pihole (3e1fae648dfb6d3bbadc5aa28017cfd77248012a5fa08191900662c07bfdb9ed) with SIGTERM"
time="2021-02-16T20:42:46Z" level=info msg="Creating /Pihole"
```

However, afterwards the PiHole container shows up as Unhealthy. Here's the container log:

```
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] 01-resolver-resolv: applying...
[fix-attrs.d] 01-resolver-resolv: exited 0.
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 20-start.sh: executing...
 ::: Starting docker specific checks & setup for docker pihole/pihole
  [i] Installing configs from /etc/.pihole...
  [i] Existing dnsmasq.conf found... it is not a Pi-hole file, leaving alone!
  [i] Copying 01-pihole.conf to /etc/dnsmasq.d/01-pihole.conf...
  [✓] Copying 01-pihole.conf to /etc/dnsmasq.d/01-pihole.conf
Converting DNS1 to PIHOLE_DNS_
Setting DNS servers based on PIHOLE_DNS_ variable
::: Pre existing WEBPASSWORD found
DNSMasq binding to default interface: eth0
Added ENV to php:
 "PHP_ERROR_LOG" => "/var/log/lighttpd/error.log",
 "ServerIP" => "0.0.0.0",
 "VIRTUAL_HOST" => "0.0.0.0",
Using IPv4 and IPv6
::: Preexisting ad list /etc/pihole/adlists.list detected ((exiting setup_blocklists early))
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://mirror1.malwaredomains.com/files/justdomains
::: Testing pihole-FTL DNS: FTL started!
::: Testing lighttpd config: Syntax OK
::: All config checks passed, cleared for startup ...
 ::: Enabling Query Logging
  [i] Enabling logging...
  [✓] Logging has been enabled!
 ::: Docker start setup complete
  [i] Neutrino emissions detected...
  [✓] Pulling blocklist source list into range
  [i] Preparing new gravity database...
  [✓] Preparing new gravity database
  [i] Using libz compression
  [i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
  [i] Status: Pending...
  [✓] Status: Retrieval successful
  [i] Received 60887 domains
  [i] Target: https://mirror1.malwaredomains.com/files/justdomains
  [i] Status: Pending...
  [✗] Status: Not found
  [✗] List download failed: using previously cached list
  [i] Received 26854 domains
  [i] Storing downloaded domains in new gravity database...
  [✓] Storing downloaded domains in new gravity database
  [i] Building tree...
  [✓] Building tree
  [i] Swapping databases...
  [✓] Swapping databases
  [i] Number of gravity domains: 87741 (87713 unique domains)
  [i] Number of exact blacklisted domains: 0
  [i] Number of regex blacklist filters: 0
  [i] Number of exact whitelisted domains: 1
  [i] Number of regex whitelist filters: 0
  [i] Cleaning up stray matter...
  [✓] Cleaning up stray matter
  [✓] DNS service is listening
  [✓] UDP (IPv4)
  [✓] TCP (IPv4)
  [✓] UDP (IPv6)
  [✓] TCP (IPv6)
  [✓] Pi-hole blocking is enabled
  Pi-hole version is v5.2.4 (Latest: v5.2.4)
  AdminLTE version is v5.4 (Latest: v5.4)
  FTL version is v5.7 (Latest: v5.7)
[cont-init.d] 20-start.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
Starting pihole-FTL (no-daemon) as root
Starting lighttpd
Starting crond
[services.d] done.
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Starting pihole-FTL (no-daemon) as root
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Starting pihole-FTL (no-daemon) as root
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Starting pihole-FTL (no-daemon) as root
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Starting pihole-FTL (no-daemon) as root
Stopping pihole-FTL
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Starting pihole-FTL (no-daemon) as root
Stopping pihole-FTL
```

### Platform

- OS and version: Linux dockerpi 5.4.83-v7+
- Platform: Raspberry PI

### Expected behavior

PiHole should start correctly

### Actual behavior / bug

PiHole FTL service won't start

### Steps to reproduce

Steps to reproduce the behavior:

1. Update to the latest version

## Edit

I've also checked the FTL log found by executing `cat /var/log/pihole-FTL.log` inside the docker container. This is the output:

```
[2021-02-16 22:31:06.202 7444M] ########## FTL started! ##########
[2021-02-16 22:31:06.203 7444M] FTL branch: master
[2021-02-16 22:31:06.209 7444M] FTL version: v5.7
[2021-02-16 22:31:06.210 7444M] FTL commit: 2999e2b5
[2021-02-16 22:31:06.210 7444M] FTL date: 2021-02-16 19:36:43 +0000
[2021-02-16 22:31:06.210 7444M] FTL user: root
[2021-02-16 22:31:06.210 7444M] Compiled for armv7hf (compiled on CI) using arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.0 20170516
[2021-02-16 22:31:06.211 7444M] FATAL: create_shm(): Failed to create shared memory object "FTL-lock": File exists
[2021-02-16 22:31:06.211 7444M] Initialization of shared memory failed.
```