Closed klutchell closed 2 years ago
Wonder if resolving this issue will just fill up my tmpfs and then what? Will the container restart or will the device hang?
Ugh - mine crashed, I assume for the same reason.
Is there a way to recover it remotely?
@eiddor yup, you should have VPN access via the dashboard still? It looks like you do.
Open a shell session into the pihole container and rm -rf /var/log/pihole/*
If the container isn't running, we can find that path on the host...
find /mnt/data/docker/overlay2 -name FTL.log -delete
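One caveat with deleting a log that a running process still holds open: the disk space is generally not released until that process closes the file. A minimal sketch of an alternative, emptying the files in place, is below. The helper name truncate_logs is hypothetical, and it assumes the logs end in .log and sit directly in the given directory.

```shell
# Hypothetical helper: empty every *.log file in directory $1 in place,
# leaving the files themselves (and any open file handles) intact.
truncate_logs() {
    dir="$1"
    for f in "$dir"/*.log; do
        # Skip the literal pattern when nothing matches, and skip non-files.
        [ -f "$f" ] || continue
        # ":" is the shell no-op; redirecting its (empty) output truncates $f.
        : > "$f"
    done
}

# Example (path taken from the thread):
# truncate_logs /var/log/pihole
```

Whether the process keeps writing at its old offset afterwards depends on how it opened the file, so treat this as a stopgap for freeing space, not a fix.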
I wonder if it's related to this commit? https://github.com/pi-hole/PADD/pull/235
The container appears to be running, but I can't connect to it -
I'm on the host, but find on this shell doesn't like -delete for some reason. I found it manually :-)
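For a find build that lacks -delete (a stripped-down BusyBox find is one possibility, though the thread does not confirm which find the host ships), a rough equivalent using -exec might look like the sketch below. The helper name delete_named_files is hypothetical, and it assumes the find in question at least supports -exec.

```shell
# Hypothetical helper: remove every regular file named $2 under directory $1
# without relying on find's -delete flag.
delete_named_files() {
    root="$1"
    name="$2"
    # Run rm once per match; "{}" is replaced by the matched path.
    find "$root" -type f -name "$name" -exec rm -f {} \;
}

# Example (path and file name taken from the thread):
# delete_named_files /mnt/data/docker/overlay2 FTL.log
```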
Deleted the file and rebooted, but now it's not coming back - I might have to recover it this weekend. :-(
@eiddor I can help recover this whenever you have time.
Currently it looks like the balena engine (aka docker daemon) is not running, possibly because of a partially written file left over from when the disk ran out of space. When you have time you can check a couple of things so we know what to try next.
Check the engine logs to see why it can't start:
journalctl -u balena
Check the space on the data partition:
df -h
du -cksh /mnt/data/docker/*
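If the per-directory totals above don't make the culprit obvious, one way to drill down is to list the largest individual files and directories. This is only a sketch: the helper name largest_entries is hypothetical, and it assumes du supports -a and -k and that sort and head are available on the host.

```shell
# Hypothetical helper: print the $2 largest entries (files and directories)
# under directory $1, with sizes in KiB, largest first.
largest_entries() {
    dir="$1"
    count="${2:-10}"
    # -a lists files as well as directories; -k reports sizes in KiB
    # so that a plain numeric sort orders them correctly.
    du -ak "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example (path taken from the thread):
# largest_entries /mnt/data/docker 20
```

The first line is normally the directory itself, since its total includes everything beneath it.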
Also, if you don't have any customized settings, ad lists, devices, etc (or if you have a recent backup) we could just purge the data dir and reboot.
rm /mnt/data/remove_me_to_reset
reboot
@klutchell Unfortunately the entire device did not come back after I deleted FTL.log and rebooted last night. It's showing offline in the dashboard and is not even pingable on my network. I suspect the full fs somehow corrupted the host, but that's purely a guess.
I'm going to have to flash a new sd card this weekend, I think.
Generally a full data partition should not be able to impact the hostOS; that's why we keep the rootfs read-only. But it's possible something else is going on here.
Ok - Had to create a new device and flash a new card, so it's back online now (pinned to the -1 release with the old version of PADD.)
I'm remote from that device so I can't test on it, but I can add a second fleet/device and do some testing if you have some ideas.
(FWIW - It would be neat to be able to gen a new image for an existing device. Do you see a use case for that?)
400 MB to 12.3 GB overnight
@eiddor this is with the previous version of PADD? I haven't seen this issue since I reverted.
Can you capture some of the logs so we can see if it's the same message filling up the disk?
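One possible way to capture that is to count which messages repeat most often in a sample of the log. The sketch below assumes each line starts with a bracketed prefix such as a timestamp (an assumption about the FTL.log format, not something shown in this thread), and the helper name top_log_messages is hypothetical.

```shell
# Hypothetical helper: show the $2 most frequent messages among the
# last $3 lines of log file $1.
top_log_messages() {
    log="$1"
    count="${2:-5}"
    sample="${3:-10000}"
    # Strip a leading "[...]" prefix (assumed to hold the timestamp) so that
    # otherwise identical messages compare equal, then count duplicates.
    tail -n "$sample" "$log" \
        | sed 's/^\[[^]]*\] *//' \
        | sort | uniq -c | sort -rn | head -n "$count"
}

# Example (file name taken from the thread, directory assumed):
# top_log_messages /var/log/pihole/FTL.log 5
```

Sampling only the tail keeps the sort small, which matters when the log is several GB and the disk is nearly full.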
Oh, sorry - I should have been clearer! I set up a test fleet with a local device and the current release, then I upgraded it to the new version of PADD and am seeing the same problems. Now I have something we can test with that I don't actually use for DNS.
I was going to set up a similar testing device but haven't had a chance yet. Though it seems you've confirmed the issue is only present when using the new version of PADD (maybe we mark that PR as draft for now).
Oops - Went to mark it as draft, and requested your review instead :-)
Looks like this is being tracked here: https://github.com/pi-hole/PADD/issues/252
Specifically this post.
Nice find! I guess we can just wait it out, since I have very little time this week anyway!
Hopefully resolved by https://github.com/klutchell/balena-pihole/pull/167
My FTL logs are filling up with this at an incredible rate. GB per day. I hope it's only on my device but I need to get it sorted asap.
Will be logging some of my investigation here.