pi-hole / FTL

The Pi-hole FTL engine
https://pi-hole.net

Activating DNSSEC causes crash on prebuilt FTL binaries (arm architecture) #244

Closed OneRainbowDev closed 6 years ago

OneRainbowDev commented 6 years ago

In raising this issue, I confirm the following (please check boxes, eg [X]) Failure to fill the template will close your issue:

How familiar are you with the codebase?:

1

[BUG | ISSUE] Expected Behaviour: Daemon should remain running

[BUG | ISSUE] Actual Behaviour: Crashes after running normally for some time. The crash also brings down the web interface: refreshing the page just keeps trying to reload, although restarting pihole-FTL fixes this. EDIT: This is actually related to gdb

[BUG | ISSUE] Steps to reproduce: Start pihole-FTL via systemctl and wait for it to crash after some time.

Fresh Raspbian Stretch install with Pi-hole installed first; the FTLDNS branches for core and web were then activated with:

echo "FTLDNS" | sudo tee /etc/pihole/ftlbranch
pihole checkout core FTLDNS 
pihole checkout web FTLDNS

Log file output [if available]

gdb output:

Thread 1 "pihole-FTL" received signal SIGILL, Illegal instruction.
_nettle_sha512_compress () at sha512-compress.s:124
124 sha512-compress.s: No such file or directory.

backtrace:

#0 _nettle_sha512_compress () at sha512-compress.s:124
#1 0x00629750 in nettle_sha512_update (ctx=0x2597590, length=264, data=0x1d4df6c "\001") at sha512.c:150
#2 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Device specifics

Hardware Type: Raspberry Pi
OS: Raspbian Stretch


AzureMarker commented 6 years ago

What version of the Raspberry Pi?

OneRainbowDev commented 6 years ago

Generation 1 Model B - Arm6 CPU

AzureMarker commented 6 years ago

Run pihole -d for a debug token. This might be related to the hoops we have to jump through to compile the arm version of FTLDNS. The Raspberry Pi 1 and Zero use an armv6 CPU with hardware float support, which is an uncommon combination. Most distros, including Debian, use armhf to mean armv7 with hard float. Because of this, we have to compile everything that goes into the arm build with the Raspbian compiler, and it might be that the library in question (nettle) was not compiled correctly. We'll look into this, but the debug token will tell us more about your system so we can reproduce the crash and find its cause.
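
As a rough illustration of the armv6/armv7 distinction described above, the following shell sketch maps the kernel's machine string to an ARM family and, if readelf is available, prints the CPU architecture the FTL binary was built for. The helper name classify_machine and the binary path /usr/bin/pihole-FTL are assumptions made for this example; neither comes from the thread.

```shell
#!/bin/sh
# Sketch only: classify_machine is a hypothetical helper, not part of Pi-hole.
# It maps a `uname -m` style string to a coarse ARM family name.
classify_machine() {
    case "$1" in
        armv6*)        echo "armv6" ;;   # e.g. Raspberry Pi 1, Pi Zero
        armv7*)        echo "armv7" ;;   # what Debian's armhf port targets
        aarch64|arm64) echo "arm64" ;;
        *)             echo "other" ;;
    esac
}

echo "This machine: $(uname -m) -> $(classify_machine "$(uname -m)")"

# Assumed install path; override with FTL_BIN=... if yours differs.
FTL_BIN="${FTL_BIN:-/usr/bin/pihole-FTL}"
if command -v readelf >/dev/null 2>&1 && [ -r "$FTL_BIN" ]; then
    # Tag_CPU_arch in the ARM build attributes names the targeted architecture.
    readelf -A "$FTL_BIN" | grep 'Tag_CPU_arch' || true
fi
```

If the binary reports a newer architecture than the machine, an illegal-instruction crash like the one in the backtrace is a plausible outcome, although this check alone does not prove that is what happened here.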

OneRainbowDev commented 6 years ago

My token, apologies for the delay: 2d5gl404ua

Also pihole -up is broken, should I make a separate issue for this?

On a separate SD card, installing pihole then FTL with just pihole checkout dev works fine, not sure if the branches are different.

OneRainbowDev commented 6 years ago

I did some more testing with another SD card, and it seems that it may have just been a weird one-off bug.

I flashed Raspbian, upgraded and then installed pihole/FTL. It has now run for 4-5 hours without any issues. Have the binaries been changed in the last 3 days?

DL6ER commented 6 years ago

Have the binaries been changed in the last 3 days?

As you can see here, the last change of the binaries was on March 23

DL6ER commented 6 years ago

@EngieDev You said you tested with another SD card, so was this a new install? If so, have you used DNSSEC in the earlier installation and are you using it now again?

OneRainbowDev commented 6 years ago

@DL6ER Yep, new install. I was using DNSSEC before, but not currently. I'll activate it and see what happens.

OneRainbowDev commented 6 years ago

@DL6ER Aha, it crashed within 2 minutes of activation. I restarted it, attached gdb, and the same error occurred, so it is likely to be something to do with DNSSEC.

Sorry for forgetting to mention that I had activated DNSSEC.

AzureMarker commented 6 years ago

So, it breaks on the development version of FTL when activating DNSSEC too?

OneRainbowDev commented 6 years ago

DNSSEC breaks on the FTLDNS branch, will check the development version now. I'm assuming that these:

pihole checkout core FTLDNS 
pihole checkout web FTLDNS

refer to the FTLDNS branch and pihole checkout dev refers to development.

OneRainbowDev commented 6 years ago

Development version seems stable, been a few hours now without issues.

DL6ER commented 6 years ago

Okay, so this is expected and seems related to the inbuilt nettle library. You said you are using

Generation 1 Model B - Arm6 CPU

Is this really model "B" or "B+"? I only have a first gen B+ and everything runs smoothly there (apparently, they share the same CPU).

OneRainbowDev commented 6 years ago

Model "B" according to /proc/cpuinfo.

Would it be worth downloading and building/installing the FTLDNS branch on the raspi?
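
For readers who want to check their own board, here is a minimal sketch of pulling the relevant fields out of /proc/cpuinfo. It assumes the Hardware and Revision fields that Raspbian exposes on a Raspberry Pi; cpuinfo_field is a hypothetical helper name, and on other systems the fields may simply be absent.

```shell
#!/bin/sh
# Sketch only: cpuinfo_field is a hypothetical helper name.
# Prints the value of one "Key : value" field from cpuinfo-style text on stdin.
cpuinfo_field() {
    awk -F ':' -v key="$1" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim the field name
            if (name == key) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value) # trim the field value
                print value
                exit
            }
        }'
}

if [ -r /proc/cpuinfo ]; then
    echo "Hardware: $(cpuinfo_field Hardware < /proc/cpuinfo)"
    echo "Revision: $(cpuinfo_field Revision < /proc/cpuinfo)"
fi
```

The revision code still has to be looked up against the Raspberry Pi revision table to get a model name; the sketch does not attempt that mapping.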

DL6ER commented 6 years ago

Building FTLDNS on the Raspberry Pi Gen 1 takes almost an entire hour. You can still try to do this if you like. The steps are (roughly):

  1. Install building dependencies (sudo apt install libgmp-dev m4)
  2. Download libnettle (https://ftp.gnu.org/gnu/nettle/nettle-3.4.tar.gz)
  3. Build libnettle (./configure && make && make install)
  4. Download the source code of FTL (git clone https://github.com/pi-hole/FTL.git)
  5. Check out branch FTLDNS (git checkout FTLDNS)
  6. Compile and install FTLDNS (make && make install)

Feel free to ask any questions. I'd appreciate any feedback concerning the enumeration given above (things I missed, etc.).

I may have a rough idea why it crashes for you and am currently looking into this.

OneRainbowDev commented 6 years ago

I gave it a go, successfully built version v3.0-62-gc5f34a4 (If that means anything). Took about an hour in all.

Your instructions were accurate, just lacking the minor steps. The full command list I used is below from the log:

sudo apt-get update
sudo apt install libgmp-dev m4
wget https://ftp.gnu.org/gnu/nettle/nettle-3.4.tar.gz
tar -xvzf nettle-3.4.tar.gz
cd nettle-3.4
./configure
make
sudo make install 
cd ..
git clone https://github.com/pi-hole/FTL.git
cd FTL
git checkout FTLDNS
make
sudo make install
sudo service pihole-FTL restart

Cleanup

cd ..
rm FTL -r
rm nettle-3.4 -r

Runs fine, so will activate DNSSEC and see what happens.

OneRainbowDev commented 6 years ago

Running with DNSSEC activated seems to be fine... no crashes/errors. So is it something to do with the cross-building?

DL6ER commented 6 years ago

@Mcat12 I think we should investigate how costly it is to compile libnettle along with FTLDNS on the CI.

AzureMarker commented 6 years ago

I'm trying to get a Pi Zero up and running to test this out manually. If the binary works with it dynamically linked, it should work statically as well.
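
One way to tell the two linking modes apart is to look at the binary's shared-library dependencies. The sketch below is only an illustration: uses_shared_nettle is a hypothetical helper, and /usr/bin/pihole-FTL is an assumed install path.

```shell
#!/bin/sh
# Sketch only: uses_shared_nettle is a hypothetical helper name.
# Reads `ldd` output on stdin and succeeds if a shared libnettle is listed.
uses_shared_nettle() {
    grep -q 'libnettle\.so'
}

# Assumed install path; override with FTL_BIN=... if yours differs.
FTL_BIN="${FTL_BIN:-/usr/bin/pihole-FTL}"
if [ -x "$FTL_BIN" ] && command -v ldd >/dev/null 2>&1; then
    if ldd "$FTL_BIN" 2>/dev/null | uses_shared_nettle; then
        echo "libnettle is linked dynamically (system library is used)"
    else
        echo "no shared libnettle dependency (statically linked, or not used)"
    fi
fi
```

A self-compiled binary that picks up the locally built libnettle would be expected to show the first message, while a fully static prebuilt binary would show the second.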

DL6ER commented 6 years ago

@EngieDev a quick update for you: both @Mcat12 and I have now been able to reproduce the exact same crash on two different devices. We are looking at possible solutions in private communication and will report here once we have found one.

frenchja commented 6 years ago

I believe I'm able to replicate the above behavior with a Model B, ARM6.

Token: 9ypjf7i87t. I'll also attempt to manually compile to test the difference in stability.

EDIT: Compiling libnettle along with FTL stabilizes the pihole-FTL process.

AzureMarker commented 6 years ago

@EngieDev @frenchja Please test to see if this build does not crash for you:

pihole checkout ftl fix/dnssec-crash

OneRainbowDev commented 6 years ago

Testing now, will report back later.

OneRainbowDev commented 6 years ago

While it no longer crashes, there is a definite (sudden) slowdown after about an hour, and it keeps getting slower until it stops responding altogether. I will keep testing over the next few days if I can.

DL6ER commented 6 years ago

Is this when running pihole-FTL inside gdb or is this independent? I tried it myself overnight on a fresh RasPi 1B+. While the old binary crashed with DNSSEC enabled, this one seems to work fine, and I wasn't able to reproduce a slowdown even after running it for several hours.

OneRainbowDev commented 6 years ago

It is independent of gdb. It seems to have just been a coincidence with some other issue, as it has not recurred since I restarted the Pi.

arjanvandermeer commented 6 years ago

I discovered the same issue on my Pi Zero WH. I tried the above branch and it indeed no longer crashes once I enable DNSSEC. Will update my setup with this branch and then do some endurance testing over the next few days :)

arjanvandermeer commented 6 years ago

I have had it running for about 24 hours on my home network, with a test every few hours, and the response times did not change. I also did not encounter any other issues.

OneRainbowDev commented 6 years ago

I have had it running on my network for the past couple of days without any issues since the reboot, still no idea what the slowdown issue was caused by.

AzureMarker commented 6 years ago

I updated the libnettle on ftl.pi-hole.net to be the new not-crashing version, so all future builds should use the updated library.

DiJuMx commented 6 years ago

I am also having a similar (if not the same) problem with DNSSEC.

I am on the FTLDNS branch (for core, web, ftl). I am running on a Raspberry Pi B Rev 1 (/proc/cpuinfo reports hardware BCM2835, Revision 0002).

I am also running unbound on the same Pi.

I am able to trigger the crash by trying to resolve i.redd.it.

After crashing, the pihole-FTL service can only be restarted via pihole -r; rebooting, or systemctl restart pihole-FTL, or pihole restartdns, all fail to restart the service.

I have a debug token: mgnw0mh1za

DL6ER commented 6 years ago

@DiJuMx this seems unrelated, please open a new issue ticket

PromoFaux commented 6 years ago

FYI, we've just rebuilt the arm binary of FTLDNS, so this should resolve the issues discussed in this thread. Run pihole -up to get the latest binary.

Make sure you're on the FTLDNS branch and not fix/dnssec-crash! (You can verify this by examining the contents of /etc/pihole/ftlbranch)
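
The branch check mentioned above could be scripted roughly as follows. This is a sketch, assuming the branch file holds just the branch name on a single line (as written by the echo command earlier in the thread); on_ftl_branch is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: on_ftl_branch is a hypothetical helper name.
# Succeeds if the branch file (first argument) names the expected branch (second).
on_ftl_branch() {
    [ -r "$1" ] && [ "$(tr -d '[:space:]' < "$1")" = "$2" ]
}

if on_ftl_branch /etc/pihole/ftlbranch FTLDNS; then
    echo "On the FTLDNS branch; pihole -up should fetch the rebuilt binary."
else
    echo "Not on FTLDNS (or no branch file found). Current contents:"
    cat /etc/pihole/ftlbranch 2>/dev/null || true
fi
```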

AzureMarker commented 6 years ago

Closing this as it was only a bug in the FTLDNS beta and not stable.