pi-hole / FTL

The Pi-hole FTL engine
https://pi-hole.net
Other
1.39k stars 196 forks source link

FTL v54b4ad93 crashes sometimes (several times today) for unknown reasons #2112

Closed schuettecarsten closed 3 days ago

schuettecarsten commented 2 weeks ago

FTL crashed for unknown reasons several times today. Here is a full log;

  [i] Setting up user & group for the pihole user
  [i] PIHOLE_UID not set in environment, using default (100)
  [i] PIHOLE_GID not set in environment, using default (101)

  [i] Starting FTL configuration
  [i] Password already set in config file
  [i] Starting crond for scheduled scripts. Randomizing times for gravity and update checker

  [i] Ensuring logrotate script exists in /etc/pihole

  [i] Gravity migration checks
  [i] Existing gravity database found

  [i] pihole-FTL pre-start checks
  [i] Setting capabilities on pihole-FTL where possible
  [i] Applying the following caps to pihole-FTL:
        * CAP_CHOWN
        * CAP_NET_BIND_SERVICE
        * CAP_NET_RAW
        * CAP_NET_ADMIN
        * CAP_SYS_NICE

  [i] Starting pihole-FTL (no-daemon) as pihole

Core
    Version is 112b961 (Latest: null)
    Branch is development
    Hash is 112b9617 (Latest: 112b9617)
Web
    Version is c689fdc (Latest: null)
    Branch is development
    Hash is c689fdc7 (Latest: c689fdc7)
FTL
    Version is vDev-54b4ad9 (Latest: null)
    Branch is development
    Hash is 54b4ad93 (Latest: 54b4ad93)

2024-11-08 08:24:28.553 CET [52M] INFO: ########## FTL started on 2323f1bec446! ##########
2024-11-08 08:24:28.554 CET [52M] INFO: FTL branch: development
2024-11-08 08:24:28.554 CET [52M] INFO: FTL version: vDev-54b4ad9
2024-11-08 08:24:28.554 CET [52M] INFO: FTL commit: 54b4ad93
2024-11-08 08:24:28.554 CET [52M] INFO: FTL date: 2024-11-07 20:27:52 +0100
2024-11-08 08:24:28.554 CET [52M] INFO: FTL user: pihole
2024-11-08 08:24:28.554 CET [52M] INFO: Compiled for linux/arm64/v8 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
2024-11-08 08:24:28.559 CET [52M] INFO: Wrote config file:
2024-11-08 08:24:28.559 CET [52M] INFO:  - 150 total entries
2024-11-08 08:24:28.559 CET [52M] INFO:  - 132 entries are default
2024-11-08 08:24:28.559 CET [52M] INFO:  - 18 entries are modified
2024-11-08 08:24:28.559 CET [52M] INFO:  - 0 entries are forced through environment
2024-11-08 08:24:28.561 CET [52M] INFO: Parsed config file /etc/pihole/pihole.toml successfully
2024-11-08 08:24:28.566 CET [52M] INFO: PID of FTL process: 52
2024-11-08 08:24:28.567 CET [52M] INFO: listening on 0.0.0.0 port 53
2024-11-08 08:24:28.567 CET [52M] INFO: listening on :: port 53
2024-11-08 08:24:28.569 CET [52M] INFO: PID of FTL process: 52
2024-11-08 08:24:28.571 CET [52M] INFO: Database version is 20
2024-11-08 08:24:28.572 CET [52M] INFO: Database successfully initialized
2024-11-08 08:24:33.978 CET [52M] INFO: Imported 172837 queries from the on-disk database (it has 17677658 rows)
2024-11-08 08:24:33.978 CET [52M] INFO: Parsing queries in database
2024-11-08 08:24:34.070 CET [52M] INFO:   10000 queries parsed...
2024-11-08 08:24:34.130 CET [52M] INFO:   20000 queries parsed...
2024-11-08 08:24:34.187 CET [52M] INFO:   30000 queries parsed...
2024-11-08 08:24:34.256 CET [52M] INFO:   40000 queries parsed...
2024-11-08 08:24:34.336 CET [52M] INFO:   50000 queries parsed...
2024-11-08 08:24:34.401 CET [52M] INFO:   60000 queries parsed...
2024-11-08 08:24:34.471 CET [52M] INFO:   70000 queries parsed...
2024-11-08 08:24:34.538 CET [52M] INFO:   80000 queries parsed...
2024-11-08 08:24:34.619 CET [52M] INFO:   90000 queries parsed...
2024-11-08 08:24:34.691 CET [52M] INFO:   100000 queries parsed...
2024-11-08 08:24:34.762 CET [52M] INFO:   110000 queries parsed...
2024-11-08 08:24:34.836 CET [52M] INFO:   120000 queries parsed...
2024-11-08 08:24:34.908 CET [52M] INFO:   130000 queries parsed...
2024-11-08 08:24:34.970 CET [52M] INFO:   140000 queries parsed...
2024-11-08 08:24:35.033 CET [52M] INFO:   150000 queries parsed...
2024-11-08 08:24:35.099 CET [52M] INFO:   160000 queries parsed...
2024-11-08 08:24:35.164 CET [52M] INFO:   170000 queries parsed...
2024-11-08 08:24:35.188 CET [52M] INFO: Imported 172835 queries from the long-term database
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Total DNS queries: 172835
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Cached DNS queries: 103540
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Forwarded DNS queries: 29905
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Blocked DNS queries: 35721
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Unknown DNS queries: 0
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Unique domains: 2394
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Unique clients: 93
2024-11-08 08:24:35.188 CET [52M] INFO:  -> DNS cache records: 1119
2024-11-08 08:24:35.188 CET [52M] INFO:  -> Known forward destinations: 2
2024-11-08 08:24:35.430 CET [52M] INFO: NTP sync is disabled
2024-11-08 08:24:35.431 CET [52M] INFO: FTL is running as user pihole (UID 100)
2024-11-08 08:24:35.431 CET [52M] INFO: Reading certificate from /etc/pihole/tls.pem ...
2024-11-08 08:24:35.432 CET [52M] INFO: Using SSL/TLS certificate file /etc/pihole/tls.pem
2024-11-08 08:24:35.437 CET [52M] INFO: Restored 0 API sessions from the database
2024-11-08 08:24:35.442 CET [52M] INFO: Blocking status is enabled
2024-11-08 08:24:35.877 CET [52/T157] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 338.9 msec
2024-11-08 08:24:35.883 CET [174/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:35.908 CET [174/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 25.2 msec
2024-11-08 08:24:35.916 CET [165/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:35.936 CET [165/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 19.8 msec
2024-11-08 08:24:35.937 CET [163/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:35.959 CET [163/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 21.8 msec
2024-11-08 08:24:35.960 CET [170/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:35.979 CET [170/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 19.6 msec
2024-11-08 08:24:35.980 CET [166/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.007 CET [166/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 27.3 msec
2024-11-08 08:24:36.009 CET [52M] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.307 CET [52M] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 297.0 msec
2024-11-08 08:24:36.308 CET [169/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.570 CET [169/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 262.3 msec
2024-11-08 08:24:36.571 CET [168/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.599 CET [168/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 27.9 msec
2024-11-08 08:24:36.600 CET [174/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.881 CET [174/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 280.6 msec
2024-11-08 08:24:36.883 CET [175/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.901 CET [175/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 18.4 msec
2024-11-08 08:24:36.914 CET [173/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.933 CET [173/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 18.6 msec
2024-11-08 08:24:36.933 CET [172/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.959 CET [172/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 24.7 msec
2024-11-08 08:24:36.959 CET [174/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:36.981 CET [174/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 21.5 msec
2024-11-08 08:24:36.981 CET [168/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.001 CET [168/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 19.3 msec
2024-11-08 08:24:37.002 CET [52M] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.278 CET [52M] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 276.0 msec
2024-11-08 08:24:37.279 CET [173/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.297 CET [173/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 18.1 msec
2024-11-08 08:24:37.297 CET [175/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.316 CET [175/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 18.1 msec
2024-11-08 08:24:37.317 CET [171/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.338 CET [171/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 21.5 msec
2024-11-08 08:24:37.343 CET [167/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.612 CET [167/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 268.0 msec
2024-11-08 08:24:37.619 CET [164/F52] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:37.888 CET [164/F52] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 268.7 msec
2024-11-08 08:24:37.893 CET [52M] INFO: Reloading externally changed regular expressions
2024-11-08 08:24:38.146 CET [52M] INFO: Compiled 5 allow and 3 deny regex for 93 clients in 253.1 msec
2024-11-08 08:37:21.212 CET [52M] INFO: Rate-limiting 192.168.11.1 for at least 4 seconds
2024-11-08 08:37:25.562 CET [52/T158] INFO: Ending rate-limitation of 192.168.11.1
2024-11-08 08:37:29.178 CET [52M] INFO: Rate-limiting 192.168.11.1 for at least 6 seconds
2024-11-08 08:37:35.562 CET [52/T158] INFO: Ending rate-limitation of 192.168.11.1
2024-11-08 08:37:42.262 CET [52M] INFO: Rate-limiting 192.168.11.1 for at least 3 seconds
2024-11-08 08:37:45.562 CET [52/T158] INFO: Ending rate-limitation of 192.168.11.1
2024-11-08 08:55:51.008 CET [52/T157] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-08 08:55:51.008 CET [52/T157] INFO: ---------------------------->  FTL crashed!  <----------------------------
2024-11-08 08:55:51.008 CET [52/T157] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-08 08:55:51.008 CET [52/T157] INFO: Please report a bug at https://github.com/pi-hole/FTL/issues
2024-11-08 08:55:51.008 CET [52/T157] INFO: and include in your report already the following details:
2024-11-08 08:55:51.008 CET [52/T157] INFO: FTL has been running for 1883 seconds
2024-11-08 08:55:51.009 CET [52/T157] INFO: FTL branch: development
2024-11-08 08:55:51.009 CET [52/T157] INFO: FTL version: vDev-54b4ad9
2024-11-08 08:55:51.009 CET [52/T157] INFO: FTL commit: 54b4ad93
2024-11-08 08:55:51.009 CET [52/T157] INFO: FTL date: 2024-11-07 20:27:52 +0100
2024-11-08 08:55:51.009 CET [52/T157] INFO: FTL user: started as pihole, ended as pihole
2024-11-08 08:55:51.009 CET [52/T157] INFO: Compiled for linux/arm64/v8 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
2024-11-08 08:55:51.009 CET [52/T157] INFO: Process details: MID: 52
2024-11-08 08:55:51.009 CET [52/T157] INFO:                  PID: 52
2024-11-08 08:55:51.009 CET [52/T157] INFO:                  TID: 157
2024-11-08 08:55:51.010 CET [52/T157] INFO:                  Name: database
2024-11-08 08:55:51.010 CET [52/T157] INFO: Received signal: Segmentation fault
2024-11-08 08:55:51.010 CET [52/T157] INFO:      at address: 0x66540700000582
2024-11-08 08:55:51.010 CET [52/T157] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)
2024-11-08 08:55:51.010 CET [52/T157] INFO: !!! INFO: pihole-FTL has not been compiled with glibc/backtrace support, not generating one !!!
2024-11-08 08:55:51.010 CET [52/T157] INFO: ------ Listing content of directory /dev/shm ------
2024-11-08 08:55:51.010 CET [52/T157] INFO: File Mode User:Group      Size  Filename
2024-11-08 08:55:51.010 CET [52/T157] INFO: rwxrwxrwx root:root       360   .
2024-11-08 08:55:51.011 CET [52/T157] INFO: rwxr-xr-x root:root       320   ..
2024-11-08 08:55:51.011 CET [52/T157] INFO: rw------- pihole:pihole    88   FTL-lock
2024-11-08 08:55:51.011 CET [52/T157] INFO: rw------- pihole:pihole   328   FTL-counters
2024-11-08 08:55:51.012 CET [52/T157] INFO: rw------- pihole:pihole   136   FTL-settings
2024-11-08 08:55:51.012 CET [52/T157] INFO: rw------- pihole:pihole   164K  FTL-strings
2024-11-08 08:55:51.012 CET [52/T157] INFO: rw------- pihole:pihole   102K  FTL-domains
2024-11-08 08:55:51.013 CET [52/T157] INFO: rw------- pihole:pihole   348K  FTL-clients
2024-11-08 08:55:51.013 CET [52/T157] INFO: rw------- pihole:pihole    29K  FTL-upstreams
2024-11-08 08:55:51.013 CET [52/T157] INFO: rw------- pihole:pihole    13M  FTL-queries
2024-11-08 08:55:51.013 CET [52/T157] INFO: rw------- pihole:pihole     8K  FTL-overTime
2024-11-08 08:55:51.014 CET [52/T157] INFO: rw------- pihole:pihole   102K  FTL-dns-cache
2024-11-08 08:55:51.014 CET [52/T157] INFO: rw------- pihole:pihole     4K  FTL-per-client-regex
2024-11-08 08:55:51.014 CET [52/T157] INFO: rw------- pihole:pihole   569K  FTL-fifo-log
2024-11-08 08:55:51.015 CET [52/T157] INFO: rw------- pihole:pihole     4K  FTL-clients-lookup
2024-11-08 08:55:51.015 CET [52/T157] INFO: rw------- pihole:pihole    29K  FTL-domains-lookup
2024-11-08 08:55:51.015 CET [52/T157] INFO: rw------- pihole:pihole    20K  FTL-dns-cache-lookup
2024-11-08 08:55:51.015 CET [52/T157] INFO: rw------- pihole:pihole   786K  FTL-recycler
2024-11-08 08:55:51.016 CET [52/T157] INFO: ---------------------------------------------------
2024-11-08 08:55:51.016 CET [52/T157] INFO: Please also include some lines from above the !!!!!!!!! header.
2024-11-08 08:55:51.016 CET [52/T157] INFO: Thank you for helping us to improve our FTL engine!
2024-11-08 08:55:51.016 CET [52/T157] INFO: Waiting for threads to join
2024-11-08 08:55:51.038 CET [52/T160] INFO: Terminating timer thread
2024-11-08 08:55:51.362 CET [52/T159] INFO: Terminating resolver thread
2024-11-08 08:55:51.682 CET [52/T158] INFO: Terminating GC thread
2024-11-08 08:55:53.016 CET [52/T157] INFO: Thread database (0) is still busy, cancelling it.
2024-11-08 08:55:53.017 CET [52M] ERROR: Error when obtaining outer SHM lock: Previous owner died
2024-11-08 08:55:53.017 CET [52M] ERROR: Error when obtaining inner SHM lock: Previous owner died
/bin/bash: line 1:    52 Segmentation fault      /usr/bin/pihole-FTL no-daemon > /dev/null
2024-11-08 08:58:41.583 CET [52/T224] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-08 08:58:41.583 CET [52/T224] INFO: ---------------------------->  FTL crashed!  <----------------------------
2024-11-08 08:58:41.583 CET [52/T224] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-08 08:58:41.583 CET [52/T224] INFO: Please report a bug at https://github.com/pi-hole/FTL/issues
2024-11-08 08:58:41.584 CET [52/T224] INFO: and include in your report already the following details:
2024-11-08 08:58:41.584 CET [52/T224] INFO: FTL has been running for 2053 seconds
2024-11-08 08:58:41.584 CET [52/T224] INFO: FTL branch: development
2024-11-08 08:58:41.584 CET [52/T224] INFO: FTL version: vDev-54b4ad9
2024-11-08 08:58:41.584 CET [52/T224] INFO: FTL commit: 54b4ad93
2024-11-08 08:58:41.584 CET [52/T224] INFO: FTL date: 2024-11-07 20:27:52 +0100
2024-11-08 08:58:41.584 CET [52/T224] INFO: FTL user: started as pihole, ended as pihole
2024-11-08 08:58:41.584 CET [52/T224] INFO: Compiled for linux/arm64/v8 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
2024-11-08 08:58:41.584 CET [52/T224] INFO: Process details: MID: 52
2024-11-08 08:58:41.585 CET [52/T224] INFO:                  PID: 52
2024-11-08 08:58:41.585 CET [52/T224] INFO:                  TID: 224
2024-11-08 08:58:41.585 CET [52/T224] INFO:                  Name: civetweb-worker
2024-11-08 08:58:41.585 CET [52/T224] INFO: Received signal: Segmentation fault
2024-11-08 08:58:41.585 CET [52/T224] INFO:      at address: 0x66540700000582
2024-11-08 08:58:41.585 CET [52/T224] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)
2024-11-08 08:58:41.585 CET [52/T224] INFO: !!! INFO: pihole-FTL has not been compiled with glibc/backtrace support, not generating one !!!
2024-11-08 08:58:41.585 CET [52/T224] INFO: ------ Listing content of directory /dev/shm ------
2024-11-08 08:58:41.586 CET [52/T224] INFO: File Mode User:Group      Size  Filename
2024-11-08 08:58:41.586 CET [52/T224] INFO: rwxrwxrwx root:root       360   .
2024-11-08 08:58:41.586 CET [52/T224] INFO: rwxr-xr-x root:root       320   ..
2024-11-08 08:58:41.586 CET [52/T224] INFO: rw------- pihole:pihole    88   FTL-lock
2024-11-08 08:58:41.587 CET [52/T224] INFO: rw------- pihole:pihole   328   FTL-counters
2024-11-08 08:58:41.587 CET [52/T224] INFO: rw------- pihole:pihole   136   FTL-settings
2024-11-08 08:58:41.587 CET [52/T224] INFO: rw------- pihole:pihole   164K  FTL-strings
2024-11-08 08:58:41.587 CET [52/T224] INFO: rw------- pihole:pihole   102K  FTL-domains
2024-11-08 08:58:41.588 CET [52/T184] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-08 08:58:41.588 CET [52/T224] INFO: rw------- pihole:pihole   348K  FTL-clients
2024-11-08 08:58:41.588 CET [52/T184] INFO: ---------------------------->  FTL crashed!  <----------------------------
2024-11-08 08:58:41.588 CET [52/T184] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-08 08:58:41.588 CET [52/T184] INFO: Please report a bug at https://github.com/pi-hole/FTL/issues
2024-11-08 08:58:41.588 CET [52/T184] INFO: and include in your report already the following details:
2024-11-08 08:58:41.588 CET [52/T224] INFO: rw------- pihole:pihole    29K  FTL-upstreams
2024-11-08 08:58:41.588 CET [52/T184] INFO: FTL has been running for 2053 seconds
2024-11-08 08:58:41.588 CET [52/T184] INFO: FTL branch: development
2024-11-08 08:58:41.588 CET [52/T184] INFO: FTL version: vDev-54b4ad9
2024-11-08 08:58:41.588 CET [52/T184] INFO: FTL commit: 54b4ad93
2024-11-08 08:58:41.588 CET [52/T224] INFO: rw------- pihole:pihole    13M  FTL-queries
2024-11-08 08:58:41.588 CET [52/T184] INFO: FTL date: 2024-11-07 20:27:52 +0100
2024-11-08 08:58:41.588 CET [52/T184] INFO: FTL user: started as pihole, ended as pihole
2024-11-08 08:58:41.589 CET [52/T184] INFO: Compiled for linux/arm64/v8 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
2024-11-08 08:58:41.589 CET [52/T224] INFO: rw------- root:pihole       8K  FTL-overTime
2024-11-08 08:58:41.589 CET [52/T184] INFO: Process details: MID: 52
2024-11-08 08:58:41.589 CET [52/T184] INFO:                  PID: 52
2024-11-08 08:58:41.589 CET [52/T184] INFO:                  TID: 184
2024-11-08 08:58:41.589 CET [52/T184] INFO:                  Name: civetweb-worker
2024-11-08 08:58:41.589 CET [52/T224] INFO: rw------- pihole:pihole   102K  FTL-dns-cache
2024-11-08 08:58:41.589 CET [52/T184] INFO: Received signal: Segmentation fault
2024-11-08 08:58:41.589 CET [52/T184] INFO:      at address: 0x66540700000582
2024-11-08 08:58:41.589 CET [52/T184] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)
2024-11-08 08:58:41.589 CET [52/T184] INFO: !!! INFO: pihole-FTL has not been compiled with glibc/backtrace support, not generating one !!!
2024-11-08 08:58:41.589 CET [52/T224] INFO: rw------- pihole:pihole     4K  FTL-per-client-regex
2024-11-08 08:58:41.589 CET [52/T184] INFO: ------ Listing content of directory /dev/shm ------
2024-11-08 08:58:41.589 CET [52/T184] INFO: File Mode User:Group      Size  Filename
2024-11-08 08:58:41.589 CET [52/T224] INFO: rw------- pihole:pihole   569K  FTL-fifo-log
2024-11-08 08:58:41.590 CET [52/T184] INFO: rwxrwxrwx root:root       360   .

  [i] pihole-FTL exited with status 139
DL6ER commented 2 weeks ago

Could you please run pihole-FTL connected to a debugger so we can find out where inside pihole-FTL this crash happens?

https://deploy-preview-338--pihole-docs.netlify.app/ftldns/debugging/

has all the details incl. a special section how to do it inside a docker container.

bermeitinger-b commented 1 week ago

I can see the same happening both for nightly and development tags.

This is what is shown after the mentioned debugging procedure inside the docker container:

Thread 1 "pihole-FTL" received signal SIG37, Real-time event 37.
__cp_end () at src/thread/x86_64/syscall_cp.s:29
29      in src/thread/x86_64/syscall_cp.s
(gdb) backtrace
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00000000007f19e3 in __syscall_cp_c (nr=7, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, 
    y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#2  0x00000000007ea619 in poll (fds=<optimized out>, n=<optimized out>, timeout=timeout@entry=-1) at src/select/poll.c:9
#3  0x00000000004e9762 in poll (__s=-1, __n=<optimized out>, __f=<optimized out>) at /usr/include/fortify/poll.h:40
#4  0x00000000004bafbc in main_dnsmasq (argc=<optimized out>, argv=<optimized out>) at /app/src/dnsmasq/dnsmasq.c:1183
#5  0x00000000004020d7 in main (argc=<optimized out>, argv=0x7ffe637f30e8) at /app/src/main.c:123

It doesn't look very helpful.

DL6ER commented 1 week ago

Seems, you need to need to add SIG37 nostop similar to the other signals in the documentation to skip this useless signal. You are perfectly right about it not being useful.

bermeitinger-b commented 1 week ago

I have another with

Thread 4 "housekeeper" received signal SIG33, Real-time event 33.
[Switching to LWP 113]
__cp_end () at src/thread/x86_64/syscall_cp.s:29
29      in src/thread/x86_64/syscall_cp.s
(gdb) backtrace
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00000000007f19e3 in __syscall_cp_c (nr=23, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, 
    y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#2  0x00000000007ea700 in select (n=n@entry=0, rfds=rfds@entry=0x0, wfds=wfds@entry=0x0, efds=efds@entry=0x0, tv=tv@entry=0x734f1354ba00)
    at src/select/select.c:39
#3  0x00000000006c0e29 in FTLselect (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, 
    exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x734f1354ba00, file=file@entry=0x80069f "/app/src/timers.c", 
    func=0x867d40 <__FUNCTION__.0> "sleepms", line=65) at /app/src/syscalls/select.c:25
#4  0x000000000042695a in sleepms (milliseconds=<optimized out>) at /app/src/timers.c:65
#5  0x0000000000426661 in thread_sleepms (thread=thread@entry=GC, milliseconds=<optimized out>) at /app/src/signals.c:496
#6  0x0000000000418357 in GC_thread (val=<optimized out>) at /app/src/gc.c:645
#7  0x00000000007f26d0 in start (p=0x734f1354bb00) at src/thread/pthread_create.c:207
#8  0x00000000007f3d5c in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC
DL6ER commented 1 week ago

Okay, so SIG33 is another one for the nostop here. Sorry about that. What we are looking for is a SIGSEGV

pralor-bot commented 1 week ago

This issue has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/pi-hole-stopped-responding/73931/4

bermeitinger-b commented 1 week ago

Thanks for your patience. However, when also adding SIG33 nostop, gdb doesn't stop at all:

[Detaching after fork from child process 197]

Thread 1 "pihole-FTL" received signal SIG37, Real-time event 37.

Thread 1 "pihole-FTL" received signal SIGTERM, Terminated.

Thread 1 "pihole-FTL" received signal SIG41, Real-time event 41.
[LWP 136 exited]
[LWP 139 exited]

Thread 2 "ntp-client" received signal SIG33, Real-time event 33.

Thread 4 "housekeeper" received signal SIG33, Real-time event 33.

Thread 5 "dns-client" received signal SIG33, Real-time event 33.
[LWP 137 exited]
[LWP 138 exited]
[LWP 135 exited]
[LWP 141 exited]
[LWP 140 exited]
[Inferior 1 (process 48) exited with code 01]
DL6ER commented 1 week ago

This means there was no crash. Why FTL exited with exit code 01 will be found in /var/log/Pi-hole/FTL.log. Unfortunately, the debugger cannot see this.

bermeitinger-b commented 1 week ago

This is what the log contains:

2024-11-11 10:02:01.564 CET [3270/F52] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-11 10:02:01.564 CET [3270/F52] INFO: ---------------------------->  FTL crashed!  <----------------------------
2024-11-11 10:02:01.564 CET [3270/F52] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2024-11-11 10:02:01.564 CET [3270/F52] INFO: Please report a bug at https://github.com/pi-hole/FTL/issues
2024-11-11 10:02:01.564 CET [3270/F52] INFO: and include in your report already the following details:
2024-11-11 10:02:01.564 CET [3270/F52] INFO: FTL has been running for 4195 seconds
2024-11-11 10:02:01.564 CET [3270/F52] INFO: FTL branch: development
2024-11-11 10:02:01.565 CET [3270/F52] INFO: FTL version: vDev-848367f
2024-11-11 10:02:01.565 CET [3270/F52] INFO: FTL commit: 848367fa
2024-11-11 10:02:01.565 CET [3270/F52] INFO: FTL date: 2024-11-08 19:13:07 +0100
2024-11-11 10:02:01.565 CET [3270/F52] INFO: FTL user: started as pihole, ended as pihole
2024-11-11 10:02:01.565 CET [3270/F52] INFO: Compiled for linux/amd64 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
2024-11-11 10:02:01.565 CET [3270/F52] INFO: Process details: MID: 52
2024-11-11 10:02:01.565 CET [3270/F52] INFO:                  PID: 3270
2024-11-11 10:02:01.565 CET [3270/F52] INFO:                  TID: 3270
2024-11-11 10:02:01.565 CET [3270/F52] INFO:                  Name: pihole-FTL
2024-11-11 10:02:01.565 CET [3270/F52] INFO: Received signal: Segmentation fault
2024-11-11 10:02:01.565 CET [3270/F52] INFO:      at address: 0x100e00003c08
2024-11-11 10:02:01.565 CET [3270/F52] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)
2024-11-11 10:02:01.565 CET [3270/F52] INFO: !!! INFO: pihole-FTL has not been compiled with glibc/backtrace support, not generating one !!!
2024-11-11 10:02:01.565 CET [3270/F52] INFO: ------ Listing content of directory /dev/shm ------
2024-11-11 10:02:01.565 CET [3270/F52] INFO: File Mode User:Group      Size  Filename
2024-11-11 10:02:01.565 CET [3270/F52] INFO: rwxrwxrwx root:root       360   .
2024-11-11 10:02:01.565 CET [3270/F52] INFO: rwxr-xr-x root:root       340   ..
2024-11-11 10:02:01.565 CET [3270/F52] INFO: rw------- pihole:pihole    88   FTL-lock
2024-11-11 10:02:01.565 CET [3270/F52] INFO: rw------- pihole:pihole   328   FTL-counters
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole   136   FTL-settings
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole   123K  FTL-strings
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole    78K  FTL-domains
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole   696K  FTL-clients
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole    29K  FTL-upstreams
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole     5M  FTL-queries
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole     8K  FTL-overTime
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole   123K  FTL-dns-cache
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole     8K  FTL-per-client-regex
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole   569K  FTL-fifo-log
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole     8K  FTL-clients-lookup
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole    20K  FTL-domains-lookup
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole    16K  FTL-dns-cache-lookup
2024-11-11 10:02:01.566 CET [3270/F52] INFO: rw------- pihole:pihole   786K  FTL-recycler
2024-11-11 10:02:01.566 CET [3270/F52] INFO: ---------------------------------------------------
2024-11-11 10:02:01.566 CET [3270/F52] INFO: Please also include some lines from above the !!!!!!!!! header.
2024-11-11 10:02:01.567 CET [3270/F52] INFO: Thank you for helping us to improve our FTL engine!
2024-11-11 10:02:01.567 CET [3270/F52] INFO: Asking parent pihole-FTL (PID 52) to shut down
2024-11-11 10:02:01.567 CET [3270/F52] INFO: FTL fork terminated!
2024-11-11 10:02:01.567 CET [52M] INFO: Received: RT37 (37 -> 2)
2024-11-11 10:02:01.570 CET [52M] INFO: Asked to terminate by "/usr/bin/pihole-FTL no-daemon" (PID 52, user pihole UID 100)
2024-11-11 10:02:01.594 CET [52/T117] INFO: Terminating database thread
2024-11-11 10:02:01.611 CET [52/T120] INFO: Terminating timer thread
2024-11-11 10:02:01.977 CET [52/T119] INFO: Terminating resolver thread
2024-11-11 10:02:02.009 CET [52M] INFO: Finished final database update
2024-11-11 10:02:02.009 CET [52M] INFO: Waiting for threads to join
2024-11-11 10:02:02.009 CET [52M] INFO: Thread housekeeper (1) is idle, terminating it.
2024-11-11 10:02:02.009 CET [52M] INFO: Thread ntp-client (4) is idle, terminating it.
2024-11-11 10:02:02.009 CET [52M] INFO: All threads joined
2024-11-11 10:02:02.018 CET [52M] INFO: Stored 0 API sessions in the database
2024-11-11 10:02:02.682 CET [52M] INFO: ########## FTL terminated after 1h 9m 56s  (code 1)! ##########

Note, that this happens on the current development image, as well as yesterday's and today's nightly image. The log above is from the latest nightly because while it crashes, FTL recovers quicker while development does not and sometimes takes down the container.

DL6ER commented 1 week ago
3270/F52

from e.g. the first line says the crash happened in process with PID 3270 which is a Fork of process 52. Unfortunately, gdb cannot really trace forks, see https://sourceware.org/gdb/current/onlinedocs/gdb.html/Forks.html

On most systems, GDB has no special support for debugging programs which create additional processes using the fork function. When a program forks, GDB will continue to debug the parent process and the child process will run unimpeded. [...]

´The first crash you have reported here had the crash in 52/T158 which is the original processes' thread with ID 158. gdb would have caught another crash therein.

According to the link above, Linux supports setting

set follow-fork-mode child

allowing gdb to follow the forked child. But if FTL crashes in the main process in between, we will not have caught this...

edit Clarified above and suggested a different command to run

DL6ER commented 1 week ago
set detach-on-fork off

might work, too, but I am not sure if suspending the main process is still a healthy system. It may have other consequences. Easiest would probably be waiting until we get a crash not in a fork so gdb can provide all the necessary information for us.

bermeitinger-b commented 1 week ago

With set detach-on-fork off

(gdb)  set detach-on-fork off
(gdb) continue
Continuing.
[New inferior 2 (process 2592)]
Can not resume the parent process over vfork in the foreground while
holding the child stopped.  Try "set detach-on-fork" or "set schedule-multiple".
[Switching to LWP 153]
__clone () at src/thread/x86_64/clone.s:18
warning: 18     src/thread/x86_64/clone.s: No such file or directory
(gdb) backtrace
#0  __clone () at src/thread/x86_64/clone.s:18
#1  0x00000000007fa8fc in posix_spawn (res=res@entry=0x7dd145a616a4, path=path@entry=0x80fc5e "/bin/sh", fa=fa@entry=0x7dd145a616d0,
    attr=attr@entry=0x0, argv=argv@entry=0x7dd145a616b0, envp=<optimized out>) at src/process/posix_spawn.c:198
#2  0x00000000007eca1e in popen (cmd=cmd@entry=0x7dd145a61802 "ip neigh show", mode=mode@entry=0x80efa9 "r") at src/stdio/popen.c:41
#3  0x000000000049db1e in parse_neighbor_cache (db=0x7dd14c672ca8) at /app/src/database/network-table.c:1307
#4  0x00000000004904ec in DB_thread (val=<optimized out>) at /app/src/database/database-thread.c:199
#5  0x00000000007f26d0 in start (p=0x7dd145a61b00) at src/thread/pthread_create.c:207
#6  0x00000000007f3d5c in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC

With set follow-fork-mode child:

(gdb) continue
Continuing.
[Attaching after LWP 49 fork to child process 179]
[New inferior 2 (process 179)]

Thread 2.1 "pihole-FTL" received signal SIGPIPE, Broken pipe.
[Inferior 2 (process 179) exited normally]
schuettecarsten commented 1 week ago

Maybe it's a good idea to create a development build that includes debug symbols just to have a valid stack trace output?

bermeitinger-b commented 1 week ago

I've also seen that the Network overview show this: Image

schuettecarsten commented 1 week ago

Here is "my" crash:


 2024-11-11 11:15:42.529 CET [712/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.535 CET [713/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.536 CET [712/F49] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 2024-11-11 11:15:42.537 CET [712/F49] INFO: ---------------------------->  FTL crashed!  <----------------------------
 2024-11-11 11:15:42.537 CET [712/F49] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 2024-11-11 11:15:42.537 CET [712/F49] INFO: Please report a bug at https://github.com/pi-hole/FTL/issues
 2024-11-11 11:15:42.538 CET [712/F49] INFO: and include in your report already the following details:
 2024-11-11 11:15:42.540 CET [712/F49] INFO: FTL has been running for 1016 seconds
 2024-11-11 11:15:42.540 CET [712/F49] INFO: FTL branch: development
 2024-11-11 11:15:42.541 CET [712/F49] INFO: FTL version: vDev-848367f
 2024-11-11 11:15:42.541 CET [712/F49] INFO: FTL commit: 848367fa
 2024-11-11 11:15:42.541 CET [712/F49] INFO: FTL date: 2024-11-08 19:13:07 +0100
 2024-11-11 11:15:42.542 CET [712/F49] INFO: FTL user: started as pihole, ended as pihole
 2024-11-11 11:15:42.542 CET [714/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.542 CET [712/F49] INFO: Compiled for linux/arm64/v8 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
 2024-11-11 11:15:42.543 CET [712/F49] INFO: Process details: MID: 49
 2024-11-11 11:15:42.543 CET [712/F49] INFO:                  PID: 712
 2024-11-11 11:15:42.543 CET [712/F49] INFO:                  TID: 712
 2024-11-11 11:15:42.543 CET [712/F49] INFO:                  Name: pihole-FTL
 2024-11-11 11:15:42.544 CET [712/F49] INFO: Received signal: Segmentation fault
 2024-11-11 11:15:42.544 CET [712/F49] INFO:      at address: 0x4c0cc200000765
 2024-11-11 11:15:42.544 CET [712/F49] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)
 2024-11-11 11:15:42.545 CET [712/F49] INFO: !!! INFO: pihole-FTL has not been compiled with glibc/backtrace support, not generating one !!!
 2024-11-11 11:15:42.545 CET [712/F49] INFO: ------ Listing content of directory /dev/shm ------
 2024-11-11 11:15:42.546 CET [712/F49] INFO: File Mode User:Group      Size  Filename
 2024-11-11 11:15:42.546 CET [712/F49] INFO: rwxrwxrwx root:root       360   .
 2024-11-11 11:15:42.547 CET [712/F49] INFO: rwxr-xr-x root:root       320   ..
 2024-11-11 11:15:42.547 CET [712/F49] INFO: rw------- pihole:pihole    88   FTL-lock
 2024-11-11 11:15:42.547 CET [712/F49] INFO: rw------- pihole:pihole   328   FTL-counters
 2024-11-11 11:15:42.548 CET [712/F49] INFO: rw------- pihole:pihole   136   FTL-settings
 2024-11-11 11:15:42.548 CET [712/F49] INFO: rw------- pihole:pihole   164K  FTL-strings
 2024-11-11 11:15:42.549 CET [712/F49] INFO: rw------- pihole:pihole   115K  FTL-domains
 2024-11-11 11:15:42.549 CET [712/F49] INFO: rw------- pihole:pihole   348K  FTL-clients
 2024-11-11 11:15:42.550 CET [712/F49] INFO: rw------- pihole:pihole    29K  FTL-upstreams
 2024-11-11 11:15:42.550 CET [712/F49] INFO: rw------- pihole:pihole    15M  FTL-queries
 2024-11-11 11:15:42.550 CET [712/F49] INFO: rw------- pihole:pihole     8K  FTL-overTime
 2024-11-11 11:15:42.550 CET [715/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.551 CET [712/F49] INFO: rw------- pihole:pihole   102K  FTL-dns-cache
 2024-11-11 11:15:42.551 CET [712/F49] INFO: rw------- pihole:pihole     4K  FTL-per-client-regex
 2024-11-11 11:15:42.551 CET [712/F49] INFO: rw------- pihole:pihole   569K  FTL-fifo-log
 2024-11-11 11:15:42.552 CET [712/F49] INFO: rw------- pihole:pihole     4K  FTL-clients-lookup
 2024-11-11 11:15:42.552 CET [712/F49] INFO: rw------- pihole:pihole    29K  FTL-domains-lookup
 2024-11-11 11:15:42.552 CET [712/F49] INFO: rw------- pihole:pihole    20K  FTL-dns-cache-lookup
 2024-11-11 11:15:42.553 CET [712/F49] INFO: rw------- pihole:pihole   786K  FTL-recycler
 2024-11-11 11:15:42.553 CET [712/F49] INFO: ---------------------------------------------------
 2024-11-11 11:15:42.553 CET [712/F49] INFO: Please also include some lines from above the !!!!!!!!! header.
 2024-11-11 11:15:42.553 CET [712/F49] INFO: Thank you for helping us to improve our FTL engine!
 2024-11-11 11:15:42.554 CET [712/F49] INFO: Asking parent pihole-FTL (PID 49) to shut down
 2024-11-11 11:15:42.554 CET [712/F49] INFO: FTL fork terminated!
 2024-11-11 11:15:42.555 CET [713/F49] ERROR: Error when obtaining outer SHM lock: Previous owner died
 2024-11-11 11:15:42.555 CET [713/F49] ERROR: Error when obtaining inner SHM lock: Previous owner died
 2024-11-11 11:15:42.556 CET [49/T152] INFO: Received: RT37 (37 -> 2)
 2024-11-11 11:15:42.556 CET [716/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.557 CET [713/F49] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 2024-11-11 11:15:42.558 CET [713/F49] INFO: ---------------------------->  FTL crashed!  <----------------------------
 2024-11-11 11:15:42.558 CET [713/F49] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 2024-11-11 11:15:42.558 CET [713/F49] INFO: Please report a bug at https://github.com/pi-hole/FTL/issues
 2024-11-11 11:15:42.558 CET [713/F49] INFO: and include in your report already the following details:
 2024-11-11 11:15:42.559 CET [713/F49] INFO: FTL has been running for 1016 seconds
 2024-11-11 11:15:42.559 CET [713/F49] INFO: FTL branch: development
 2024-11-11 11:15:42.559 CET [713/F49] INFO: FTL version: vDev-848367f
 2024-11-11 11:15:42.559 CET [713/F49] INFO: FTL commit: 848367fa
 2024-11-11 11:15:42.560 CET [713/F49] INFO: FTL date: 2024-11-08 19:13:07 +0100
 2024-11-11 11:15:42.560 CET [713/F49] INFO: FTL user: started as pihole, ended as pihole
 2024-11-11 11:15:42.561 CET [713/F49] INFO: Compiled for linux/arm64/v8 (compiled on CI) using cc (Alpine 13.2.1_git20240309) 13.2.1 20240309
 2024-11-11 11:15:42.561 CET [713/F49] INFO: Process details: MID: 49
 2024-11-11 11:15:42.561 CET [713/F49] INFO:                  PID: 713
 2024-11-11 11:15:42.562 CET [713/F49] INFO:                  TID: 713
 2024-11-11 11:15:42.562 CET [713/F49] INFO:                  Name: pihole-FTL
 2024-11-11 11:15:42.556 CET [49/T152] DEBUG_ANY: Received SIGTERM
 2024-11-11 11:15:42.562 CET [713/F49] INFO: Received signal: Segmentation fault
 2024-11-11 11:15:42.563 CET [713/F49] INFO:      at address: 0x4c0cc100000764
 2024-11-11 11:15:42.563 CET [717/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.563 CET [49/T152] INFO: Asked to terminate by "/usr/bin/pihole-FTL no-daemon" (PID 49, user pihole UID 100)
 2024-11-11 11:15:42.563 CET [713/F49] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)
 2024-11-11 11:15:42.563 CET [713/F49] INFO: !!! INFO: pihole-FTL has not been compiled with glibc/backtrace support, not generating one !!!
 2024-11-11 11:15:42.563 CET [49/T152] DEBUG_ANY: Sending SIGUSR6 to dnsmasq to stop DNS service
 2024-11-11 11:15:42.563 CET [713/F49] INFO: ------ Listing content of directory /dev/shm ------
 2024-11-11 11:15:42.564 CET [713/F49] INFO: File Mode User:Group      Size  Filename
 2024-11-11 11:15:42.564 CET [49M] DEBUG_ANY: dnsmasq received signal 41
 2024-11-11 11:15:42.564 CET [713/F49] INFO: rwxrwxrwx root:root       360   .
 2024-11-11 11:15:42.565 CET [713/F49] INFO: rwxr-xr-x root:root       320   ..
 2024-11-11 11:15:42.565 CET [713/F49] INFO: rw------- pihole:pihole    88   FTL-lock
 2024-11-11 11:15:42.565 CET [713/F49] INFO: rw------- pihole:pihole    88   FTL-lock
 2024-11-11 11:15:42.566 CET [713/F49] INFO: rw------- pihole:pihole   328   FTL-counters
 2024-11-11 11:15:42.566 CET [713/F49] INFO: rw------- pihole:pihole   136   FTL-settings
 2024-11-11 11:15:42.566 CET [713/F49] INFO: rw------- pihole:pihole   164K  FTL-strings
 2024-11-11 11:15:42.567 CET [713/F49] INFO: rw------- pihole:pihole   115K  FTL-domains
 2024-11-11 11:15:42.567 CET [713/F49] INFO: rw------- pihole:pihole   348K  FTL-clients
 2024-11-11 11:15:42.568 CET [713/F49] INFO: rw------- pihole:pihole    29K  FTL-upstreams
 2024-11-11 11:15:42.568 CET [713/F49] INFO: rw------- pihole:pihole    15M  FTL-queries
 2024-11-11 11:15:42.569 CET [713/F49] INFO: rw------- pihole:pihole     8K  FTL-overTime
 2024-11-11 11:15:42.569 CET [713/F49] INFO: rw------- pihole:pihole   102K  FTL-dns-cache
 2024-11-11 11:15:42.570 CET [713/F49] INFO: rw------- pihole:pihole     4K  FTL-per-client-regex
 2024-11-11 11:15:42.570 CET [713/F49] INFO: rw------- pihole:pihole   569K  FTL-fifo-log
 2024-11-11 11:15:42.570 CET [713/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.572 CET [717/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.572 CET [714/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.578 CET [718/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.589 CET [715/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.596 CET [716/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.596 CET [49M] DEBUG_ANY: Shutting down... // exit code 1 // jmpret 0
 2024-11-11 11:15:42.596 CET [719/F49] DEBUG_ANY: Reopening Gravity database for this fork
 2024-11-11 11:15:42.606 CET [719/F49] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:42.611 CET [49/T155] INFO: Terminating timer thread
 2024-11-11 11:15:42.670 CET [49/T152] INFO: Terminating database thread
 2024-11-11 11:15:42.726 CET [49/T154] INFO: Terminating resolver thread
 2024-11-11 11:15:43.209 CET [49/T153] INFO: Terminating GC thread
 2024-11-11 11:15:43.522 CET [49M] INFO: Finished final database update
 2024-11-11 11:15:43.522 CET [49M] INFO: Waiting for threads to join
 2024-11-11 11:15:43.522 CET [49M] INFO: All threads joined
 2024-11-11 11:15:43.523 CET [49M] DEBUG_ANY: Closing gravity database
 2024-11-11 11:15:43.526 CET [49M] INFO: Stored 1 API session in the database
 2024-11-11 11:15:44.366 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0x7fad93c000 == 0x7fb750a000
 2024-11-11 11:15:44.366 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0x7fb74a3000 == 0x7fb750a000
 2024-11-11 11:15:44.366 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0x7faeed9000 == 0x7fb750a000
 2024-11-11 11:15:44.366 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0x7fb749c000 == 0x7fb750a000
 2024-11-11 11:15:44.367 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0x7faedd1000 == 0x7fb750a000
 2024-11-11 11:15:44.367 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7fb750a000
 2024-11-11 11:15:44.367 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7fb750a000
 2024-11-11 11:15:44.367 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb750a000
 2024-11-11 11:15:44.367 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb750a000
 2024-11-11 11:15:44.367 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7fb750a000
 2024-11-11 11:15:44.368 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0x7fad93c000 == 0x7faf276000
 2024-11-11 11:15:44.368 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0x7fb74a3000 == 0x7faf276000
 2024-11-11 11:15:44.368 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0x7faeed9000 == 0x7faf276000
 2024-11-11 11:15:44.368 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0x7fb749c000 == 0x7faf276000
 2024-11-11 11:15:44.368 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0x7faedd1000 == 0x7faf276000
 2024-11-11 11:15:44.369 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7faf276000
 2024-11-11 11:15:44.369 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7faf276000
 2024-11-11 11:15:44.369 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7faf276000
 2024-11-11 11:15:44.369 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7faf276000
 2024-11-11 11:15:44.369 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7faf276000
 2024-11-11 11:15:44.370 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0x7fad93c000 == 0x7fb7508000
 2024-11-11 11:15:44.370 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0x7fb74a3000 == 0x7fb7508000
 2024-11-11 11:15:44.370 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0x7faeed9000 == 0x7fb7508000
 2024-11-11 11:15:44.370 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0x7fb749c000 == 0x7fb7508000
 2024-11-11 11:15:44.370 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0x7faedd1000 == 0x7fb7508000
 2024-11-11 11:15:44.371 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7fb7508000
 2024-11-11 11:15:44.371 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7fb7508000
 2024-11-11 11:15:44.371 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb7508000
 2024-11-11 11:15:44.371 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb7508000
 2024-11-11 11:15:44.371 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7fb7508000
 2024-11-11 11:15:44.371 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0x7fad93c000 == 0x7faeed9000
 2024-11-11 11:15:44.372 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0x7fb74a3000 == 0x7faeed9000
 2024-11-11 11:15:44.372 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0x7faeed9000 == 0x7faeed9000
 2024-11-11 11:15:44.372 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-domains at 0x7faeed9000
 2024-11-11 11:15:44.372 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0x7fad93c000 == 0x7fb74a3000
 2024-11-11 11:15:44.372 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0x7fb74a3000 == 0x7fb74a3000
 2024-11-11 11:15:44.373 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-clients at 0x7fb74a3000
 2024-11-11 11:15:44.373 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0x7fad93c000 == 0x7fad93c000
 2024-11-11 11:15:44.373 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-queries at 0x7fad93c000
 2024-11-11 11:15:44.377 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb749c000
 2024-11-11 11:15:44.378 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb749c000
 2024-11-11 11:15:44.378 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb749c000
 2024-11-11 11:15:44.378 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0x7fb749c000 == 0x7fb749c000
 2024-11-11 11:15:44.378 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-upstreams at 0x7fb749c000
 2024-11-11 11:15:44.378 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb7452000
 2024-11-11 11:15:44.379 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb7452000
 2024-11-11 11:15:44.379 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb7452000
 2024-11-11 11:15:44.379 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb7452000
 2024-11-11 11:15:44.379 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0x7faedd1000 == 0x7fb7452000
 2024-11-11 11:15:44.379 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7fb7452000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb7503000
 2024-11-11 11:15:44.380 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb7503000
 2024-11-11 11:15:44.381 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb7503000
 2024-11-11 11:15:44.381 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb7503000
 2024-11-11 11:15:44.381 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0x7faedd1000 == 0x7fb7503000
 2024-11-11 11:15:44.381 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7fb7503000
 2024-11-11 11:15:44.381 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7fb7503000
 2024-11-11 11:15:44.382 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb7503000
 2024-11-11 11:15:44.382 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb7503000
 2024-11-11 11:15:44.382 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7fb7503000
 2024-11-11 11:15:44.382 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7faedd1000
 2024-11-11 11:15:44.382 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7faedd1000
 2024-11-11 11:15:44.383 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7faedd1000
 2024-11-11 11:15:44.383 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7faedd1000
 2024-11-11 11:15:44.383 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0x7faedd1000 == 0x7faedd1000
 2024-11-11 11:15:44.383 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-dns-cache at 0x7faedd1000
 2024-11-11 11:15:44.383 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb744c000
 2024-11-11 11:15:44.384 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb744c000
 2024-11-11 11:15:44.384 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb744c000
 2024-11-11 11:15:44.384 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb744c000
 2024-11-11 11:15:44.384 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0 == 0x7fb744c000
 2024-11-11 11:15:44.384 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7fb744c000
 2024-11-11 11:15:44.384 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7fb744c000
 2024-11-11 11:15:44.385 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb744c000
 2024-11-11 11:15:44.385 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb744c000
 2024-11-11 11:15:44.385 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7fb744c000
 2024-11-11 11:15:44.385 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb73c1000
 2024-11-11 11:15:44.385 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb73c1000
 2024-11-11 11:15:44.386 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb73c1000
 2024-11-11 11:15:44.386 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb73c1000
 2024-11-11 11:15:44.386 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0 == 0x7fb73c1000
 2024-11-11 11:15:44.386 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0x7fb73c1000 == 0x7fb73c1000
 2024-11-11 11:15:44.386 CET [49M] DEBUG_SHMEM: Unmapping global pointer /FTL-fifo-log at 0x7fb73c1000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0x7fb73c0000 == 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-clients-lookup at 0x7fb73c0000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0x7fb7466000 == 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-domains-lookup at 0x7fb7466000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb748e000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb748e000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb748e000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb748e000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0 == 0x7fb748e000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0 == 0x7fb748e000
 2024-11-11 11:15:44.387 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0 == 0x7fb748e000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0 == 0x7fb748e000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0x7fb748e000 == 0x7fb748e000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-dns-cache-lookup at 0x7fb748e000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 0: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 1: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 2: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 3: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 4: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 5: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 6: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 7: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 8: 0 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Pointer comparison at pos. 9: 0x7fb72fe000 == 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] DEBUG_SHMEM: Unmapping global pointer FTL-recycler at 0x7fb72fe000
 2024-11-11 11:15:44.388 CET [49M] INFO: ########## FTL terminated after 16m 57s  (code 1)! ##########

   [i] pihole-FTL exited with status 1
DL6ER commented 1 week ago

Maybe it's a good idea to create a development build that includes debug symbols just to have a valid stack trace output?

We already have the full set of debug symbols in the release builds, otherwise, you would not have seen the function names and related code lines in:

#0  __clone () at src/thread/x86_64/clone.s:18
#1  0x00000000007fa8fc in posix_spawn (res=res@entry=0x7dd145a616a4, path=path@entry=0x80fc5e "/bin/sh", fa=fa@entry=0x7dd145a616d0,
    attr=attr@entry=0x0, argv=argv@entry=0x7dd145a616b0, envp=<optimized out>) at src/process/posix_spawn.c:198
#2  0x00000000007eca1e in popen (cmd=cmd@entry=0x7dd145a61802 "ip neigh show", mode=mode@entry=0x80efa9 "r") at src/stdio/popen.c:41
#3  0x000000000049db1e in parse_neighbor_cache (db=0x7dd14c672ca8) at /app/src/database/network-table.c:1307
#4  0x00000000004904ec in DB_thread (val=<optimized out>) at /app/src/database/database-thread.c:199
#5  0x00000000007f26d0 in start (p=0x7dd145a61b00) at src/thread/pthread_create.c:207
#6  0x00000000007f3d5c in __clone () at src/thread/x86_64/clone.s:22

But this also shows that following forks/children won't work for us here. The location at where you entered backtrace was where the child had been created. Seems you'd need to run continue here for each forking. This is highly impractical...

@schuettecarsten Could you attach the debugger as well? This latest crash was in a fork, too, however, the first one (the very first post in this issue ticket) was in a normal thread of the main process (gdb would have caught that).

bermeitinger-b commented 1 week ago

Maybe here:

#0  sig_handler (sig=17) at /app/src/dnsmasq/dnsmasq.c:1312
#1  <signal handler called>
#2  __syscall_cp_c (nr=202, u=139057914187780, v=0, w=-2147483495, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61
#3  0x00000000007fbad5 in __futex4_cp (to=<optimized out>, val=-2147483495, op=0, addr=0x7e78f1941004) at src/thread/__timedwait.c:24
#4  __timedwait_cp (addr=addr@entry=0x7e78f1941004, val=val@entry=-2147483495, clk=clk@entry=0, at=at@entry=0x0, priv=priv@entry=0)
    at src/thread/__timedwait.c:52
#5  0x00000000007fbba2 in __timedwait (addr=addr@entry=0x7e78f1941004, val=-2147483495, clk=clk@entry=0, at=at@entry=0x0, priv=priv@entry=0)
    at src/thread/__timedwait.c:68
#6  0x00000000007f33e4 in __pthread_mutex_timedlock (m=0x7e78f1941000, at=at@entry=0x0) at src/thread/pthread_mutex_timedlock.c:85
#7  0x00000000007f3226 in __pthread_mutex_lock (m=m@entry=0x7e78f1941000) at src/thread/pthread_mutex_lock.c:9
#8  0x00000000006c0b28 in FTLpthread_mutex_lock (__mutex=0x7e78f1941000, file=file@entry=0x800301 "/app/src/shmem.c", 
    func=func@entry=0x867c78 <__FUNCTION__.3> "_lock_shm", line=line@entry=351) at /app/src/syscalls/pthread_mutex_lock.c:23
#9  0x0000000000424586 in _lock_shm (func=func@entry=0x866560 <__FUNCTION__.2> "FTL_dnsmasq_log", line=line@entry=3687, 
    file=file@entry=0x7fec34 "/app/src/dnsmasq_interface.c") at /app/src/shmem.c:351
#10 0x000000000041367e in FTL_dnsmasq_log (payload=payload@entry=0x7ffd68bbe710 "query[A] pi.hole from 127.0.0.1", length=32)
    at /app/src/dnsmasq_interface.c:3687
#11 0x00000000004d4056 in my_syslog (priority=priority@entry=6, format=format@entry=0x806830 "%s %s%s%s %s%s") at /app/src/dnsmasq/log.c:330
#12 0x00000000004b15eb in _log_query (flags=flags@entry=524424, name=0x7e78f17331c0 "pi.hole", addr=addr@entry=0x7ffd68bbed64, 
    arg=<optimized out>, type=<optimized out>, file=file@entry=0x8072a6 "/app/src/dnsmasq/forward.c", line=1820)
    at /app/src/dnsmasq/cache.c:2374
#13 0x00000000004cba6d in _log_query_mysockaddr (line=1820, type=<optimized out>, arg=<optimized out>, addr=0x7ffd68bbed60, 
    name=<optimized out>, flags=524296) at /app/src/dnsmasq/forward.c:139
#14 _log_query_mysockaddr (line=1820, type=<optimized out>, arg=<optimized out>, addr=0x7ffd68bbed60, name=<optimized out>, flags=524296)
    at /app/src/dnsmasq/forward.c:133
#15 receive_query (listen=listen@entry=0x7e78f1948e00, now=now@entry=1731357142) at /app/src/dnsmasq/forward.c:1820
#16 0x00000000004b9063 in check_dns_listeners (now=now@entry=1731357142) at /app/src/dnsmasq/dnsmasq.c:1895
#17 0x00000000004bb278 in main_dnsmasq (argc=<optimized out>, argv=<optimized out>) at /app/src/dnsmasq/dnsmasq.c:1278
#18 0x00000000004020d7 in main (argc=<optimized out>, argv=0x7ffd68bbf158) at /app/src/main.c:123

or

#0  sig_handler (sig=17) at /app/src/dnsmasq/dnsmasq.c:1312
warning: 1312   /app/src/dnsmasq/dnsmasq.c: No such file or directory
(gdb) backtrace
#0  sig_handler (sig=17) at /app/src/dnsmasq/dnsmasq.c:1312
#1  <signal handler called>
#2  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#3  0x00000000007f19e3 in __syscall_cp_c (nr=0, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, 
    y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#4  0x00000000007f52b3 in read (fd=fd@entry=47, buf=buf@entry=0x7ffd68bbeed3, count=count@entry=1) at src/unistd/read.c:6
#5  0x0000000000506e8e in read (__n=<optimized out>, __s=<optimized out>, __f=47) at /usr/include/fortify/unistd.h:118
#6  read_write (fd=47, packet=packet@entry=0x7ffd68bbeed3 "", size=size@entry=1, rw=rw@entry=1) at /app/src/dnsmasq/util.c:780
#7  0x00000000004b8ff0 in check_dns_listeners (now=now@entry=1731357142) at /app/src/dnsmasq/dnsmasq.c:2012
#8  0x00000000004bb278 in main_dnsmasq (argc=<optimized out>, argv=<optimized out>) at /app/src/dnsmasq/dnsmasq.c:1278
#9  0x00000000004020d7 in main (argc=<optimized out>, argv=0x7ffd68bbf158) at /app/src/main.c:123

or

#0  sig_handler (sig=17) at /app/src/dnsmasq/dnsmasq.c:1312
#1  <signal handler called>
#2  __restore_sigs (set=set@entry=0x7ffd68bbedf0) at ./arch/x86_64/syscall_arch.h:40
#3  0x00000000007e97bf in fork () at src/process/fork.c:86
#4  0x00000000004b8fab in check_dns_listeners (now=now@entry=1731357268) at /app/src/dnsmasq/dnsmasq.c:1989
#5  0x00000000004bb278 in main_dnsmasq (argc=<optimized out>, argv=<optimized out>) at /app/src/dnsmasq/dnsmasq.c:1278
#6  0x00000000004020d7 in main (argc=<optimized out>, argv=0x7ffd68bbf158) at /app/src/main.c:123

I'm not experienced with gdb. Thank you for your patience. If these backtraces are not helpful, I apologize

DL6ER commented 1 week ago

Signal 17 is SIGCHLD and gdb shouldn't have stopped for this one. I have not seen this signal myself being used inside pihole-FTL but I guess it is another one to be added with nostop to your /root/.gdbinit file ... I'm still convinced we will finally get there!

bermeitinger-b commented 1 week ago

Yes, unfortunately, the crash does not appear right away. Maybe I'll catch it.

In the meantime, I suggest looking at parsing the network clients. The above screenshot goes on for many clients: Image

My Pihole is set up such that it only has 2 clients: 2 reverse proxies for DoH and DoT.

The Query Log shows the client only N. (Previously, it was the IP.)

DL6ER commented 1 week ago

@bermeitinger-b I don't think there is anything wrong with the network table TBH, my current assumption is that somehow handling of the strings got broken and your database got populated with garbage data. This would likely explain the crash you are seeing, too. But I am still not sure how/where it happens. I have meanwhile manually inspected the entire string processing code paths twice and did not find anything odd.

Please also enable DEBUG_SHMEM logging by running sudo pihole-FTL --config debug.shmem true - it will provide us with greater detail to FTL.log how strings are stored/reused. This should help us finding the issue.

bermeitinger-b commented 1 week ago

Here an excerpt after flushing the network table:

pihole  | 2024-11-12 09:54:47.521 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.521 CET [49/T155] DEBUG_SHMEM: Adding "localhost.lan" (len 14) to buffer in resolveAndAddHostname() (src/resolve.c:797), next_str_pos is 54983
pihole  | 2024-11-12 09:54:47.522 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.522 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^P
pihole  | 2024-11-12 09:54:47.523 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.523 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: 
pihole  | 2024-11-12 09:54:47.523 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.524 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: .        2.16.0.1
pihole  | 2024-11-12 09:54:47.524 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.524 CET [49/T155] DEBUG_SHMEM: Adding "pi.hole" (len 8) to buffer in resolveAndAddHostname() (src/resolve.c:797), next_str_pos is 54997
pihole  | 2024-11-12 09:54:47.526 CET [49/T155] DEBUG_SHMEM: Reusing existing string "localhost.lan" at 54983 in resolveAndAddHostname() (src/resolve.c:797)
pihole  | 2024-11-12 09:54:47.526 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: sunny-portal.de
pihole  | 2024-11-12 09:54:47.526 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.526 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: rtal.de
pihole  | 2024-11-12 09:54:47.527 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.527 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname:  �       
pihole  | 2024-11-12 09:54:47.527 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.527 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname:  �       ~^F
pihole  | 2024-11-12 09:54:47.528 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.528 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^M�      ^?^F
pihole  | 2024-11-12 09:54:47.528 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.528 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^M�      
pihole  | 2024-11-12 09:54:47.529 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.529 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^M�      �^F
pihole  | 2024-11-12 09:54:47.529 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.529 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^Q�      �^F
pihole  | 2024-11-12 09:54:47.530 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.530 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^Q�      �^F
pihole  | 2024-11-12 09:54:47.530 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.530 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^Q�      
pihole  | 2024-11-12 09:54:47.531 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.531 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^W�      �^F
pihole  | 2024-11-12 09:54:47.531 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.532 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^W�      
pihole  | 2024-11-12 09:54:47.532 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.532 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^Q�      �^F
pihole  | 2024-11-12 09:54:47.533 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.533 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^Z�      �^F
pihole  | 2024-11-12 09:54:47.533 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.533 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^Z�      �^F
pihole  | 2024-11-12 09:54:47.534 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.534 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: ^W�      �^F
pihole  | 2024-11-12 09:54:47.534 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
pihole  | 2024-11-12 09:54:47.534 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname:  
pihole  | 2024-11-12 09:54:47.535 CET [49/T155] DEBUG_SHMEM: Not adding "" to buffer (unchanged)
DL6ER commented 1 week ago

Yes, this brings us a lot closer. Please restart pihole-FTL if you haven't done already after running the --config command above. This ensures all string processing will be printed right from the start of the pihole-FTL process. If you then again encounter such lines, e.g.

pihole  | 2024-11-12 09:54:47.524 CET [49/T155] WARNING: Invalid IPv4 address when trying to resolve hostname: .        2.16.0.1

you can try to search for the first occurrence of this strange string like

grep ".        2.16.0.1" /var/log/pihole/FTL.log
DL6ER commented 1 week ago

It'd also be helpful if you could run

sudo kill -s RTMIN+7 $(cat /run/pihole-FTL.pid)

and see if any errors regarding hash tables are reported in /var/log/pihole/FTL.log (can take a few seconds after running the command above)

bermeitinger-b commented 1 week ago

Thanks.

I've restarted and it prints a lot of Adding "<domain>" (len x) to buffer and seemingly for all domains I've visited (?). My shell highlights the E in this line:

2024-11-12 10:16:47.165 CET [49M] DEBUG_SHMEM: Adding "E" (len 2) to buffer in _findUpstreamID() (src/datastructure.c:174), nex
t_str_pos is 4687

Typically, this is an indicator that it's not a literal E. file FTL.log also shows data instead of ASCII text.

After many of these lines, it looks the output changes:

2024-11-12 10:16:47.356 CET [49M] DEBUG_SHMEM: Adding "personalsafety-pa.googleapis.com" (len 33) to buffer in _findDomainID() 
(src/datastructure.c:270), next_str_pos is 45448
2024-11-12 10:16:47.356 CET [49M] DEBUG_SHMEM: Reusing existing string "access-point.cloudmessaging.edge.microsoft.com" at 1451
7 in _findDomainID() (src/datastructure.c:270)
2024-11-12 10:16:47.356 CET [49M] DEBUG_SHMEM: Reusing existing string "sunny-portal.de" at 363 in _findClientID() (src/datastr
ucture.c:356)
2024-11-12 10:16:47.356 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.357 CET [49M] DEBUG_SHMEM: Reusing existing string "rtal.de" at 371 in _findClientID() (src/datastructure.c
:356)
2024-11-12 10:16:47.357 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.357 CET [49M] DEBUG_SHMEM: Adding "F" (len 2) to buffer in _findDomainID() (src/datastructure.c:270), next_
str_pos is 45481
2024-11-12 10:16:47.358 CET [49M] DEBUG_SHMEM: Adding "         " (len 4) to buffer in _findClientID() (src/datastructure.c:356
), next_str_pos is 45483
2024-11-12 10:16:47.358 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.358 CET [49M] DEBUG_SHMEM: Adding "         ~F" (len 6) to buffer in _findClientID() (src/datastructure.c:3
56), next_str_pos is 45487
2024-11-12 10:16:47.358 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.359 CET [49M] DEBUG_SHMEM: Adding "M        ?F" (len 6) to buffer in _findClientID() (src/datastructure.c:3
56), next_str_pos is 45493
2024-11-12 10:16:47.359 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.359 CET [49M] DEBUG_SHMEM: Adding "M        " (len 4) to buffer in _findClientID() (src/datastructure.c:356
), next_str_pos is 45499
2024-11-12 10:16:47.360 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.360 CET [49M] DEBUG_SHMEM: Adding "M        F" (len 6) to buffer in _findClientID() (src/datastructure.c:3
56), next_str_pos is 45503
2024-11-12 10:16:47.360 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 0
2024-11-12 10:16:47.361 CET [49M] DEBUG_SHMEM: Adding "Q        F" (len 6) to buffer in _findClientID() (src/datastructure.c:3
56), next_str_pos is 45509

About the hash collisions:

2024-11-12 10:30:27.814 CET [49M] INFO: Received: RT42 (42 -> 7)                                                               
2024-11-12 10:30:28.150 CET [49/T113] INFO: Searching for hash collisions in clients lookup table                              
2024-11-12 10:30:28.150 CET [49/T113] INFO: Found 0 hash collisions and 0 sorting errors in clients lookup table (scanned 100 e
lements)                                                                                                                       
2024-11-12 10:30:28.150 CET [49/T113] INFO: Searching for hash collisions in domains lookup table                              
2024-11-12 10:30:28.150 CET [49/T113] INFO: Found 0 hash collisions and 0 sorting errors in domains lookup table (scanned 2221 
elements)                                                                                                                      
2024-11-12 10:30:28.150 CET [49/T113] INFO: Searching for hash collisions in DNS cache lookup table                            
2024-11-12 10:30:28.150 CET [49/T113] INFO: Found 0 hash collisions and 0 sorting errors in DNS cache lookup table (scanned 137
0 elements)                                                                                                                    
2024-11-12 10:30:28.150 CET [49/T113] INFO: Recycle list fullness:                                                             
2024-11-12 10:30:28.150 CET [49/T113] INFO:   Clients: 0/65535 (0.00%)
2024-11-12 10:30:28.150 CET [49/T113] INFO:   Domains: 0/65535 (0.00%)
2024-11-12 10:30:28.150 CET [49/T113] INFO:   DNS Cache: 0/65535 (0.00%)

Seems okay, however, I'm not sure why it scans 100 clients. There should only be 3 (localhost, DoH, DoT).

DL6ER commented 1 week ago

Looking at the logs, we are in a bit of a chicken-and-egg-problem as the database already has some broken stuff which then "contaminates" FTL's strings on the history import during startup. I'd suggest to disable database importing for the moment,

sudo pihole-FTL --config database.DBimport false

and then restart again. You may also want to clean out the log file first to get rid of the binary stuff that is in there right now.

edit The 100 clients come about because FTL detects clients based on their (string) IP address and when the IP address is garbage, a new client is added for each new garbage string.

bermeitinger-b commented 1 week ago

I've deleted the old database and started new. It generated the following log:

pihole  | 2024-11-12 14:19:15.726 CET [49M] DEBUG_ANY: dnsmasq received signal 17                                                                                                             
pihole  | 2024-11-12 14:19:15.934 CET [49M] DEBUG_SHMEM: Reusing existing string "eth0" at 13244 in _FTL_new_query() (src/dnsmasq_interface.c:906)                                            
pihole  | 2024-11-12 14:19:15.935 CET [49M] DEBUG_SHMEM: Reusing existing string "0" at 12 in get_client_groupids() (src/database/gravity-db.c:730)                                           
pihole  | 2024-11-12 14:19:15.935 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 24                                                                                                           
pihole  | 2024-11-12 14:19:15.936 CET [49M] DEBUG_SHMEM: Reusing existing string "0" at 12 in get_client_groupids() (src/database/gravity-db.c:730)                                           
pihole  | 2024-11-12 14:19:15.936 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 24                                                                                                           
pihole  | 2024-11-12 14:19:15.937 CET [49M] DEBUG_SHMEM: Reusing existing string "0" at 12 in get_client_groupids() (src/database/gravity-db.c:730)                                           
pihole  | 2024-11-12 14:19:15.938 CET [49M] DEBUG_SHMEM: LCM(4096, 1) == 4096 >= 24                                                                                                           
pihole  | 2024-11-12 14:19:16.183 CET [49M] DEBUG_SHMEM: Adding "star-mini.c10r.facebook.com" (len 28) to buffer in _findDomainID() (src/datastructure.c:270), next_str_pos is 27632          
pihole  | 2024-11-12 14:19:16.966 CET [49M] DEBUG_SHMEM: Reusing existing string "www.youtube.com" at 3122 in _findDomainID() (src/datastructure.c:270)                                       
pihole  | 2024-11-12 14:19:17.039 CET [49/T85] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!                                                               
pihole  | 2024-11-12 14:19:17.039 CET [49/T85] INFO: ---------------------------->  FTL crashed!  <----------------------------                                                               
pihole  | 2024-11-12 14:19:17.039 CET [49/T85] INFO: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

The rest is the same as in the first logs above. As above, it did not restart automatically.

The log file is not binary anymore, so the network table is looking correctly.

DL6ER commented 1 week ago

Did you have the debugger gdb attached? How does the backtrace look like on this crash?

bermeitinger-b commented 1 week ago

Maybe:

Thread 1 "pihole-FTL" received signal SIGSEGV, Segmentation fault.
0x0000000000559fed in pcache1FetchNoMutex (createFlag=2, iKey=1, p=0x75dbc36e5cb8)
    at /app/src/database/sqlite3.c:56250
warning: 56250  /app/src/database/sqlite3.c: No such file or directory
(gdb) bt
#0  0x0000000000559fed in pcache1FetchNoMutex (createFlag=2, iKey=1, p=0x75dbc36e5cb8)
    at /app/src/database/sqlite3.c:56250
#1  pcache1Fetch (p=0x75dbc36e5cb8, iKey=1, createFlag=2) at /app/src/database/sqlite3.c:56307
#2  0x00000000005d873d in sqlite3PcacheFetch (createFlag=3, pgno=1, pCache=<optimized out>)
    at /app/src/database/sqlite3.c:54725
#3  getPageNormal (pPager=0x75dbc3658448, pgno=1, ppPage=0x7ffdbb3923d0, flags=0)
    at /app/src/database/sqlite3.c:62716
#4  0x00000000005e20cf in sqlite3PagerGet (flags=0, ppPage=0x7ffdbb3923d0, pgno=1, 
    pPager=<optimized out>) at /app/src/database/sqlite3.c:62908
#5  btreeGetPage (flags=0, ppPage=<synthetic pointer>, pgno=1, pBt=0x75dbc3397a08)
    at /app/src/database/sqlite3.c:72974
#6  lockBtree (pBt=<optimized out>) at /app/src/database/sqlite3.c:73918
#7  btreeBeginTrans (p=0x75dbc36f8c98, wrflag=0, pSchemaVersion=0x7ffdbb392530)
    at /app/src/database/sqlite3.c:74313
#8  0x000000000063ff0c in sqlite3VdbeExec (p=p@entry=0x75dbc363da58)
    at /app/src/database/sqlite3.c:97438
#9  0x0000000000646ae0 in sqlite3Step (p=0x75dbc363da58) at /app/src/database/sqlite3.c:91357
#10 sqlite3_step (pStmt=pStmt@entry=0x75dbc363da58) at /app/src/database/sqlite3.c:91418
#11 0x00000000004910c7 in domain_in_list (
    domain=domain@entry=0x75dbc364acf0 "1.0.16.172.in-addr.arpa", stmt=0x75dbc363da58, 
    listname=listname@entry=0x805baf "whitelist", domain_id=domain_id@entry=0x75dbc2942fe8)
    at /app/src/database/gravity-db.c:1180
#12 0x0000000000493d28 in in_allowlist (
    domain=domain@entry=0x75dbc364acf0 "1.0.16.172.in-addr.arpa", 
    dns_cache=dns_cache@entry=0x75dbc2942fd8, client=client@entry=0x75dbc2d0eab0)
    at /app/src/database/gravity-db.c:1282
#13 0x000000000040bfcc in FTL_check_blocking (queryID=queryID@entry=10847, 
    domainID=domainID@entry=39, clientID=clientID@entry=1022) at /app/src/dnsmasq_interface.c:1618
#14 0x000000000040f17f in _FTL_new_query (flags=flags@entry=524296, name=<optimized out>, 
    addr=addr@entry=0x7ffdbb392970, arg=<optimized out>, qtype=<optimized out>, id=33202, 
    proto=UDP, file=0x8072a6 "/app/src/dnsmasq/forward.c", line=1822)
    at /app/src/dnsmasq_interface.c:960
#15 0x00000000004cb49f in receive_query (listen=listen@entry=0x75dbc36fce00, 
    now=now@entry=1731420001) at /app/src/dnsmasq/forward.c:1822
#16 0x00000000004b9063 in check_dns_listeners (now=now@entry=1731420001)

and right after:

#0  releaseMemArray (p=0x75dbc36e4920, N=1) at /app/src/database/sqlite3.c:87133
#1  0x000000000056614e in releaseMemArray (N=<optimized out>, p=<optimized out>)
    at /app/src/database/sqlite3.c:87130
#2  sqlite3VdbeClearObject (p=0x75dbc3643268, db=0x75dbc365a018)
    at /app/src/database/sqlite3.c:88686
#3  sqlite3VdbeDelete (p=0x75dbc3643268) at /app/src/database/sqlite3.c:88723
#4  0x00000000005ea36d in sqlite3_finalize (pStmt=0x75dbc3643268)
    at /app/src/database/sqlite3.c:90625
#5  0x0000000000490a32 in gravityDB_finalize_client_statements (client=0x75dbc28302a8)
    at /app/src/database/gravity-db.c:925
#6  gravityDB_close () at /app/src/database/gravity-db.c:964
#7  0x000000000049365e in gravityDB_close () at /app/src/database/gravity-db.c:956
#8  0x0000000000408451 in cleanup (ret=ret@entry=1) at /app/src/daemon.c:374
#9  0x0000000000425ac8 in signal_handler (sig=<optimized out>, si=0x7ffdbb391eb0, 
    unused=<optimized out>) at /app/src/signals.c:266
#10 <signal handler called>
#11 0x0000000000559fed in pcache1FetchNoMutex (createFlag=2, iKey=1, p=0x75dbc36e5cb8)
    at /app/src/database/sqlite3.c:56250
#12 pcache1Fetch (p=0x75dbc36e5cb8, iKey=1, createFlag=2) at /app/src/database/sqlite3.c:56307
#13 0x00000000005d873d in sqlite3PcacheFetch (createFlag=3, pgno=1, pCache=<optimized out>)
    at /app/src/database/sqlite3.c:54725
#14 getPageNormal (pPager=0x75dbc3658448, pgno=1, ppPage=0x7ffdbb3923d0, flags=0)
    at /app/src/database/sqlite3.c:62716
#15 0x00000000005e20cf in sqlite3PagerGet (flags=0, ppPage=0x7ffdbb3923d0, pgno=1, 
    pPager=<optimized out>) at /app/src/database/sqlite3.c:62908
#16 btreeGetPage (flags=0, ppPage=<synthetic pointer>, pgno=1, pBt=0x75dbc3397a08)
    at /app/src/database/sqlite3.c:72974
#17 lockBtree (pBt=<optimized out>) at /app/src/database/sqlite3.c:73918
#18 btreeBeginTrans (p=0x75dbc36f8c98, wrflag=0, pSchemaVersion=0x7ffdbb392530)
    at /app/src/database/sqlite3.c:74313
#19 0x000000000063ff0c in sqlite3VdbeExec (p=p@entry=0x75dbc363da58)
    at /app/src/database/sqlite3.c:97438
#20 0x0000000000646ae0 in sqlite3Step (p=0x75dbc363da58) at /app/src/database/sqlite3.c:91357
#21 sqlite3_step (pStmt=pStmt@entry=0x75dbc363da58) at /app/src/database/sqlite3.c:91418
#22 0x00000000004910c7 in domain_in_list (
    domain=domain@entry=0x75dbc364acf0 "1.0.16.172.in-addr.arpa", stmt=0x75dbc363da58, 
    listname=listname@entry=0x805baf "whitelist", domain_id=domain_id@entry=0x75dbc2942fe8)
    at /app/src/database/gravity-db.c:1180
#23 0x0000000000493d28 in in_allowlist (
    domain=domain@entry=0x75dbc364acf0 "1.0.16.172.in-addr.arpa", 
    dns_cache=dns_cache@entry=0x75dbc2942fd8, client=client@entry=0x75dbc2d0eab0)
    at /app/src/database/gravity-db.c:1282
#24 0x000000000040bfcc in FTL_check_blocking (queryID=queryID@entry=10847, 
    domainID=domainID@entry=39, clientID=clientID@entry=1022) at /app/src/dnsmasq_interface.c:1618
#25 0x000000000040f17f in _FTL_new_query (flags=flags@entry=524296, name=<optimized out>, 
    addr=addr@entry=0x7ffdbb392970, arg=<optimized out>, qtype=<optimized out>, id=33202, 
    proto=UDP, file=0x8072a6 "/app/src/dnsmasq/forward.c", line=1822)
    at /app/src/dnsmasq_interface.c:960
#26 0x00000000004cb49f in receive_query (listen=listen@entry=0x75dbc36fce00, 
    now=now@entry=1731420001) at /app/src/dnsmasq/forward.c:1822
#27 0x00000000004b9063 in check_dns_listeners (now=now@entry=1731420001)
    at /app/src/dnsmasq/dnsmasq.c:1895
#28 0x00000000004bb278 in main_dnsmasq (argc=<optimized out>, argv=<optimized out>)
    at /app/src/dnsmasq/dnsmasq.c:1278
#29 0x00000000004020d7 in main (argc=<optimized out>, argv=0x7ffdbb392d68) at /app/src/main.c:123

Then, the whole container dies.

Edit: After some time, the glyphs reappear in the network table. I don't have a backtrace for those but will check the log.

DL6ER commented 1 week ago

Edit: After some time, the glyphs reappear in the network table. I don't have a backtrace for those but will check the log.

Did anything come out here? I've been looking at the backtraces yesterday but it was too late to reply. Now, looking on them, they do not make a lot more sense TBH. Both crashes happened deeply down inside the embedded sqlite3 source code Image

but we don't know if this is cause or merely symptom of the string issues causing sqlite3 to trip over binary code in strings. Usually, it shouldn't but reporting sqlite3 bugs is a nightmare. Unless one can provide a clean and very small minimal-working-example (MWE) of the bug, they will not even try to investigate and lean back on "it is the application that is embedded sqlite3 that is causing issues".

We haven't been able to provide such a MWE for the one bug we reported back then but - fortunately - someone else did for some bug apparently caused by the same cause this allowed us to get the fix. Having said all this, I don't want to end in something like "sorry, we cannot do anything about it" but I furthermore don't see us really be able to do SQLite3 debugging - even worse if I cannot reproduce and "play" myself.

So let's look back at pihole-FTL and try to find a fix there. Did you already see the glyphs in the log file yesterday before the crash? If so, then this is definitely the next thing we need to fix. If you have the log file, I could provide you with an upload link via e-mail as I assume it'll be too large, even after gziping it. Just send me a quick mail to <my username here> @ pi-hole.net and I'll reply with a suitable link. Otherwise, if you want to share it via something, I'd be ready for downloading it from there, too.

bermeitinger-b commented 1 week ago

I fully understand and I'm thankful for your deep investigation. I can reproduce the crash by deleting pihole-FTL.db and recreate the container. This is the log after a few hours of running with at least 5 crashes. For personally identifying reasons, I've dropped a few rows and anonymized some private domains. They were correct domains.

https://0x0.st/Xk_3.log

DL6ER commented 1 week ago

Sorry for the delay, but I had a hard time thinking about what we could do. I did now prepare a new special branch debug/no_fork which detects the presence of an external debugger and prevents TCP forking in this case. Please check out this branch and continue testing. gdb should now be able to follow all the crashes

bermeitinger-b commented 1 week ago

Okay, thank you very much. How can I run this inside docker?

DL6ER commented 1 week ago

@bermeitinger-b Unfortunately, for docker it means a few extra steps. You'll have to build your own container like

git clone https://github.com/pi-hole/docker-pi-hole
cd docker-pi-hole
git checkout development # Future readers: This step is only needed until Pi-hole v6 is released
./build.sh -f debug/no_fork

and then using the generated image pihole:local instead of pihole/pihole:development (or :nightly) should be enough.

bermeitinger-b commented 1 week ago

Thanks, this worked perfectly and I'm running the debugging fork now. The log confirms this:

2024-11-15 14:00:15.669 CET [49M] INFO: ########## FTL started on 6c68c417781b! ##########
2024-11-15 14:00:15.669 CET [49M] INFO: FTL branch: debug/no_fork 
2024-11-15 14:00:15.669 CET [49M] INFO: FTL version: vDev-0d7e5ec
2024-11-15 14:00:15.669 CET [49M] INFO: FTL commit: 0d7e5ec7 
2024-11-15 14:00:15.669 CET [49M] INFO: FTL date: 2024-11-13 13:36:50 +0100
2024-11-15 14:00:15.669 CET [49M] INFO: FTL user: pihole
...
2024-11-15 14:00:54.900 CET [49/T137] INFO: Debugger attached (192: gdb), entering dnsmasq debug mode
...

In gdb, it shows many lines with

Thread 1 "pihole-FTL" received signal SIGPIPE, Broken pipe.

This is a SIGSEV that I've observed (right after I've restarted the upstream resolver, might be unrelated)

#0  get_meta (p=p@entry=0x7e8c21c3cff0 "\020\021") at src/malloc/mallocng/meta.h:141
#1  0x00000000007dfe62 in __libc_free (p=0x7e8c21c3cff0) at src/malloc/mallocng/free.c:105
#2  0x00000000005507c8 in pcache1FreePage (p=<optimized out>) at /app/src/database/sqlite3.c:55720
#3  pcache1TruncateUnsafe (pCache=pCache@entry=0x7e8c21d3a618, iLimit=iLimit@entry=1) at /app/src/database/sqlite3.c:55918
#4  0x0000000000550814 in pcache1Truncate (iLimit=1, p=0x7e8c21d3a618) at /app/src/database/sqlite3.c:56399
#5  pcache1Truncate (p=0x7e8c21d3a618, iLimit=<optimized out>) at /app/src/database/sqlite3.c:56395
#6  0x00000000005d716e in sqlite3PcacheClear (pCache=<optimized out>) at /app/src/database/sqlite3.c:55037
#7  pager_reset (pPager=0x7e8c21d3a7d8) at /app/src/database/sqlite3.c:58967
#8  pagerBeginReadTransaction (pPager=<optimized out>) at /app/src/database/sqlite3.c:60442
#9  sqlite3PagerSharedLock (pPager=<optimized out>) at /app/src/database/sqlite3.c:62608
#10 0x00000000005e26c9 in lockBtree (pBt=<optimized out>) at /app/src/database/sqlite3.c:73916
#11 btreeBeginTrans (p=0x7e8c22077b58, wrflag=1, pSchemaVersion=0x7e8c205467f0) at /app/src/database/sqlite3.c:74313
#12 0x000000000064052c in sqlite3VdbeExec (p=p@entry=0x7e8c21d4cf28) at /app/src/database/sqlite3.c:97438
#13 0x0000000000647100 in sqlite3Step (p=0x7e8c21d4cf28) at /app/src/database/sqlite3.c:91357
#14 sqlite3_step (pStmt=0x7e8c21d4cf28) at /app/src/database/sqlite3.c:91418
#15 0x00000000004a2489 in export_queries_to_disk (final=final@entry=false) at /app/src/database/query-table.c:625
#16 0x000000000049096a in DB_thread (val=<optimized out>) at /app/src/database/database-thread.c:143
#17 0x00000000007f2d10 in start (p=0x7e8c20546b00) at src/thread/pthread_create.c:207
#18 0x00000000007f439c in __clone () at src/thread/x86_64/clone.s:22
DL6ER commented 1 week ago

Okay, so this confirms a crash again deep inside sqlite3. A crash in get_meta following a free is very likely a double-free corruption. We'll need to bring another tool into the game to properly trace this: valgrind

We have prepared instructions for valgrind here: https://deploy-preview-338--pihole-docs.netlify.app/ftldns/valgrind/

Still sorry it is such a slow process...

bermeitinger-b commented 1 week ago

I don't think you're in a position to apologize ;) Thank you for your efforts.

I've started valgrind. Within docker, that's a bit complicated because stopping pihole-FTL stops the container. I've adjusted the docker-compose.yaml:

pihole:
    image: pihole:local
    ...
    entrypoint: /bin/bash
    tty: true

Then, I could start valgrind.

The error log is here: https://0x0.st/XkkG.txt

It indicates something at /sbin/ip address show.

pihole-FTL also dies with many Thread 4 received signal SIG33, Real-time event 33

I hope this helps.

DL6ER commented 1 week ago

No, this is actually fine. There will be a lot of output, the log (or gdb if you have re-attached it as described in the valgrind guide) will tell you once FTL crashes. The "errors" shown here are okay, valgrind is very verbose in in general logs many false-positives. What remains is still some good detective work once we have all the logs and the crash.

bermeitinger-b commented 1 week ago

I hope this leads somewhere. I can't really reproduce the state. This time, it crashes after 58 minutes and it's detached with error 33. I've removed the nostops for 33 (and 37).

2024-11-15 19:50:12.529 CET [3151/F35] INFO: Received signal: Segmentation fault
2024-11-15 19:50:12.530 CET [3151/F35] INFO:      at address: 0x489d3fc 
2024-11-15 19:50:12.532 CET [3151/F35] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)

And here the valgrind.log: https://0x0.st/Xk7C.log

There is one binary line in line 55 which I've removed. (If you need it, it's here https://0x0.st/Xk7F.bin)

DL6ER commented 1 week ago

Thanks. The important bit is

==3151== Invalid read of size 4
==3151==    at 0x41B27C: binsearch (lookup-table.c:94)
==3151==    by 0x41B27C: lookup_find_id (lookup-table.c:337)
==3151==    by 0x4095AB: _findCacheID (datastructure.c:475)
==3151==    by 0x40EDD1: _FTL_new_query (dnsmasq_interface.c:880)
==3151==    by 0x4C8E79: tcp_request (forward.c:2405)
==3151==    by 0x4B980C: check_dns_listeners (dnsmasq.c:2078)
==3151==    by 0x4BB897: main_dnsmasq (dnsmasq.c:1278)
==3151==    by 0x4020D6: main (main.c:123)
==3151==  Address 0x489d3fc is not stack'd, malloc'd or (recently) free'd

I did just push an update to the same debug branch (commit 26d1dd3), please re-build the container and 🤞

bermeitinger-b commented 6 days ago

Thanks, the new commit resulted in the probably same message:

459   │ ==2320== Invalid read of size 4
 460   │ ==2320==    at 0x41B27C: binsearch (lookup-table.c:94)
 461   │ ==2320==    by 0x41B27C: lookup_find_id (lookup-table.c:337)
 462   │ ==2320==    by 0x4095AB: _findCacheID (datastructure.c:475)
 463   │ ==2320==    by 0x40EDD1: _FTL_new_query (dnsmasq_interface.c:880)
 464   │ ==2320==    by 0x4C8E99: tcp_request (forward.c:2405)
 465   │ ==2320==    by 0x4B982C: check_dns_listeners (dnsmasq.c:2078)
 466   │ ==2320==    by 0x4BB8B7: main_dnsmasq (dnsmasq.c:1278)
 467   │ ==2320==    by 0x4020D6: main (main.c:123)
 468   │ ==2320==  Address 0x58767fc is not stack'd, malloc'd or (recently) free'd

Full log: https://0x0.st/XdHX.log

This time, I couldn't backtrace, maybe I disabled message?

I'm sure it's the correct version. From FTL.log:

2024-11-16 15:14:05.973 CET [33M] INFO: FTL branch: debug/no_fork
2024-11-16 15:14:05.976 CET [33M] INFO: FTL version: vDev-26d1dd3
2024-11-16 15:14:05.977 CET [33M] INFO: FTL commit: 26d1dd33
2024-11-16 15:14:05.978 CET [33M] INFO: FTL date: 2024-11-16 08:08:12 +0100
DL6ER commented 6 days ago

Too bad, so back to the drawing board. I did just push another update to the branch adding further debug output. It'd be good if you could update and try again. Once it crashes, we'd need the address the algorithm tried to access, like what you quoted from valgrind above and the last log line concerning what FTL tried to access, like

Inserting/Removing/Looking for element with ID ... and hash ... into/from/in ... lookup table (..., ... // .../.../...))

Please ensure you still have debug.shmem = true to see the additional output.

If you can attach both valgrind and gdb + we are lucky and the crash does not happen in a fork, it'd be interesting to get the following output in addition to backtrace:

print hash
print base
print mid
print size
print *try
print **try

edit The expected commit would be b4f194b3

bermeitinger-b commented 6 days ago

This time, gdb spammed with "Thread 4 event 33" and the CPU usage was so busy with gdb and valgrind that I had to restart the server. So, I could not run these gdb commands.

These are the last 7 minutes in FTL.log: https://0x0.st/Xdbb.txt

And this is the valgrind.log: https://0x0.st/XdbZ.log

DL6ER commented 6 days ago

Thanks. Unfortunately, this is another crash in another location without the "Invalid read of size 4" section so we don't get the address it tried to look at. I pushed another commit (59c22db3) that adds some extra out-of-bounds checking. Lets hope I'm not going down the wrong track here...

edit 1 It'll be interesting to see:

  1. Is there still a crash?
  2. if so, are there ERROR: ... lines in pihole-FTL.log ?

edit 2 I added another error message, updated the expected commit hash above

bermeitinger-b commented 5 days ago

Thanks.

I think it didn't crash where it could be traced. I've removed the SIG33 nostop but it's a different position:

Thread 6 received signal SIG33, Real-time event 33.                                                                                           
[Switching to Thread 132]                                                                                                                     
__cp_end () at src/thread/x86_64/syscall_cp.s:29                                                                                              
warning: 29     src/thread/x86_64/syscall_cp.s: No such file or directory                                                                     
(gdb) backtrace                                                                                                                               
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29                                                                                          
#1  0x00000000007f22e3 in __syscall_cp_c (nr=23, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>,                  
    y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33                                                                                 
#2  0x00000000007eb000 in select (n=n@entry=0, rfds=rfds@entry=0x0, wfds=wfds@entry=0x0, efds=efds@entry=0x0, tv=tv@entry=0x6122a00)          
    at src/select/select.c:39                                                                                                                 
#3  0x00000000006c1709 in FTLselect (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0,                               
    exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x6122a00, file=file@entry=0x8016f3 "/app/src/timers.c",                             
    func=0x8691c0 <__FUNCTION__.0> "sleepms", line=65) at /app/src/syscalls/select.c:25                                                       
#4  0x000000000042723a in sleepms (milliseconds=<optimized out>) at /app/src/timers.c:65                                                      
#5  0x0000000000426db1 in thread_sleepms (thread=thread@entry=GC, milliseconds=<optimized out>) at /app/src/signals.c:500                     
#6  0x0000000000418477 in GC_thread (val=<optimized out>) at /app/src/gc.c:659                                                                
#7  0x00000000007f2fd0 in start (p=0x6122b00) at src/thread/pthread_create.c:207                                                              
#8  0x00000000007f465c in __clone () at src/thread/x86_64/clone.s:22                                                                          
Backtrace stopped: frame did not save the PC   
(gdb) print hash                                                                                                           10:01:39 [34/47899]
$1 = (const nettle_hash *) 0xb09f60 <nettle_sha256>
(gdb) print base
No symbol "base" in current context.
(gdb) print mid
No symbol "mid" in current context. 
(gdb) print size
$2 = 0
(gdb) print *try
No symbol "try" in current context. 
(gdb) print **try
No symbol "try" in current context. 
(gdb) continue
Continuing.

Thread 6 received signal SIG33, Real-time event 33.
__cp_end () at src/thread/x86_64/syscall_cp.s:29
29      in src/thread/x86_64/syscall_cp.s
(gdb) backtrac
#0  __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1  0x00000000007f22e3 in __syscall_cp_c (nr=23, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, 
    y=<optimized out>, z=0) at src/thread/pthread_cancel.c:33
#2  0x00000000007eb000 in select (n=n@entry=0, rfds=rfds@entry=0x0, wfds=wfds@entry=0x0, efds=efds@entry=0x0, tv=tv@entry=0x6122a00)
    at src/select/select.c:39
#3  0x00000000006c1709 in FTLselect (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, 
    exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x6122a00, file=file@entry=0x8016f3 "/app/src/timers.c", 
    func=0x8691c0 <__FUNCTION__.0> "sleepms", line=65) at /app/src/syscalls/select.c:25
#4  0x000000000042723a in sleepms (milliseconds=<optimized out>) at /app/src/timers.c:65
#5  0x0000000000426db1 in thread_sleepms (thread=thread@entry=GC, milliseconds=<optimized out>) at /app/src/signals.c:500
#6  0x0000000000418477 in GC_thread (val=<optimized out>) at /app/src/gc.c:659
#7  0x00000000007f2fd0 in start (p=0x6122b00) at src/thread/pthread_create.c:207
#8  0x00000000007f465c in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC
(gdb) continue
Continuing.

(Then, it's dead.)

valgrind.log: https://0x0.st/Xdbd.log

These are the last lines in FTL.log: https://0x0.st/Xdbn.txt

DL6ER commented 5 days ago

Yeah, it shouldn't stop at SIG33. When valgrind is attached, it uses SIG33 for internal messaging. Even a brief interruption like in your case causes valgrind to have the hiccups. Interestingly, I do not see some SIG33s but certainly not a flood of them even when I have valgrind and gdb running - but I don't think this is related.

DL6ER commented 5 days ago

Okay, I think I have something new. With red, very dry eyes, I have been staring at your previous log and found the following interesting lines:

2024-11-13 13:47:34.520 CET [957/F52] DEBUG_SHMEM: Resizing "FTL-domains-lookup" from 4096 to (1024 * 8) == 8192 (2.2MB used, 67.1MB total, FTL uses 2.2MB)
2024-11-13 13:47:34.520 CET [957/F52] DEBUG_SHMEM: SHMEM pointer updated: 0x7bb3e5d08000 -> 0x7bb3e56e4000 (4096 8192)
[...]
2024-11-13 13:47:34.521 CET [957/F52] INFO: Received signal: Segmentation fault
2024-11-13 13:47:34.521 CET [957/F52] INFO:      at address: 0x7bb3e5d080ec
2024-11-13 13:47:34.521 CET [957/F52] INFO:      with code:  SEGV_MAPERR (Address not mapped to object)

Okay ... so it tried to access memory at 0x7bb3e5d080ec which belongs to the old memory location 0x7bb3e5d08000 even though it logged one millisecond above that the shared memory pointer moved to 0x7bb3e56e4000.

I am actually quite confident that adding the missing remapping instructions (commit 341464fb) fixes the bug we are seeing here. It also explains why it mostly happens in forks and why I haven't seen it myself before (my network is simply to quiet/not enough queries).

bermeitinger-b commented 5 days ago

I take my hat off to you while bowing with respect. You did it. 👍

With this branch, it's running since 10 hours without a crash. Thank you very much.

schuettecarsten commented 4 days ago

I am actually quite confident that adding the missing remapping instructions (commit 341464f) fixes the bug we are seeing here. It also explains why it mostly happens in forks and why I haven't seen it myself before (my network is simply to quiet/not enough queries).

I just looked at your code. Is this code thread-safe? Or is thread safety no issue because of a global "lock" somewhere?

DL6ER commented 4 days ago

Thank you for confirming the fix!

I have now split this into two branches: The one you are currently on (debug/no_fork) got stripped down to the extra code ensuring no forking is happening whenever gdb is attached to ease future debugging. Furthermore, I have created a new branch called fix/binsearch_shm that has the fix for shared memory remapping.

I did that to ease review as these are basically two independent changes which should result in two independent pull requests (#2120 and #2121 ). I also removed a few of the commits adding more debug output which we didn't need in the end. No need to keep "dead" code around.


@schuettecarsten

I just looked at your code. Is this code thread-safe? Or is thread safety no issue because of a global "lock" somewhere?

We take care of only accessing these global memory sections when a global (shared-memory) lock is helt (lock_shm() and unlock_shm()). This to not only ensure thread-safety but even fork-safety as FTL does multi-processing both in threads and processes. We try to keep the overhead as small as possible by only holding the lock in sections that really need it. So far, we have never identified the lock to be an issue as most systems are idle for so long that the lock is basically free to be taken right away for the vast majority of time.

DL6ER commented 3 days ago

The bug should be fixed in development by now. For docker users, the fix is in :nightly. I will close this as completed. Thank you all for your input and helping us making Pi-hole better for us all. Much appreciated!