pi-hole / FTL

The Pi-hole FTL engine
https://pi-hole.net
Other
1.36k stars 193 forks source link

FTL not auto-starting after device reboot. Manual restart in webgui required to resume resolving. #1030

Closed cmdshft closed 3 years ago

cmdshft commented 3 years ago

Versions

Pi-hole version is v5.2.3 (Latest: v5.2.3) AdminLTE version is v5.3 (Latest: v5.3) FTL version is v5.4 (Latest: v5.4)

Platform

Expected behavior

FTL to auto start properly after system restart.

Actual behavior / bug

When restarting the Raspberry Pi device, FTL seems to not start as my Amplifi Alien will immediately report an issue with DNS IPv4 configuration. Logging into the Pi-Hole webgui and manually restarting the DNS resolver rectifies this issue and my router will report correct DNS IPv4 configuration within seconds.

Steps to reproduce

Steps to reproduce the behavior:

  1. Restart device.
  2. Disable ad-blocker in browser and then browse to an ad-testing site.
  3. Observe ads.
  4. Log into Pi-hole webgui, manually restart DNS resolver.
  5. Reload ad-testing site.
  6. Observe ads not being loaded.

Debug Token

Screenshot 2021-01-18 002937

As you can see, there is a large amount of time where there is no resolving being done. The Pi device was rebooted far into the past. Where it begins resolving is when you see the large uptick in the graphs. You can see where I had done two reboots further into the graphs as well, and where those resume are exactly when FTL began resolving again after manually restarting FTL in the webgui.

cmdshft commented 3 years ago

Here is a token after restarting the device but not manually restarting FTL: https://tricorder.pi-hole.net/jw0h5u8byr

cmdshft commented 3 years ago

After today's update, I again have to restart FTL manually and pihole -r did not fix this.

DL6ER commented 3 years ago

Sorry, I haven't been able to investigate on this so far. You previous debug token already expired and was removed from the server. Could you generate a new one after restarting your device?

cmdshft commented 3 years ago

Log after restarting DNS resolver: https://pastebin.com/n4KLRntb

Log after rebooting without restarting DNS resolver: https://pastebin.com/KWaRXDhm

DL6ER commented 3 years ago

Thanks, I'm not sure what is going wrong exactly, however, few things I noticed:

  1. There are many lines like

    WARN: Could not write() everything in ssend() [/root/project/src/api/socket.c:229]: Broken pipe

    Do you use a custom script to gather data from the Pi-hole? I haven't seen this before. (it is not directly related to the issue you reported)

  2. The last line when it became unresponsive is

    Compiled 0 whitelist and 0 blacklist regex filters for 25 clients in 9.9 msec

What we can do next is adding

DEBUG_ALL=true

to the file /etc/pihole/pihole-FTL.conf (create if it does not exist) and restart once again. This should make FTL a lot more verbose. You do not need to generate a debug log, it's also fine if you just copy the last lines from FTL's log. You can get them via

tail -n 100 /var/log/pihole-FTL.log

I'd then suggest to set DEBUG_ALL=false in the file above before you restart the DNS resolver. With all debug settings enabled, the log will grow in file size really fast, possibly filling your disk within hours. We shouldn't do that ;-)

cmdshft commented 3 years ago

Do you use a custom script to gather data from the Pi-hole? I haven't seen this before. (it is not directly related to the issue you reported)

I do not use anything other than Pi-Hole, PADD, Plex Media Server and Tautulli on this device. No custom mods to any of those.

What we can do next is adding

DEBUG_ALL=true

to the file /etc/pihole/pihole-FTL.conf (create if it does not exist) and restart once again. This should make FTL a lot more verbose. You do not need to generate a debug log, it's also fine if you just copy the last lines from FTL's log. You can get them via

tail -n 100 /var/log/pihole-FTL.log

I'd then suggest to set DEBUG_ALL=false in the file above before you restart the DNS resolver. With all debug settings enabled, the log will grow in file size really fast, possibly filling your disk within hours. We shouldn't do that ;-)

Okay here is the log:

[2021-01-25 06:54:10.491 717M] --> Querying client table for f6:92:bf:70:65:c9 [2021-01-25 06:54:10.492 717M] --> There is no record for f6:92:bf:70:65:c9 in the client table [2021-01-25 06:54:10.492 717M] Querying gravity database for host name of 192.168.1.1... [2021-01-25 06:54:10.492 717M] Found database host name (same address) 192.168.1.1 -> amplifi.lan [2021-01-25 06:54:10.492 717M] --> Querying client table for amplifi.lan [2021-01-25 06:54:10.492 717M] --> There is no record for amplifi.lan in the client table [2021-01-25 06:54:10.492 717M] Querying gravity database for interface of 192.168.1.1... [2021-01-25 06:54:10.492 717M] getDatabaseHostname(): "SELECT interface FROM network JOIN network_addresses ON network_addresses.network_id = network.id WHERE network_addresses.ip = ? AND interface != 'N/A' AND interface IS NOT NULL;" with ? = "192.168.1.1" [2021-01-25 06:54:10.492 717M] Found database interface 192.168.1.1 -> eth0 [2021-01-25 06:54:10.492 717M] Querying client table for interface :eth0 [2021-01-25 06:54:10.492 717M] --> There is no record for interface :eth0 in the client table [2021-01-25 06:54:10.492 717M] Gravity database: Client amplifi.lan not found. Using default group.

[2021-01-25 06:54:10.492 717M] Adding "0" (len 2) to buffer. next_str_pos is 37398 [2021-01-25 06:54:10.492 717M] gravityDB_open(): Preparing vw_whitelist statement for client 192.168.1.1 [2021-01-25 06:54:10.492 717M] get_client_querystr: SELECT EXISTS(SELECT domain from vw_whitelist WHERE domain = ? AND group_id IN (0)); [2021-01-25 06:54:10.493 717M] Setting sqlite3_stmt 0x7344e7f0[2] <-- 0x15e94c8 [2021-01-25 06:54:10.493 717M] gravityDB_open(): Preparing vw_gravity statement for client 192.168.1.1 [2021-01-25 06:54:10.493 717M] get_client_querystr: SELECT EXISTS(SELECT domain from vw_gravity WHERE domain = ? AND group_id IN (0)); [2021-01-25 06:54:10.493 717M] Setting sqlite3_stmt 0x7344fc80[2] <-- 0x1607268 [2021-01-25 06:54:10.493 717M] gravityDB_open(): Preparing vw_blacklist statement for client 192.168.1.1 [2021-01-25 06:54:10.493 717M] get_client_querystr: SELECT EXISTS(SELECT domain from vw_blacklist WHERE domain = ? AND group_id IN (0)); [2021-01-25 06:54:10.493 717M] Setting sqlite3_stmt 0x7344f238[2] <-- 0x1606db0 [2021-01-25 06:54:10.493 717M] Getting sqlite3_stmt 0x7344e7f0[2] --> 0x15e94c8 [2021-01-25 06:54:10.494 717M] domain_in_list("www.gstatic.com", 0x15e94c8, whitelist): 0 [2021-01-25 06:54:10.494 717M] Getting sqlite3_stmt 0x7344f238[2] --> 0x1606db0 [2021-01-25 06:54:10.494 717M] domain_in_list("www.gstatic.com", 0x1606db0, blacklist): 0 [2021-01-25 06:54:10.494 717M] Getting sqlite3_stmt 0x7344fc80[2] --> 0x1607268 [2021-01-25 06:54:10.506 717M] domain_in_list("www.gstatic.com", 0x1607268, gravity): 0 [2021-01-25 06:54:10.506 717M] Removed lock in _FTL_new_query() (/root/project/src/dnsmasq_interface.c:756) [2021-01-25 06:54:10.506 717M] Waiting for lock in _FTL_forwarded() (/root/project/src/dnsmasq_interface.c:810) [2021-01-25 06:54:10.506 717M] Obtained lock for _FTL_forwarded() (/root/project/src/dnsmasq_interface.c:810) [2021-01-25 06:54:10.507 717M] forwarded www.gstatic.com to 8.8.4.4#53 (ID 31, /root/project/src/dnsmasq/forward.c:598) [2021-01-25 06:54:10.507 717M] Removed lock in _FTL_forwarded() (/root/project/src/dnsmasq_interface.c:930) [2021-01-25 06:54:10.507 717M] Waiting for lock in _FTL_new_query() (/root/project/src/dnsmasq_interface.c:536) [2021-01-25 06:54:10.507 717M] Obtained lock for _FTL_new_query() (/root/project/src/dnsmasq_interface.c:536) [2021-01-25 06:54:10.507 717M] new UDP query[A] query "www.google.org" from eth0:192.168.1.1 (ID 32, FTL 43733, /root/project/src/dnsmasq/forward.c:1624) [2021-01-25 06:54:10.507 717M] getOverTimeID(1611557700): 149 [2021-01-25 06:54:10.507 717M] www.google.org is not known [2021-01-25 06:54:10.507 717M] Getting sqlite3_stmt 0x7344e7f0[2] --> 0x15e94c8 [2021-01-25 06:54:10.507 717M] domain_in_list("www.google.org", 0x15e94c8, whitelist): 0 [2021-01-25 06:54:10.507 717M] Getting sqlite3_stmt 0x7344f238[2] --> 0x1606db0 [2021-01-25 06:54:10.507 717M] domain_in_list("www.google.org", 0x1606db0, blacklist): 0 [2021-01-25 06:54:10.507 717M] Getting sqlite3_stmt 0x7344fc80[2] --> 0x1607268 [2021-01-25 06:54:10.514 717M] domain_in_list("www.google.org", 0x1607268, gravity): 0 [2021-01-25 06:54:10.514 717M] Removed lock in _FTL_new_query() (/root/project/src/dnsmasq_interface.c:756) [2021-01-25 06:54:10.515 717M] Waiting for lock in _FTL_forwarded() (/root/project/src/dnsmasq_interface.c:810) [2021-01-25 06:54:10.515 717M] Obtained lock for _FTL_forwarded() (/root/project/src/dnsmasq_interface.c:810) [2021-01-25 06:54:10.515 717M] ** forwarded www.google.org to 8.8.4.4#53 (ID 32, /root/project/src/dnsmasq/forward.c:598) [2021-01-25 06:54:10.515 717M] Removed lock in _FTL_forwarded() (/root/project/src/dnsmasq_interface.c:930) [2021-01-25 06:54:10.520 717M] Waiting for lock in _FTL_CNAME() (/root/project/src/dnsmasq_interface.c:342) [2021-01-25 06:54:10.520 717M] Obtained lock for _FTL_CNAME() (/root/project/src/dnsmasq_interface.c:342) [2021-01-25 06:54:10.521 717M] www.gstatic.com is known as not to be blocked [2021-01-25 06:54:10.521 717M] CNAME www.gstatic.com [2021-01-25 06:54:10.521 717M] Removed lock in _FTL_CNAME() (/root/project/src/dnsmasq_interface.c:453) [2021-01-25 06:54:10.521 717M] Waiting for lock in _FTL_reply() (/root/project/src/dnsmasq_interface.c:982) [2021-01-25 06:54:10.521 717M] Obtained lock for _FTL_reply() (/root/project/src/dnsmasq_interface.c:982) [2021-01-25 06:54:10.521 717M] got reply www.gstatic.com is 172.217.10.227 (ID 31, /root/project/src/dnsmasq/cache.c:494) [2021-01-25 06:54:10.521 717M] Flags: F_FORWARD F_IPV4 [2021-01-25 06:54:10.521 717M] Removed lock in _FTL_reply() (/root/project/src/dnsmasq_interface.c:1121) [2021-01-25 06:54:10.539 717M] Waiting for lock in _FTL_CNAME() (/root/project/src/dnsmasq_interface.c:342) [2021-01-25 06:54:10.540 717M] Obtained lock for _FTL_CNAME() (/root/project/src/dnsmasq_interface.c:342) [2021-01-25 06:54:10.540 717M] www.google.org is known as not to be blocked [2021-01-25 06:54:10.540 717M] CNAME www.google.org [2021-01-25 06:54:10.540 717M] Removed lock in _FTL_CNAME() (/root/project/src/dnsmasq_interface.c:453) [2021-01-25 06:54:10.540 717M] Waiting for lock in _FTL_reply() (/root/project/src/dnsmasq_interface.c:982) [2021-01-25 06:54:10.540 717M] Obtained lock for _FTL_reply() (/root/project/src/dnsmasq_interface.c:982) [2021-01-25 06:54:10.540 717M] got reply www.google.org is 216.239.32.27 (ID 32, /root/project/src/dnsmasq/cache.c:494) [2021-01-25 06:54:10.540 717M] Flags: F_FORWARD F_IPV4 [2021-01-25 06:54:10.540 717M] Removed lock in _FTL_reply() (/root/project/src/dnsmasq_interface.c:1121) [2021-01-25 06:54:11.568 717/T1230] Waiting for lock in process_request() (/root/project/src/api/request.c:38) [2021-01-25 06:54:11.568 717/T1230] Obtained lock for process_request() (/root/project/src/api/request.c:38) [2021-01-25 06:54:11.569 717/T1230] Removed lock in process_request() (/root/project/src/api/request.c:40) [2021-01-25 06:54:11.581 717/T1234] Waiting for lock in process_request() (/root/project/src/api/request.c:149) [2021-01-25 06:54:11.581 717/T1234] Obtained lock for process_request() (/root/project/src/api/request.c:149) [2021-01-25 06:54:11.582 717/T1234] Removed lock in process_request() (/root/project/src/api/request.c:151) [2021-01-25 06:54:11.676 717/T1266] Waiting for lock in process_request() (/root/project/src/api/request.c:94) [2021-01-25 06:54:11.677 717/T1266] Obtained lock for process_request() (/root/project/src/api/request.c:94) [2021-01-25 06:54:11.677 717/T1266] Removed lock in process_request() (/root/project/src/api/request.c:96) [2021-01-25 06:54:11.689 717/T1272] Waiting for lock in process_request() (/root/project/src/api/request.c:52) [2021-01-25 06:54:11.689 717/T1272] Obtained lock for process_request() (/root/project/src/api/request.c:52) [2021-01-25 06:54:11.689 717/T1272] initial: conflinebuffer = (nil), keystr = 0x728006a8, size = 0 [2021-01-25 06:54:11.689 717/T1272] conflinebuffer = 0x72801b48, keystr = 0x728006a8, size = 120 [2021-01-25 06:54:11.689 717/T1272] while reading line "PRIVACYLEVEL=0 " looking for "PRIVACYLEVEL=" [2021-01-25 06:54:11.690 717/T1272] Removed lock in process_request() (/root/project/src/api/request.c:54) [2021-01-25 06:54:11.703 717/T1278] Waiting for lock in process_request() (/root/project/src/api/request.c:52) [2021-01-25 06:54:11.703 717/T1278] Obtained lock for process_request() (/root/project/src/api/request.c:52) [2021-01-25 06:54:11.703 717/T1278] initial: conflinebuffer = (nil), keystr = 0x72800840, size = 0 [2021-01-25 06:54:11.703 717/T1278] conflinebuffer = 0x72801c98, keystr = 0x72800840, size = 120 [2021-01-25 06:54:11.703 717/T1278] while reading line "PRIVACYLEVEL=0 " looking for "PRIVACYLEVEL=" [2021-01-25 06:54:11.704 717/T1278] Removed lock in process_request() (/root/project/src/api/request.c:54) [2021-01-25 06:54:11.716 717/T1282] Waiting for lock in process_request() (/root/project/src/api/request.c:59) [2021-01-25 06:54:11.717 717/T1282] Obtained lock for process_request() (/root/project/src/api/request.c:59) [2021-01-25 06:54:11.717 717/T1282] initial: conflinebuffer = (nil), keystr = 0x72800840, size = 0 [2021-01-25 06:54:11.717 717/T1282] conflinebuffer = 0x72800688, keystr = 0x72800840, size = 120 [2021-01-25 06:54:11.717 717/T1282] while reading line "PRIVACYLEVEL=0 " looking for "PRIVACYLEVEL=" [2021-01-25 06:54:11.717 717/T1282] Removed lock in process_request() (/root/project/src/api/request.c:61)

dschaper commented 3 years ago

PADD

Which version of PADD?

cmdshft commented 3 years ago

Always the latest version.

On Jan 25, 2021, at 12:50, Dan Schaper notifications@github.com wrote:

 PADD

Which version of PADD?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

dschaper commented 3 years ago

Humor me, what version?

cmdshft commented 3 years ago

3.5.1

DL6ER commented 3 years ago

Okay here is the log

Is this really from when FTL stopped working? It long pretty file and I don't see the line you mentioned before:

Compiled 0 whitelist and 0 blacklist regex filters for 25 clients in 9.9 msec

cmdshft commented 3 years ago

I did the command as instructed immediately after my pi-hole was up and running like I would expect it and the router was complaining that the DNS was down. That was the end of that log that was requested.

bendschs commented 3 years ago

sam problem here, on pi4 debian buster

cmdshft commented 3 years ago

New update installed, still same issue. Router complained DNS down after a reboot of the pihole. Restarting DNS resolver manually fixes the issue.

On Jan 27, 2021, at 16:10, bendschs notifications@github.com wrote:

 sam problem here, on pi4 debian buster

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

DL6ER commented 3 years ago

This is still very strange because the log looks like everything is fine. The only thing I could imagine right now is your networking interface going down for a short moment and FTL losing contact to the outside world hereby. And fails to recover from this. But that's 120% hypothetical.

Could you (everyone affected) also share the last lines of /var/log/pihole.log when it happens? The more logs we have, the more we could maybe see.

cmdshft commented 3 years ago

The only thing I could imagine right now is your networking interface going down for a short moment and FTL losing contact to the outside world hereby. And fails to recover from this.

But, during boot (I have it set to be verbose), the network interface is working properly. It connects to the network fine, and it's very clearly working because PADD reports it as such when it runs it's checks. Only FTL itself seems to have the issue, and only after reboots. Once I manually restart the DNS resolver in the web admin, everything is working great.

Could you (everyone affected) also share the last lines of /var/log/pihole.log when it happens? The more logs we have, the more we could maybe see.

Here is the log that shows when I triggered a reboot in terminal:

> Jan 28 05:32:09 dnsmasq[4974]: exiting on receipt of SIGTERM
> Jan 28 05:32:40 dnsmasq[644]: started, version pi-hole-2.84 cachesize 10000
> Jan 28 05:32:40 dnsmasq[644]: DNS service limited to local subnets
> Jan 28 05:32:40 dnsmasq[644]: compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n IDN DHCP DHCPv6 Lua TFTP no-conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile
> Jan 28 05:32:40 dnsmasq[644]: using only locally-known addresses for domain use-application-dns.net
> Jan 28 05:32:40 dnsmasq[644]: using nameserver 192.168.1.1#53 for domain 1.168.192.in-addr.arpa 
> Jan 28 05:32:40 dnsmasq[644]: using nameserver 1.0.0.1#53
> Jan 28 05:32:40 dnsmasq[644]: using nameserver 1.1.1.1#53
> Jan 28 05:32:40 dnsmasq[644]: using nameserver 8.8.4.4#53
> Jan 28 05:32:40 dnsmasq[644]: using nameserver 8.8.8.8#53
> Jan 28 05:32:40 dnsmasq[644]: read /etc/hosts - 5 addresses
> Jan 28 05:32:40 dnsmasq[644]: read /etc/pihole/custom.list - 0 addresses
> Jan 28 05:32:40 dnsmasq[644]: read /etc/pihole/local.list - 2 addresses
> Jan 28 05:32:41 dnsmasq[644]: Ignoring query from non-local network
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 188.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 188.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.188 is amazon-18b3d2e2f.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 37.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 37.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.37 is RingHpCam-37.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 98.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 98.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.98 is RokuPremiere.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 2.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 2.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.2 is ExodusBlack.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 74.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 74.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.74 is amazon-7e1b7bed1.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 1.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 1.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.1 is amplifi.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 171.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 171.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.171 is iPhone.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 18.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 18.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.18 is AppleTV.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 96.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 96.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.96 is RokuPremiere.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 194.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 194.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.194 is Chriss-iPhone.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 51.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 51.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.51 is Kristals-iPhone.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 101.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 101.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.101 is RokuPlayer.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 82.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 82.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.82 is XBOXONE.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 254.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 254.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.254 is AddiesiPadMini4.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 247.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 247.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.247 is Ring-1dc73e.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 103.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 103.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.103 is Graysons-iPhone.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 172.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 172.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.172 is ChrissApleWatch.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 79.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 79.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.79 is amazon-5d87bd4b2.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 121.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 121.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 47.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 47.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.47 is ShadowofIntent.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 80.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 80.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 244.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 244.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 133.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 133.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 157.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 157.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.157 is Samsung.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 69.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 69.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.69 is Ring-3b493e.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 195.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 195.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 185.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 185.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:42 dnsmasq[644]: reply 192.168.1.185 is 5CD714159Q.lan
> Jan 28 05:32:42 dnsmasq[644]: query[PTR] 89.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:42 dnsmasq[644]: forwarded 89.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:43 dnsmasq[644]: query[PTR] 204.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 204.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:43 dnsmasq[644]: reply 192.168.1.204 is Chriss-iPhone.lan
> Jan 28 05:32:43 dnsmasq[644]: query[PTR] 4.4.8.8.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 4.4.8.8.in-addr.arpa to 1.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 4.4.8.8.in-addr.arpa to 1.1.1.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 4.4.8.8.in-addr.arpa to 8.8.4.4
> Jan 28 05:32:43 dnsmasq[644]: forwarded 4.4.8.8.in-addr.arpa to 8.8.8.8
> Jan 28 05:32:43 dnsmasq[644]: reply 8.8.4.4 is dns.google
> Jan 28 05:32:43 dnsmasq[644]: query[PTR] 1.1.168.192.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 1.1.168.192.in-addr.arpa to 192.168.1.1
> Jan 28 05:32:43 dnsmasq[644]: reply 192.168.1.1 is amplifi.lan
> Jan 28 05:32:43 dnsmasq[644]: query[PTR] 8.8.8.8.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 8.8.8.8.in-addr.arpa to 8.8.8.8
> Jan 28 05:32:43 dnsmasq[644]: reply 8.8.8.8 is dns.google
> Jan 28 05:32:43 dnsmasq[644]: query[PTR] 1.0.0.1.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 1.0.0.1.in-addr.arpa to 8.8.8.8
> Jan 28 05:32:43 dnsmasq[644]: reply 1.0.0.1 is one.one.one.one
> Jan 28 05:32:43 dnsmasq[644]: query[PTR] 1.1.1.1.in-addr.arpa from 127.0.0.1
> Jan 28 05:32:43 dnsmasq[644]: forwarded 1.1.1.1.in-addr.arpa to 8.8.8.8
> Jan 28 05:32:43 dnsmasq[644]: reply 1.1.1.1 is one.one.one.one
DL6ER commented 3 years ago

during boot (I have it set to be verbose), the network interface is working properly

Yes, my idea was if something happens shortly after finished booting (which does never happen again).

Also your new log except showed that FTL indeed worked as expected until sometime happened because of which it doesn't see any further queries.

jmlingeman commented 3 years ago

I’ve been having the same issue for the past couple months in PiHole using the pihole docker image.

jathor007 commented 3 years ago

Just started having this issue (or at least just noticed it because i reboot infrequently). Reinstalled both raspberry pi os and pi-hole and issue still persist. I have to restart dns resolver for it to work after rebooting the rpi.

DL6ER commented 3 years ago

@jathor007 which device are you using? Did you install anything our configured anything besides Pi-hole? I bought a couple of devices for testing but couldn't reproduce this at all, so far.

jathor007 commented 3 years ago

@DL6ER it's an rpi4 4gb. Before reinstalling i had plex and minidlna but when i reinstalled the os i only have pi-hole on there. I'll try another reinstall this weekend and get back to you

sivuda commented 3 years ago

I have this issue as well. I keep getting disconnected and the only way to go back online was to restartdns and everytime I reboot, I need to do restartdns. I have it on rpi4 4gb along with unbound and pivpn, motioneye. I also had the issue with rpi3b with just pihole, unbound and pivpn. Rpi4 was a fresh install few days ago

lrkrol commented 3 years ago

Have this same issue today on a Raspberry Pi 4 Model B Rev 1.4, running Raspbian GNU/Linux 10 (buster). Did not have it when running Pi-hole on Xubuntu a few days ago before the Raspberry arrived.

Could you (everyone affected) also share the last lines of /var/log/pihole.log when it happens? The more logs we have, the more we could maybe see.

Here are a few lines covering the period of working, rebooting, and not working anymore after reboot:

Feb 20 23:26:29 dnsmasq[1397]: reply user-images.githubusercontent.com is 185.199.108.133
Feb 20 23:26:29 dnsmasq[1397]: reply user-images.githubusercontent.com is 185.199.109.133
Feb 20 23:26:29 dnsmasq[1397]: reply user-images.githubusercontent.com is 185.199.110.133
Feb 20 23:26:29 dnsmasq[1397]: reply user-images.githubusercontent.com is 185.199.111.133
Feb 20 23:26:30 dnsmasq[1397]: query[A] collector.githubapp.com from 192.168.1.3
Feb 20 23:26:30 dnsmasq[1397]: gravity blocked collector.githubapp.com is 0.0.0.0
Feb 20 23:27:00 dnsmasq[1397]: query[A] ocsp.root-x1.letsencrypt.org from 192.168.1.3
Feb 20 23:27:00 dnsmasq[1397]: forwarded ocsp.root-x1.letsencrypt.org to 208.67.220.220
Feb 20 23:27:00 dnsmasq[1397]: reply ocsp.root-x1.letsencrypt.org is <CNAME>
Feb 20 23:27:00 dnsmasq[1397]: reply ocsp.le-org.edgesuite.net is <CNAME>
Feb 20 23:27:00 dnsmasq[1397]: reply a1914.dscq.akamai.net is 2.18.212.27
Feb 20 23:27:00 dnsmasq[1397]: reply a1914.dscq.akamai.net is 2.18.212.57
Feb 20 23:29:12 dnsmasq[1397]: exiting on receipt of SIGTERM
Feb 20 23:29:20 dnsmasq[635]: started, version pi-hole-2.84 cachesize 10000
Feb 20 23:29:20 dnsmasq[635]: DNS service limited to local subnets
Feb 20 23:29:20 dnsmasq[635]: compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n IDN DHCP DHCPv6 Lua TFTP no-conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile
Feb 20 23:29:20 dnsmasq[635]: using only locally-known addresses for domain use-application-dns.net
Feb 20 23:29:20 dnsmasq[635]: using nameserver 208.67.220.220#53
Feb 20 23:29:20 dnsmasq[635]: using nameserver 208.67.222.222#53
Feb 20 23:29:20 dnsmasq[635]: read /etc/hosts - 5 addresses
Feb 20 23:29:20 dnsmasq[635]: read /etc/pivpn/hosts.wireguard - 1 addresses
Feb 20 23:29:20 dnsmasq[635]: read /etc/pihole/custom.list - 0 addresses
Feb 20 23:29:20 dnsmasq[635]: read /etc/pihole/local.list - 2 addresses
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 3.1.168.192.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 3.1.168.192.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 10.1.168.192.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: /etc/pihole/local.list 192.168.1.10 is rabbi
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 2.0.6.10.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: /etc/pivpn/hosts.wireguard 10.6.0.2 is laurens-phone.pivpn
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 220.220.67.208.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 220.220.67.208.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 222.222.67.208.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED
Feb 20 23:29:22 dnsmasq[635]: query[PTR] 222.222.67.208.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED
Feb 20 23:29:47 dnsmasq[635]: Ignoring query from non-local network

And then, after rebooting the DNS resolver:

Feb 20 23:35:08 dnsmasq[635]: exiting on receipt of SIGTERM
Feb 20 23:35:08 dnsmasq[1327]: started, version pi-hole-2.84 cachesize 10000
Feb 20 23:35:08 dnsmasq[1327]: DNS service limited to local subnets
Feb 20 23:35:08 dnsmasq[1327]: compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n IDN DHCP DHCPv6 Lua TFTP no-conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile
Feb 20 23:35:08 dnsmasq[1327]: using only locally-known addresses for domain use-application-dns.net
Feb 20 23:35:08 dnsmasq[1327]: using nameserver 208.67.220.220#53
Feb 20 23:35:08 dnsmasq[1327]: using nameserver 208.67.222.222#53
Feb 20 23:35:08 dnsmasq[1327]: read /etc/hosts - 5 addresses
Feb 20 23:35:08 dnsmasq[1327]: read /etc/pivpn/hosts.wireguard - 1 addresses
Feb 20 23:35:08 dnsmasq[1327]: read /etc/pihole/custom.list - 0 addresses
Feb 20 23:35:08 dnsmasq[1327]: read /etc/pihole/local.list - 2 addresses
Feb 20 23:35:09 dnsmasq[1327]: query[A] chat-pa.clients6.google.com from 192.168.1.3
DL6ER commented 3 years ago

@lrkrol Do you not see lines like the

Feb 20 23:29:22 dnsmasq[635]: query[PTR] 3.1.168.192.in-addr.arpa from 127.0.0.1
Feb 20 23:29:22 dnsmasq[635]: config error is REFUSED

after a restart?


@lrkrol @jathor007 @sivuda @jmlingeman Could you also try adding the DEBUG_ALL option as suggested in https://github.com/pi-hole/FTL/issues/1030#issuecomment-763422681 ?

In addition to

tail -n 100 /var/log/pihole-FTL.log

, the dnsmasq log in

tail -n 100 /var/log/pihole.log

may be useful as well.

lrkrol commented 3 years ago

@DL6ER No, those lines do not appear to show up in logs during regular operation (after restarting the resolver). I have attached a truncated version of the FTL log with DEBUG_ALL below, following a reboot of the Raspberry Pi.

pihole-FTL.log

DL6ER commented 3 years ago

@lrkrol I do see them in FTL's log, so they should also be in pihole.log:

[2021-02-21 13:42:16.284 647M] new UDP query[PTR] query ".in-addr.arpa" from lo:127.0.0.1 (ID 9, FTL 3101, /root/project/src/dnsmasq/forward.c:1633) [...] [2021-02-21 13:42:16.285 647M] got reply error is REFUSED (nowhere to forward to) (ID 9, /root/project/src/dnsmasq/rfc1035.c:992)

Mind the nowhere to forward to - Pi-hole doesn't have an upstream destination where it could send this to. Your Pi-hole seems to be able to get host names for 10.6.0.0/24 (from /etc/pivpn/hosts.wireguard) but not for anything else, like 192.168.1.0/24 or 208.67.220.220 (which is resolver2.opendns.com). Some 192.168.1.0/24 device names are indirectly derived from the network table but it seems your upstream DNS server is not reachable.

Can you also show some lines from /var/log/pihole.log so we can check if there is any reason in there?


@lrkrol

I have attached a truncated version of the FTL log with DEBUG_ALL below, following a reboot of the Raspberry Pi.

I am not sure what you mean exactly, this log file doesn't look like FTL had any issues and was restarted afterwards. The only restart happened at 13:42:04 to 13:42:13 which is likely not enough time to restart the entire system. If this is indeed the restart, then we need some more older lines as everything looks to have worked correctly before.

Until now your issue looks to be unreliability of the upstream server (you loose connectivity, etc.) leading Pi-hole to rejecting this server in the future. The pihole.log should allow further analysis.

lrkrol commented 3 years ago

@lrkrol I do see them in FTL's log, so they should also be in pihole.log:

Thanks for looking into all of this. Indeed everything was working correctly before: there were no issues, and no REFUSED entries in the log. The problem appears whenever the Raspberry is newly booted. The log I attached was from a time I initiated a system reboot from the Pi-hole admin GUI. This indeed took place at 13:42:04 and onwards. It's a clean install of a new Raspberry, but yeah, the Pi-hole GUI sure isn't up in 10 seconds, nor can I ssh into the Raspberry that fast -- perhaps Pi-hole/FTL is up before the network connection is properly established, causing the issue?

I have now attached both pihole.log and pihole-FTL.log covering a period of:

pihole.log pihole-FTL.log

DL6ER commented 3 years ago

perhaps Pi-hole/FTL is up before the network connection is properly established, causing the issue?

Yes, looking at the logs, this would perfectly explain what is happening. Two things to validate this:

  1. Please try
    pihole checkout ftl tweak/listener_logging`

    restart the system and then return to usual operation by restarting from the GUI. Then post the output of

    grep "listener" /var/log/pihole-FTL.log
  2. Check what files you have in /etc/dnsmasq.d - there may be something in there that does not come from Pi-hole and which may be causing issues. Pi-hole usually binds the wildcard address, even when configured to listen on only some interfaces. We then discards requests that it shouldn't reply to. This has the advantage of working even when interfaces come and go and change address. However, this can be modified with some settings possibly causing what you're seeing.
  3. Try adding DELAY_STARTUP=5 to /etc/pihole/pihole-FTL.conf and restart once again. This will cause FTL to, well, delay its startup by 5 seconds. This may suffice to bring all interfaces up. You may want to play with this setting if 5 is still too small.

I'm pretty sure we'll find the cause (or, at least, disprove your interface-not-ready assumption)

DL6ER commented 3 years ago

@lrkrol Do you have any update for us?

lrkrol commented 3 years ago

@DL6ER Apologies; for the past weeks I've not been in a situation where I could simply restart my internets just to test something. Now I am, but there's a larger issue where the ISP requires me to reconfigure my entire login credentials whenever I do a restart. This is still on my list however --- I hope to get to it ASAP.

DL6ER commented 3 years ago

Closing due to inactivity. Feel free to reopen if the issue persists.