just to exclude any issues with some custom components, please restart ha in safe mode and check again
Done, twice (to make sure) and still happening.
```
select datetime(timestamp, 'unixepoch', 'localtime') as dt, count(*) from queries where client = '<my HAOS IP>' and type = 6 and timestamp > unixepoch('2024-04-14 15:00:00', '-02:00') group by timestamp order by timestamp asc;

2024-04-14 15:08:43|175
2024-04-14 15:08:46|79
2024-04-14 15:23:18|183
2024-04-14 15:23:21|71
2024-04-14 15:28:15|254
2024-04-14 15:28:18|3
```
Also, ha core logs show three integrations failing to set up with "integration not found" (which is expected).
ok, just to exclude misunderstandings
to start narrowing down this problem, please provide the system information from Settings > System > Repairs > 3-dot-menu top right > System information (use the copy button at the bottom and paste it "as is" here)
No problem :) And thanks!
* you're running Home Assistant OS?
Yes, HAOS 12.2
* does the problem occur both with and without safe mode?
Correct
* does the problem also occur when only HA Core (_not HA OS nor any add-on_) is restarted?
I believe so: I restarted in safe mode from
Developer Tools > Restart > Restart Home Assistant in safe mode
and if I understand correctly that only restarts Core, nothing else.
* does the problem occur exactly hourly?
Pretty much, yes: it seems to be slightly less than one hour, but other than that it's every hour from the last restart. I restarted twice in safe mode at 15:23 and 15:28 (when you asked); at 15:45 I restarted in safe mode again (logs show "integration not found") and then at 15:47 I restarted in "normal" mode.
An hour later (16:47) it happened again.
```
2024-04-14 15:23:18|183
2024-04-14 15:23:21|71
2024-04-14 15:28:15|254
2024-04-14 15:28:18|3
2024-04-14 15:45:02|214
2024-04-14 15:45:05|40
2024-04-14 15:47:14|206
2024-04-14 15:47:17|48
2024-04-14 16:47:14|198
2024-04-14 16:47:17|56
```
> to start narrowing down this problem, please provide the system information from Settings > System > Repairs > 3-dot-menu top right > System information (use the copy button at the bottom and paste it "as is" here)
Here you go
version | core-2024.4.3 |
---|---|
installation_type | Home Assistant OS |
dev | false |
hassio | true |
docker | true |
user | root |
virtualenv | false |
python_version | 3.12.2 |
os_name | Linux |
os_version | 6.6.25-haos |
arch | x86_64 |
timezone | Europe/Zurich |
config_dir | /config |
mhhh ... ok, let's try to check the full log file. To do so, just restart HA, wait until everything is loaded and the problem is seen again in your Pi-Hole, then download and provide the full log (Settings > System > Logs > "Download full log" button at the bottom)
This is the full log after the issue happened. Two notes:
The only line that is present in both the log from safe mode and the log from "normal mode" is mqtt's "The 'schema' option is deprecated, please remove it from your configuration".
```
2024-04-14 18:11:37.539 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration candy which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2024-04-14 18:11:37.540 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration <redacted> which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2024-04-14 18:11:37.540 WARNING (SyncWorker_3) [homeassistant.loader] We found a custom integration hacs which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
2024-04-14 18:11:38.637 WARNING (MainThread) [homeassistant.helpers.frame] Detected that custom integration 'hacs' accesses hass.components.frontend. This is deprecated and will stop working in Home Assistant 2024.9, it should be updated to import functions used from frontend directly at custom_components/hacs/frontend.py, line 68: hass.components.frontend.async_register_built_in_panel(, please create a bug report at https://github.com/hacs/integration/issues
2024-04-14 18:11:39.743 WARNING (MainThread) [homeassistant.components.mqtt] The 'schema' option is deprecated, please remove it from your configuration
2024-04-14 18:12:18.473 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/config_entries.py", line 1722, in async_reload
unload_result = await self.async_unload(entry_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/homeassistant/homeassistant/config_entries.py", line 1694, in async_unload
raise OperationNotAllowed(
homeassistant.config_entries.OperationNotAllowed: The config entry <redacted IP> (octoprint) with entry_id 0caaefcc97eaf33752e561ee64b265c1 cannot be unloaded because it is not in a recoverable state (ConfigEntryState.SETUP_IN_PROGRESS)
```
(the one time I use GH's search instead of Google...)
So this seems similar to https://github.com/home-assistant/core/issues/57378; there's also a massive thread on the forums about it: https://community.home-assistant.io/t/ha-spamming-ptr-dns-lookups/143687/79 and in the last message of the thread the finger was pointed at the fallback DNS, so...
```
[core-ssh /]$ ha dns info
fallback: true
host: 172.30.32.3
llmnr: true
locals:
- dns://<my Pi-Hole>
mdns: true
servers: []
update_available: false
version: 2024.04.0
version_latest: 2024.04.0
[core-ssh /]$ ha dns options --fallback=false
Command completed successfully.
[core-ssh /]$ ha dns info
fallback: false
host: 172.30.32.3
llmnr: true
locals:
- dns://<my Pi-Hole>
mdns: true
servers: []
update_available: false
version: 2024.04.0
version_latest: 2024.04.0
```
Let's see
Nevermind, didn't work :/
we need to increase the severity of your logger config to at least info - to do so, add the following to your configuration.yaml and restart HA; afterwards download and provide the full log again.
```yaml
logger:
  default: info
```
Further, please test whether disabling the ESPHome add-on has any effect on the problem.
I see that log entry every hour on my VM where I run HAOS 2024.4.3 (yes, VM inside a VM). The log entry is always accompanied by a DHCPREQUEST entry (coming from HA). I'll try giving HA a static IP. EDIT: No such luck. The DHCPREQUEST log entry is gone but the "maximum number of concurrent DNS queries reached" still appears.
@avandorp please provide same information as requested in https://github.com/home-assistant/core/issues/115570#issuecomment-2054082568, also the full log as mentioned in https://github.com/home-assistant/core/issues/115570#issuecomment-2054106066 with at least severity info as mentioned in https://github.com/home-assistant/core/issues/115570#issuecomment-2054153536
Core log doesn't show anything interesting, will set logging to `default: info` and report back. I find this line from the multicast log interesting: `mdns-repeater (6): dev hassio addr 172.30.32.1 mask 255.255.254.0 net 172.30.32.0`. That would be a 510-host private network if I'm not mistaken. But: homeassistant has an "external" IP address in the 192.168.122.0/24 range (virtual network bridge from kvm/libvirt).
Here's the info-level log surrounding the event after a Core restart. The PTR spam was recorded by Pi-Hole at 11:43:41. I'll share the full log if needed, but only in private (not sure if there's an uploading service, or I can email it directly to you), though I warn you there is nothing relevant in it.
```
2024-04-15 11:43:40.610 INFO (MainThread) [homeassistant.bootstrap] Home Assistant initialized in 41.81s
2024-04-15 11:43:40.610 INFO (MainThread) [homeassistant.core] Starting Home Assistant
2024-04-15 11:43:40.612 INFO (MainThread) [custom_components.hacs] Stage changed: startup
2024-04-15 11:43:40.616 INFO (MainThread) [homeassistant.components.automation.turn_on_night_mode] Initialized trigger House Mode: Night
2024-04-15 11:43:40.617 INFO (MainThread) [homeassistant.components.automation.turn_off_night_mode] Initialized trigger House Mode: Day
2024-04-15 11:43:40.617 INFO (MainThread) [homeassistant.components.automation.kitchen_sink_light_motion_sensor] Initialized trigger Kitchen Sink Light - Motion Sensor
2024-04-15 11:43:40.617 INFO (MainThread) [homeassistant.components.automation.house_mode_sleep] Initialized trigger House Mode: Sleep
2024-04-15 11:43:40.617 INFO (MainThread) [homeassistant.components.automation.hallway_light_motion] Initialized trigger Hallway Light Motion
2024-04-15 11:43:40.617 INFO (MainThread) [homeassistant.components.automation.house_mode_dimmer_pressed] Initialized trigger House Mode: Button or Door
2024-04-15 11:43:40.618 INFO (MainThread) [homeassistant.components.automation.notify_via_telegram_when_the_washing_machine_is_done] Initialized trigger Notify via Telegram when the washing machine is done
2024-04-15 11:43:40.618 INFO (SyncWorker_6) [homeassistant.loader] Loaded kodi from homeassistant.components.kodi
2024-04-15 11:43:40.623 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/config_entries.py", line 1722, in async_reload
unload_result = await self.async_unload(entry_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/homeassistant/homeassistant/config_entries.py", line 1694, in async_unload
raise OperationNotAllowed(
homeassistant.config_entries.OperationNotAllowed: The config entry
```
Also, because I'm pretty sure you were going to ask at some point, I went ahead and did the same with debug logging enabled. I don't really feel comfortable sharing it at all, but my read-through shows some "suspicious" things (the event happened at 11:51:07 this time):
```
2024-04-15 11:56:06.134 DEBUG (MainThread) [async_upnp_client.server] Start advertisements announcer
[...]
2024-04-15 11:56:06.140 DEBUG (MainThread) [async_upnp_client.server] Announcing
2024-04-15 11:56:06.140 DEBUG (MainThread) [async_upnp_client.server] Sending advertisement, NTS: ssdp:alive, NT: upnp:rootdevice, USN: UUID:
```

There are a lot of `homeassistant.components.recorder.core] Processing task` entries between 11:56:06.229 and 11:56:06.265, but most of them are `Event state_changed`, so I don't think they're relevant.

```
2024-04-15 11:56:08.121 DEBUG (SyncWorker_4) [paho.mqtt.client] Sending PUBLISH (d0, q0, r0, m3), 'b'homeassistant/status'', ... (6 bytes)
2024-04-15 11:56:08.121 DEBUG (MainThread) [homeassistant.components.mqtt.client] Transmitting message on homeassistant/status: 'online', mid: 3, qos: 0
2024-04-15 11:56:07.017 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -79)>}, []]
2024-04-15 11:56:07.029 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.029 DEBUG (MainThread) [aiogithubapi] 'GitHubRateLimitResourcesModel' is missing key 'dependency_snapshots' for <class 'dict'>
2024-04-15 11:56:07.029 DEBUG (MainThread) [aiogithubapi] 'GitHubRateLimitResourcesModel' is missing key 'audit_log' for <class 'dict'>
2024-04-15 11:56:07.029 DEBUG (MainThread) [aiogithubapi] 'GitHubRateLimitResourcesModel' is missing key 'code_search' for <class 'dict'>
2024-04-15 11:56:07.029 DEBUG (MainThread) [custom_components.hacs] Can update 396 repositories, items in queue 4
2024-04-15 11:56:07.029 DEBUG (MainThread) [custom_components.hacs] <QueueManager> Checking out tasks to execute
2024-04-15 11:56:07.029 DEBUG (MainThread) [custom_components.hacs] <QueueManager> Starting queue execution for 4 tasks
2024-04-15 11:56:07.029 DEBUG (MainThread) [custom_components.hacs] <Plugin iantrich/config-template-card> Getting repository information
2024-04-15 11:56:07.030 DEBUG (MainThread) [custom_components.hacs] <Integration hacs/integration> Getting repository information
2024-04-15 11:56:07.031 DEBUG (MainThread) [custom_components.hacs] <Integration ofalvai/home-assistant-candy> Getting repository information
2024-04-15 11:56:07.031 DEBUG (MainThread) [custom_components.hacs] <Plugin thomasloven/lovelace-layout-card> Getting repository information
2024-04-15 11:56:07.032 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -71)>}, []]
2024-04-15 11:56:07.082 DEBUG (MainThread) [aioesphomeapi.connection] nodemcu-bt-bedroom @ <redacted>: Got message of type BluetoothLERawAdvertisementsResponse: advertisements {
address: <redacted>
rssi: -69
data: "<redacted>"
}
2024-04-15 11:56:07.294 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -73)>}, []]
2024-04-15 11:56:07.324 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -78)>}, []]
2024-04-15 11:56:07.378 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -69)>}, []]
2024-04-15 11:56:07.388 DEBUG (MainThread) [aioesphomeapi.connection] nodemcu-bt-bedroom @ <redacted>: Got message of type BluetoothLERawAdvertisementsResponse: advertisements {
address: <redacted>
rssi: -77
data: "<redacted>"
}
2024-04-15 11:56:07.466 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.467 DEBUG (MainThread) [custom_components.hacs] <Integration ofalvai/home-assistant-candy> Running checks against 0.8.2
2024-04-15 11:56:07.507 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.509 DEBUG (MainThread) [custom_components.hacs] <Plugin thomasloven/lovelace-layout-card> Running checks against v2.4.5
2024-04-15 11:56:07.555 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -70)>}, []]
2024-04-15 11:56:07.595 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.596 DEBUG (MainThread) [custom_components.hacs] <Plugin iantrich/config-template-card> Running checks against 1.3.6
2024-04-15 11:56:07.630 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -79)>}, []]
2024-04-15 11:56:07.761 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -80)>}, []]
2024-04-15 11:56:07.804 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.806 DEBUG (MainThread) [custom_components.hacs] <Integration hacs/integration> Running checks against 1.34.0
2024-04-15 11:56:07.815 DEBUG (MainThread) [bleak.backends.bluezdbus.manager] received D-Bus signal: org.freedesktop.DBus.Properties.PropertiesChanged (/org/bluez/hci0/<redacted>): ['org.bluez.Device1', {'RSSI': <dbus_fast.signature.Variant ('n', -72)>}, []]
2024-04-15 11:56:07.898 DEBUG (MainThread) [aioesphomeapi.connection] nodemcu-kitchen @ <redacted>: Got message of type SensorStateResponse: key: <redacted>
state: 25.1
2024-04-15 11:56:07.910 DEBUG (MainThread) [aioesphomeapi.connection] nodemcu-kitchen @ <redacted>: Got message of type SensorStateResponse: key: <redacted>
state: 32
2024-04-15 11:56:07.970 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.971 DEBUG (MainThread) [custom_components.hacs] <Integration ofalvai/home-assistant-candy> Getting documentation for version=0.8.2,filename=README.md
2024-04-15 11:56:07.971 DEBUG (MainThread) [custom_components.hacs] Trying to download https://raw.githubusercontent.com/ofalvai/home-assistant-candy/0.8.2/README.md
2024-04-15 11:56:07.981 DEBUG (MainThread) [aiogithubapi] 'GitHubResponseHeadersModel' is missing key 'x_github_api_version_selected' for <class 'str'>
2024-04-15 11:56:07.981 DEBUG (MainThread) [custom_components.hacs] <Plugin thomasloven/lovelace-layout-card> Getting documentation for version=v2.4.5,filename=README.md
2024-04-15 11:56:07.981 DEBUG (MainThread) [custom_components.hacs] Trying to download https://raw.githubusercontent.com/thomasloven/lovelace-layout-card/v2.4.5/README.md
```

It ends at 2024-04-15 11:56:12.124 with `DEBUG (MainThread) [homeassistant.helpers.http] Serving /api/error_log to 192.168.44.11 (auth: True)`.
@bdraco sorry for the tag, I am wondering if this is some form of regression from #57378 that you worked on a few years ago? It's only scanning the /24 where HAOS resides but it also only started happening on 2024-03-29
@avandorp the `172.30.32.1` is from Docker's networking; I have the same `172.30.32.0/23` for Docker, but my HAOS is on `192.168.x.x/24`. (`/23`, a.k.a. `255.255.254.0`, is a 512-IP subnet, yes.)
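For what it's worth, Python's stdlib confirms the math (purely illustrative, not anything HA runs):

```python
from ipaddress import ip_network

net = ip_network("172.30.32.0/23")  # mask 255.255.254.0
print(net.num_addresses)       # 512 addresses in the block
print(net.num_addresses - 2)   # 510 usable hosts (network + broadcast excluded)
```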
It's normal and expected for discovery to do that, and it always has. What that issue fixed was to turn off scanning for very large networks.
It's probably quite a bit faster now in 2024.4.x, though, as there were some bottlenecks fixed, so you see the queries grouped much closer together.
We could probably add a rate limit to reduce the number of queries per second, but need to think about that one as it will make discovery slower for everybody.
> It's normal and expected for discovery to do that, and it always has.
Is it possible to disable discovery? It won't find anything useful ever for my setup. In the meantime I've simply disabled dns on the virtual network bridge.
It's not currently configurable.
You can remove `default_config:` from your configuration.yaml and manually add all the components except `dhcp:`, but that's also going to break all the other DHCP discovery methods, including any integration that is relying on it working to get IP updates.
> It won't find anything useful ever for my setup.
Keep in mind that many integrations use discovery to see IP updates and automatically update their config entries.
If you have assigned static IPs to all your devices and do not use DHCP, then it probably is not so useful for your setup and removing `dhcp` probably won't matter.
The rate limit turned out to not be that bad at slowing things down, so I added one in https://github.com/home-assistant/core/pull/115823 (64 concurrent max, which should be well under the 150 limit).
That should get it closer to how it worked before 2024.4.x
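For anyone curious, the technique is essentially bounding the number of in-flight reverse lookups. Here's a minimal asyncio sketch - this is not the code from that PR, just an illustration; the 64 cap comes from the comment above and the subnet prefix is a placeholder:

```python
import asyncio
import socket

MAX_CONCURRENT = 64  # cap mentioned above; keeps us well under Pi-Hole's 150 limit

async def reverse_lookup(ip: str, sem: asyncio.Semaphore) -> tuple[str, str | None]:
    """Resolve one PTR record while holding a semaphore slot."""
    async with sem:
        loop = asyncio.get_running_loop()
        try:
            # socket.gethostbyaddr blocks, so run it in the default executor
            hostname, _aliases, _addrs = await loop.run_in_executor(
                None, socket.gethostbyaddr, ip
            )
            return ip, hostname
        except OSError:
            return ip, None  # no PTR record, the common case on home networks

async def scan_subnet(prefix: str = "192.168.1.") -> list[tuple[str, str | None]]:
    """Issue PTR lookups for a whole /24, at most MAX_CONCURRENT at a time."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(
        *(reverse_lookup(f"{prefix}{host}", sem) for host in range(1, 255))
    )

if __name__ == "__main__":
    for ip, name in asyncio.run(scan_subnet()):
        if name:
            print(ip, "->", name)
```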
Thank you very much!
Change is targeted for 2024.5.x
Apologies if I should open a separate issue for this or am otherwise failing to follow GitHub decorum, but I may have an issue related to this one. The pushed change/fix has mitigated my issue, but not fully fixed it, if I'm understanding my logs correctly.
I was having the same `Maximum number of concurrent DNS queries` problem and noticed the hourly PTR request 'spam'. It appears that burst of requests is now stretched over the course of a minute or two.

The requests span the whole /24 subnet as expected, but in my case appear to do so 5-10x each time (once per hour or on HAOS reboot), rather than only once. I can't tell if HA is dissatisfied with the reply ("N/A" or empty for the majority of IPs checked) from the PiHole and router, or if there's some other root cause. The change pushed in 2024.5 does not appear to have fully resolved my particular version of the issue; it did, however, make it a little easier for the PiHole to handle.
Appreciate any and all advice, thanks.
I'm just guessing here, but is there any chance you have 4 DNS servers configured as upstream, and IPv6 enabled on your Home Assistant machine?
Maybe Pi-Hole is requesting A and AAAA to each server when it doesn't get a reply, making ~8 requests every time?
FWIW you can check exactly how many requests it's doing by querying `pihole-FTL.db` with SQLite (if you're comfortable doing that).
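If you want to try, here's a small Python sketch of that check (the database path is Pi-Hole's default; `HA_IP` is a placeholder for your HA host, and `type = 6` is PTR as noted above):

```python
import sqlite3

HA_IP = "10.0.0.50"  # placeholder: your Home Assistant host's IP
DB = "/etc/pihole/pihole-FTL.db"  # Pi-Hole's default long-term database path

# Open read-only so we don't interfere with pihole-FTL writing to the file
con = sqlite3.connect(f"file:{DB}?mode=ro", uri=True)
rows = con.execute(
    """
    SELECT strftime('%Y-%m-%d %H:%M', timestamp, 'unixepoch', 'localtime') AS minute,
           status,
           COUNT(*) AS n
    FROM queries
    WHERE client = ? AND type = 6   -- type 6 = PTR
    GROUP BY minute, status
    ORDER BY minute
    """,
    (HA_IP,),
).fetchall()
con.close()
for minute, status, n in rows:
    print(minute, status, n)
```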
At least in the HA Network settings UI, I only have one IPv4 DNS server configured, 10.0.0.1, which is also the gateway address. My router is the DHCP and first DNS (for its caching), and then forwards cache misses to the PiHole as its upstream. I imagine the PiHole then - unfortunately - ends up asking the router for information on hostnames that I don't have statically set in Local DNS on the PiHole (I have Conditional Forwarding set). The PiHole is set to never forward non-FQDN or reverse lookups for private IP ranges. I have 3 WAN upstreams on the PiHole.
I do not have IPv6 enabled on the router, although I do notice I have IPv6 left on `Automatic` in HA Network settings. I'll see if behavior changes when I disable it in HA, given my LAN doesn't support it. (For now at least; I do need to configure it soon to get Matter+Thread onboarding to work right.)
It's ~2500 requests each burst/group each hour, with probably less than 30 coming back with a result. So give or take, assuming it's not treating the PiHole as a unique IPv6 DNS upstream, 10-ish retries per IP. 5-ish if v4 and v6 are separate.
--
Just rebooted HA to trigger a PTR spam-wave with IPv6 disabled this time. The behavior looks the same including ~2200 requests or so, so let me try to give some more details as to how each set of requests looks in the queries log.
The PTR requests appear to start in sequence, 10.0.0.2-.64, rate-limit-batched inside of 3 seconds. 10 or so come back as `OK, DOMAIN` - I believe they provide hostnames. All else are marked as `Retried, N/A` for Status and Reply. I see what I assume are the retries come flooding in while this first batch is still being processed, all marked `OK, N/A`. Another 2 seconds or so and the retries and first batch of 64 are all done, marked `OK, N/A`.

I imagine it's retrying because it expects an answer? Then the actual problem occurs: this batch of 64 appears to repeat `OK, N/A` at least 4-5x, but I can't tell if it's HA asking over and over, or PiHole retrying over and over without saying it's a Retry.

The above behavior repeats, traversing through the subnet, with some variation as to when 'fresh' calls (I assume) are made, marked as `Retried` vs. `OK, N/A`. Total is between 2000-2500, leaning towards 2500 each time.
Perhaps getting off topic for this github issue, but I am curious if I need LAN domain/hostname resolution at all in HA, so if there's a way to disable solely that portion of its regular network perusal, I'm all ears.
Edit for more important info I missed: All the requests are formatted as `x.0.0.10.in-addr.arpa`. I'm unsure if that's standard for reverse-DNS requests, so adding it here.
Edit2: I can bypass the issue by disabling Conditional Forwarding in PiHole so it replies with `NXDOMAIN` instead, but this seems like a band-aid fix for something I only notice with HomeAssistant.
Edit3: It appears the issue might actually be the Conditional Forwarding creating a DNS loop for unknown domains, and it's only prevalent with HA because of how many requests it makes, and how often. In my network layout, if the router doesn't know, it can ask PiHole. If PiHole doesn't know, it shouldn't Forward, it should reply nxdomain. User error wins again, sorry folks.
Hmm ok, I suspect this is not really related to this issue, and I would encourage you to open a separate one.
Before you do that though, you should check who is actually causing the PTR spam; there is a possibility that this is your router/DNS forwarder retrying queries that failed[^1].
The best way to check this IMO would be to sniff the traffic coming out of HA during a restart and see if it sends 254 requests or ~2500.
If you can't easily do that, maybe you could also change your HA settings to point directly to your Pi-Hole and see how many queries are recorded during restarts (do a couple, just to be sure the setting took hold?).
I'd also be curious to understand how the forwarder tells Pi-Hole that it's HA doing those requests (to the best of my knowledge, barring some magic routing/ARP on the router side, it should not be able to impersonate a device that already has another IP address), or do all queries on the Pi-Hole show up as made by 10.0.0.1 (or equivalent)?
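If you can sniff, something along these lines would count the queries at the source (a rough sketch using scapy, which you'd need to install; `HA_IP` is a placeholder, and you'd run it wherever HA's DNS traffic is visible):

```python
from collections import Counter

from scapy.all import DNS, DNSQR, IP, sniff  # pip install scapy

HA_IP = "10.0.0.50"  # placeholder: your HA host's address
counts: Counter[str] = Counter()

def on_packet(pkt) -> None:
    # qr == 0 means the packet is a query rather than a response
    if (
        pkt.haslayer(DNS)
        and pkt.haslayer(IP)
        and pkt[IP].src == HA_IP
        and pkt[DNS].qr == 0
    ):
        counts[pkt[DNSQR].qname.decode()] += 1

# Capture for two minutes across a restart; ~254 unique names asked once
# vs. ~2500 total queries would point at retries happening upstream of HA.
sniff(filter="udp port 53", prn=on_packet, timeout=120)
print(f"{sum(counts.values())} queries for {len(counts)} unique names")
```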
A few more things to check:

* the `Type` column in Pi-Hole (`A` vs `AAAA` should tell you how many of the queries are unique)
* when you manually query a domain that doesn't exist (e.g. `doesnotexists1337.example.org`), how many queries are recorded on Pi-Hole and of which `Type`
> Perhaps getting off topic for this github issue, but I am curious if I need LAN domain/hostname resolution at all in HA, so if there's a way to disable solely that portion of its regular network perusal, I'm all ears.
There is, but it's hacky; see the thread above from this comment onwards.
> Edit for more important info I missed: All the requests are formatted as `x.0.0.10.in-addr.arpa`. I'm unsure if that's standard for reverse-DNS requests, so adding it here.
That is expected and working as intended: `x.0.0.10.in-addr.arpa` is the "hostname" that you ask the DNS about. Let's say you have `server123.local` mapped to `10.0.0.123`; you know the IP, but don't remember the hostname. To find the hostname you ask the DNS server for the `PTR` record of `123.0.0.10.in-addr.arpa` (that's `10.0.0.123` reversed) and, if your upstream DNS server has reverse mappings (which it often doesn't, at least in home deployments), it'll return `server123.local`.
Wikipedia explains it better than I can :)
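If you want to see the mapping for yourself, Python's stdlib can build these names (just an illustration):

```python
from ipaddress import ip_address

# The PTR name is the address with its octets reversed, under in-addr.arpa
print(ip_address("10.0.0.123").reverse_pointer)
# -> 123.0.0.10.in-addr.arpa
```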
Side note: when I asked about upstream I should've been clearer: I meant your Pi-Hole upstreams (http://pi.hole/admin/settings.php?tab=dns), and I wondered if Pi-Hole was retrying the query 4 times; if you don't have IPv6 then it's very unlikely for you to experience 8x/9x retries every time (~2000/2500 queries divided by 254 IPs), unless you have some strange settings and a lot of upstream DNS servers for Pi-Hole.
[^1]: For example: for every HA DNS PTR request that is made to 10.0.0.1 and fails, the forwarder (10.0.0.1) retries it 10 times. You have at least 2 queries that likely succeed (HA itself and the gateway). Say your subnet has 10 active IPs including HA and the gateway: you would see in this case 10 `OK` requests + (254-10)*10 `N/A` = 10+2440 = 2450 total queries in Pi-Hole.
Only saw your edits after posting the comment!
> Edit3: It appears the issue might actually be the Conditional Forwarding creating a DNS loop for unknown domains, and it's only prevalent with HA because of how many requests it makes, and how often. User error wins again, sorry folks.
Heh, it happens :) Glad you figured it out.
These things always make sense AFTERWARDS, not while you're struggling to fix them. But IIUC from your sentence, 10.0.0.1 is forwarding to Pi-Hole, which in turn (when Conditional Forwarding is enabled) forwards stuff it can't find to 10.0.0.1, which in turn sends it to Pi-Hole, [...]
I don't know what your network looks like, but you could consider using Pi-Hole directly as your DNS server; it does its own caching (it's kind of annoying sometimes, because when I change something on its upstream I need to restart it to make it pick up new stuff). The prerequisites are not that bad and you could try benchmarking before switching 🤷
The problem
I got a warning from Pi-Hole about `Maximum number of concurrent DNS queries reached (max: 150)` (Pi-Hole docs), and this was caused at 04:19:33 by my HAOS VM.

When I looked into it I found that, starting 2024-03-29 (that's the earliest entry in Pi-Hole's database, and I haven't wiped it), every hour my HAOS VM sends PTR queries for its entire /24 subnet (254 IPs). It also seems to happen on HA Core boot. I don't know how to check what happened on 2024-03-29 (`name: addon_5c53de3b_esphome_2024.3.0` seems to imply I was on 2024.3.0 at the time).

It looks like the queries are sent at the same exact moment (whether in parallel or in a very fast loop I don't know) and I don't know why this is happening, but I suspect that on larger subnets this may actually cause issues for DNS servers.
What version of Home Assistant Core has the issue?
core-2024.4.3
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
No response
Link to integration documentation on our website
No response
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
Installed add-ons: ESPHome (2024.3.2), Terminal & SSH (9.13.0), Mosquitto broker (6.4.0). I checked their logs using `ha host logs --identifier <identifier>` (got the identifiers from `ha host logs identifiers`) and nothing out of the ordinary there (times don't match). I checked DHCP lease changes (`ha host logs -t NetworkManager`) and the times don't match.

Below is a query to Pi-Hole's database; `type=6` is PTR in Pi-Hole.

Expand for the full query results

```
sqlite> select datetime(timestamp, 'unixepoch', 'localtime') as dt, count(*) from queries where client = '
```