home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
70.53k stars 29.45k forks source link

ZeroConf fails to start on reboot, requires HA restart to clear error #76627

Closed randomhammer closed 1 year ago

randomhammer commented 2 years ago

The problem

Whenever I reboot the server HA Core is running on, ZeroConf and all dependent services fail to restart. The root error seems to be this:

2022-08-11 14:57:49.938 WARNING (MainThread) [homeassistant.components.network] Because the system does not have any enabled IPv4 addresses, source address detection may be inaccurate
2022-08-11 14:57:50.467 ERROR (MainThread) [homeassistant.setup] Error during setup of component zeroconf
Traceback (most recent call last):
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/setup.py", line 235, in _async_setup_component
    result = await task
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/components/zeroconf/__init__.py", line 188, in async_setup
    aio_zc = await _async_get_instance(hass, **zc_args)
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/components/zeroconf/__init__.py", line 129, in _async_get_instance
    zeroconf = HaZeroconf(**zcargs)
  File "/opt/homeassistant/lib/python3.10/site-packages/zeroconf/_core.py", line 450, in __init__
    listen_socket, respond_sockets = create_sockets(interfaces, unicast, ip_version, apple_p2p=apple_p2p)
  File "/opt/homeassistant/lib/python3.10/site-packages/zeroconf/_utils/net.py", line 366, in create_sockets
    add_multicast_member(cast(socket.socket, listen_socket), i)
  File "/opt/homeassistant/lib/python3.10/site-packages/zeroconf/_utils/net.py", line 288, in add_multicast_member
    listen_socket.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, _value)
OSError: [Errno 19] No such device

I looked at one other solution, the one for putting a default_interface in #35281 but having a bit of trouble following the advice since there is no example configuration.yaml in that issue.

After issuing a restart from the developer tools everything works fine. There is also an error about no interfaces being available

Is systemd starting this before the network is ready?

What version of Home Assistant Core has the issue?

core-2022.8.2

What was the last working version of Home Assistant Core?

core-2022.7.5

What type of installation are you running?

Home Assistant Core

Integration causing the issue

ZeroConf Component

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2022-08-11 15:07:50.328 ERROR (MainThread) [frontend.js.latest.202208020] :0:0 Script error.
root@tesla:/home/hass/.homeassistant# cat home-assistant.log.1
2022-08-11 14:57:49.938 WARNING (MainThread) [homeassistant.components.network] Because the system does not have any enabled IPv4 addresses, source address detection may be inaccurate
2022-08-11 14:57:50.467 ERROR (MainThread) [homeassistant.setup] Error during setup of component zeroconf
Traceback (most recent call last):
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/setup.py", line 235, in _async_setup_component
    result = await task
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/components/zeroconf/__init__.py", line 188, in async_setup
    aio_zc = await _async_get_instance(hass, **zc_args)
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/components/zeroconf/__init__.py", line 129, in _async_get_instance
    zeroconf = HaZeroconf(**zcargs)
  File "/opt/homeassistant/lib/python3.10/site-packages/zeroconf/_core.py", line 450, in __init__
    listen_socket, respond_sockets = create_sockets(interfaces, unicast, ip_version, apple_p2p=apple_p2p)
  File "/opt/homeassistant/lib/python3.10/site-packages/zeroconf/_utils/net.py", line 366, in create_sockets
    add_multicast_member(cast(socket.socket, listen_socket), i)
  File "/opt/homeassistant/lib/python3.10/site-packages/zeroconf/_utils/net.py", line 288, in add_multicast_member
    listen_socket.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, _value)
OSError: [Errno 19] No such device
2022-08-11 14:57:50.858 ERROR (MainThread) [homeassistant.setup] Unable to set up dependencies of homekit_controller. Setup failed for dependencies: zeroconf
2022-08-11 14:57:50.859 ERROR (MainThread) [homeassistant.setup] Setup failed for homekit_controller: (DependencyError(...), 'Could not setup dependencies: zeroconf')
2022-08-11 14:57:50.860 ERROR (MainThread) [homeassistant.setup] Unable to set up dependencies of apple_tv. Setup failed for dependencies: zeroconf
2022-08-11 14:57:50.866 ERROR (MainThread) [homeassistant.setup] Setup failed for apple_tv: (DependencyError(...), 'Could not setup dependencies: zeroconf')
2022-08-11 14:57:51.994 WARNING (MainThread) [homeassistant.config_entries] Config entry 'AirNow AQI' for airnow integration not ready yet: Cannot connect to host www.airnowapi.org:80 ssl:default [Temporary failure in name resolution]; Retrying in background
2022-08-11 14:57:52.313 WARNING (MainThread) [homeassistant.config_entries] Config entry 'OpenWeatherMap' for openweathermap integration not ready yet: HTTPSConnectionPool(host='api.openweathermap.org', port=443): Max retries exceeded with url: /data/2.5/onecall?APPID=416e514f8e437c17402982ff081b772f&lang=en&lon=-90.058&lat=30.3907 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f11a9cabbe0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')); Retrying in background
2022-08-11 14:57:52.334 ERROR (MainThread) [homeassistant.components.ambient_station] Config entry failed: Connection error
2022-08-11 14:57:52.338 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Weather Station' for ambient_station integration not ready yet: Connection error; Retrying in background
2022-08-11 14:57:52.359 ERROR (MainThread) [homeassistant] Error doing job: Unclosed client session
2022-08-11 14:57:52.447 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Yamaha Receiver' for yamaha_musiccast integration not ready yet; Retrying in background
2022-08-11 14:57:52.543 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Hue Bridge' for hue integration not ready yet: Error connecting to the Hue bridge at 192.168.1.8; Retrying in background
2022-08-11 14:57:52.547 WARNING (MainThread) [pylutron_caseta.smartbridge] Reconnecting...
Traceback (most recent call last):
  File "/opt/homeassistant/lib/python3.10/site-packages/pylutron_caseta/smartbridge.py", line 437, in _monitor_once
    self._leap = await self._connect()
  File "/opt/homeassistant/lib/python3.10/site-packages/pylutron_caseta/smartbridge.py", line 115, in _connect
    res = await open_connection(
  File "/opt/homeassistant/lib/python3.10/site-packages/pylutron_caseta/leap.py", line 178, in open_connection
    reader, writer = await asyncio.open_connection(host, port, limit=limit, **kwds)
  File "/opt/python3.10/lib/python3.10/asyncio/streams.py", line 47, in open_connection
    transport, _ = await loop.create_connection(
  File "/opt/python3.10/lib/python3.10/asyncio/base_events.py", line 1064, in create_connection
    raise exceptions[0]
  File "/opt/python3.10/lib/python3.10/asyncio/base_events.py", line 1049, in create_connection
    sock = await self._connect_sock(
  File "/opt/python3.10/lib/python3.10/asyncio/base_events.py", line 960, in _connect_sock
    await self.sock_connect(sock, address)
  File "/opt/python3.10/lib/python3.10/asyncio/selector_events.py", line 500, in sock_connect
    return await fut
  File "/opt/python3.10/lib/python3.10/asyncio/selector_events.py", line 505, in _sock_connect
    sock.connect(address)
OSError: [Errno 101] Network is unreachable
2022-08-11 14:57:52.558 ERROR (MainThread) [metno] Access to https://aa015h6buqvih86i1.api.met.no/weatherapi/locationforecast/2.0/complete returned error 'ClientConnectorError'
2022-08-11 14:57:52.559 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Weather' for met integration not ready yet: Update failed: ; Retrying in background
2022-08-11 14:57:52.811 WARNING (MainThread) [homeassistant.config_entries] Config entry 'OpenMeteo Weather' for open_meteo integration not ready yet: Open-Meteo API communication error; Retrying in background
2022-08-11 14:57:52.837 WARNING (MainThread) [homeassistant.components.network] Because the system does not have any enabled IPv4 addresses, source address detection may be inaccurate
2022-08-11 14:57:52.838 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry HASS Bridge:21064 for homekit
Traceback (most recent call last):
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/config_entries.py", line 357, in async_setup
    result = await component.async_setup_entry(hass, self)
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/components/homekit/__init__.py", line 285, in async_setup_entry
    CONF_IP_ADDRESS, await network.async_get_source_ip(hass, MDNS_TARGET_IP)
  File "/opt/homeassistant/lib/python3.10/site-packages/homeassistant/components/network/__init__.py", line 57, in async_get_source_ip
    raise HomeAssistantError(
homeassistant.exceptions.HomeAssistantError: Could not determine source ip because the system does not have any enabled IPv4 addresses and creating a socket failed
2022-08-11 14:57:52.860 ERROR (MainThread) [homeassistant.setup] Unable to set up dependencies of default_config. Setup failed for dependencies: zeroconf
2022-08-11 14:57:52.860 ERROR (MainThread) [homeassistant.setup] Setup failed for default_config: (DependencyError(...), 'Could not setup dependencies: zeroconf')
2022-08-11 14:57:54.550 WARNING (MainThread) [pylutron_caseta.smartbridge] Reconnecting...
Traceback (most recent call last):
  File "/opt/homeassistant/lib/python3.10/site-packages/pylutron_caseta/smartbridge.py", line 437, in _monitor_once
    self._leap = await self._connect()
  File "/opt/homeassistant/lib/python3.10/site-packages/pylutron_caseta/smartbridge.py", line 115, in _connect
    res = await open_connection(
  File "/opt/homeassistant/lib/python3.10/site-packages/pylutron_caseta/leap.py", line 178, in open_connection
    reader, writer = await asyncio.open_connection(host, port, limit=limit, **kwds)
  File "/opt/python3.10/lib/python3.10/asyncio/streams.py", line 47, in open_connection
    transport, _ = await loop.create_connection(
  File "/opt/python3.10/lib/python3.10/asyncio/base_events.py", line 1064, in create_connection
    raise exceptions[0]
  File "/opt/python3.10/lib/python3.10/asyncio/base_events.py", line 1049, in create_connection
    sock = await self._connect_sock(
  File "/opt/python3.10/lib/python3.10/asyncio/base_events.py", line 960, in _connect_sock
    await self.sock_connect(sock, address)
  File "/opt/python3.10/lib/python3.10/asyncio/selector_events.py", line 500, in sock_connect
    return await fut
  File "/opt/python3.10/lib/python3.10/asyncio/selector_events.py", line 505, in _sock_connect
    sock.connect(address)
OSError: [Errno 101] Network is unreachable
2022-08-11 14:57:59.328 WARNING (MainThread) [homeassistant.components.zha.core.channels.base] [0xBE5A:1:0x0006]: async_initialize: all attempts have failed: [DeliveryError('[0xbe5a:1:0x0006]: Message send failure'), DeliveryError('[0xbe5a:1:0x0006]: Message send failure'), DeliveryError('[0xbe5a:1:0x0006]: Message send failure'), DeliveryError('[0xbe5a:1:0x0006]: Message send failure')]
2022-08-11 14:58:02.339 ERROR (MainThread) [homeassistant] Error doing job: Unclosed client session

Additional information

I am running on a fast machine with a lot of resources (32GB RAM, 8 CPU cores) and Debian Bullseye. Here is the unit file that I'm using to start the service:

[Unit]
Description=Home Assistant
Wants=avahi-daemon.service network-online.target multi-user.target
After=network-online.target avahi-daemon.service mariadb.service
StartLimitInterval=0

[Service]
Type=simple
User=hass
WorkingDirectory=/home/hass/.homeassistant
ExecStart=/opt/homeassistant/bin/hass -c "/home/hass/.homeassistant"
Restart=on-failure
RestartSec=5
RestartForceExitStatus=100

[Install]
WantedBy=multi-user.target
Alias=hass.service
randomhammer commented 2 years ago

Also wanted to note, no Docker anywhere on this machine. Only running HA Core in a virtualenv

probot-home-assistant[bot] commented 2 years ago

zeroconf documentation zeroconf source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @bdraco, mind taking a look at this issue as it has been labeled with an integration (zeroconf) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

probot-home-assistant[bot] commented 2 years ago

network documentation network source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @home-assistant/core, mind taking a look at this issue as it has been labeled with an integration (network) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

probot-home-assistant[bot] commented 2 years ago

lutron_caseta documentation lutron_caseta source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @swails, mind taking a look at this issue as it has been labeled with an integration (lutron_caseta) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

randomhammer commented 2 years ago

It looks like probot-home-assistant may have spammed this out to a bunch of different folks based on my log entries. The root issue seems to be that HomeAssistant does not see an IP address when the machine reboots. I think that is why ZeroConf fails, and ZeroConf failing is why the other integrations fail. Does anyone have any idea why there is no IPv4 address when Home Assistant starts?

randomhammer commented 1 year ago

Nobody else has this issue? It is 100% repeatable for me

bdraco commented 1 year ago

It looks like Home Assistant is being started before the system has an IP address. Try setting a static one

2022-08-11 14:57:49.938 WARNING (MainThread) [homeassistant.components.network] Because the system does not have any enabled IPv4 addresses, source address detection may be inaccurate

dust0vich commented 1 year ago

I have this exact same issue. Haven't found a solution yet. I tried changing from network.target to network-online.target in the unit file. No success.

dust0vich commented 1 year ago

https://wiki.archlinux.org/title/Systemd#Running_services_after_the_network_is_up

Check here, make sure that the proper wait service is enabled. I also added the nss-lookup.target to mine and now it works just fine. Also type=idle would make it wait until all the other services are complete before loading, might also help. I am using DHCP and it works now.

github-actions[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment šŸ‘ This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

randomhammer commented 1 year ago

For anyone who finds this in the future, there was only one way to fix the problem. What happens on reboot is that systemd sucks waiting to start services until the network is running, even with an appropriate network monitor. My solution was to simply delay the start of Home Assistant for 15 seconds by modifying the systemd unit file. This is brute force, but works all of the time. 15 seconds sufficient for the interfaces on my server to become available, and when they do ZeroConf starts just fine. The line I added to the unit is this:

ExecStartPre=/bin/sleep 15

Note that this means that restarting the unit always waits 15 seconds, but at least when my server reboots I don't need to restart HA again to clear the ZeroConf error.