home-assistant / supervisor

:house_with_garden: Home Assistant Supervisor
https://home-assistant.io/hassio/
Apache License 2.0

supervisor 2024.03 breaks HA core 2023.11.2 #4961

Closed: szymucha94 closed this issue 7 months ago

szymucha94 commented 8 months ago

Describe the issue you are experiencing

I'm running HAOS. I updated Supervisor from 2024.02.1 to 2024.03 and immediately noticed:

  1. Broken OpenWeatherMap integration (connections to the provider's API failed)
  2. Broken MQTT broker (core-mosquitto; Home Assistant failed to connect to the MQTT server with "[Errno -3] Try again"), so basically all z2m devices were unavailable
  3. Broken, unresponsive HACS

Now I'll have to waste multiple hours manually restoring the state from 1h ago, because the backup HA doesn't accept Docker backups from a newer Supervisor. Since Supervisor updates are basically forced down users' throats by automatic updates, could you please be more careful with testing this? Reverting the image immediately fixed all issues: HACS works, z2m (MQTT) works, and OWM immediately got fresh updates.

Interesting errors from the HA core logs:
2024-03-14 14:30:26.953 ERROR (MainThread) [homeassistant.components.mqtt.client] Failed to connect to MQTT server due to exception: [Errno -3] Try again
custom_components.hacs.exceptions.HacsException: Request exception for 'https://api.github.com/rate_limit' with - Cannot connect to host api.github.com:443 ssl:default [Try again]
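Both errors above end in "Try again", which is how some C libraries (musl, for example) word `EAI_AGAIN`, the resolver's temporary name-resolution failure. On Linux with glibc or musl that constant is -3, matching the `[Errno -3]` in the MQTT error. That would point at DNS lookups failing rather than at the broker or GitHub themselves. A minimal sketch for checking name resolution from inside a container follows; the helper names `check_dns` and `report` and the host list are assumptions for illustration, not part of Home Assistant.

```python
import socket


def check_dns(host, port):
    """Try to resolve one host name; return (ok, detail)."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        # err.errno carries the EAI_* code; EAI_AGAIN marks a temporary failure.
        kind = "temporary" if err.errno == socket.EAI_AGAIN else "permanent"
        return False, f"{kind} resolver failure: {err}"
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return True, ", ".join(addresses)


def report(targets):
    """Print one line per (host, port) pair."""
    for host, port in targets:
        ok, detail = check_dns(host, port)
        print(f"{host}: {'OK' if ok else 'FAILED'} ({detail})")
```

Calling `report([("core-mosquitto", 1883), ("api.github.com", 443)])` from the Home Assistant container would show whether the internal add-on name, the external name, or both fail to resolve.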

Mosquitto logs:
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
cont-init: info: running /etc/cont-init.d/mosquitto.sh
[14:29:54] INFO: Setting up user homeassistant
[14:29:54] INFO: SSL is not enabled
cont-init: info: /etc/cont-init.d/mosquitto.sh exited 0
cont-init: info: running /etc/cont-init.d/nginx.sh
cont-init: info: /etc/cont-init.d/nginx.sh exited 0
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun mosquitto (no readiness notification)
services-up: info: copying legacy longrun nginx (no readiness notification)
[14:29:55] INFO: Starting NGINX for authentication handling...
s6-rc: info: service legacy-services successfully started
[14:29:56] INFO: Starting mosquitto MQTT broker...
2024-03-14 14:29:56: Warning: Mosquitto should not be run as root/administrator.
2024-03-14 14:29:56: mosquitto version 2.0.18 starting
2024-03-14 14:29:56: Config loaded from /etc/mosquitto/mosquitto.conf.
2024-03-14 14:29:56: Loading plugin: /usr/share/mosquitto/go-auth.so
2024-03-14 14:29:56: ├── Username/password checking enabled.
2024-03-14 14:29:56: ├── TLS-PSK checking enabled.
2024-03-14 14:29:56: └── Extended authentication not enabled.
2024-03-14 14:29:56: Opening ipv4 listen socket on port 1883.
2024-03-14 14:29:56: Opening ipv6 listen socket on port 1883.
2024-03-14 14:29:56: Opening websockets listen socket on port 1884.
2024-03-14 14:29:56: mosquitto version 2.0.18 running
2024-03-14 14:29:56: New connection from 127.0.0.1:37386 on port 1883.
2024-03-14 14:29:56: Client disconnected due to protocol error.
[14:29:57] INFO: Successfully send discovery information to Home Assistant.
[14:29:57] INFO: Successfully send service information to the Supervisor.
2024-03-14 14:33:35: New connection from 172.30.32.2:35652 on port 1883.
2024-03-14 14:33:35: Client closed its connection.
2024-03-14 14:35:35: New connection from 172.30.32.2:50242 on port 1883.
2024-03-14 14:35:35: Client closed its connection.
2024-03-14 14:37:35: New connection from 172.30.32.2:40822 on port 1883.
2024-03-14 14:37:35: Client closed its connection.
2024-03-14 14:39:36: New connection from 172.30.32.2:57576 on port 1883.
2024-03-14 14:39:36: Client closed its connection.
2024-03-14 14:41:36: New connection from 172.30.32.2:52884 on port 1883.
2024-03-14 14:41:36: Client closed its connection.
2024-03-14 14:43:36: New connection from 172.30.32.2:52164 on port 1883.
2024-03-14 14:43:36: Client closed its connection.
2024-03-14 14:45:36: New connection from 172.30.32.2:50350 on port 1883.
2024-03-14 14:45:36: Client closed its connection.
2024-03-14 14:47:36: New connection from 172.30.32.2:34186 on port 1883.
2024-03-14 14:47:36: Client closed its connection.
2024-03-14 14:49:36: New connection from 172.30.32.2:44214 on port 1883.
2024-03-14 14:49:36: Client closed its connection.
2024-03-14 14:51:36: New connection from 172.30.32.2:35810 on port 1883.
2024-03-14 14:51:36: Client closed its connection.

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Steps to reproduce the issue

Update from versions stated above and see.

Anything in the Supervisor logs that might be useful for us?

This is from a copy of the HA image running in an isolated network without internet access:
24-03-14 14:31:00 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-2850' coro=<Addon.watchdog_container() done, defined at /usr/src/supervisor/supervisor/addons/addon.py:1387> exception=AddonsJobError('Rate limit exceeded, more than 10 calls in 0:30:00')>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/addons/addon.py", line 1401, in watchdog_container
    await self._restart_after_problem(event.state)
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 290, in wrapper
    raise on_condition(
supervisor.exceptions.AddonsJobError: Rate limit exceeded, more than 10 calls in 0:30:00
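The traceback shows the add-on watchdog's restart attempt being rejected once it had been called more than 10 times within 30 minutes. The sketch below illustrates that kind of sliding-window limit in general terms. It is not the Supervisor's implementation; the class names `SlidingWindowLimiter` and `RateLimitExceeded` are hypothetical stand-ins.

```python
import time
from collections import deque
from datetime import timedelta


class RateLimitExceeded(Exception):
    """Hypothetical stand-in for the AddonsJobError seen in the traceback."""


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of length period."""

    def __init__(self, max_calls, period, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._calls = deque()

    def check(self):
        """Record one call, or raise if the window is already full."""
        now = self._clock()
        window = self.period.total_seconds()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= window:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            raise RateLimitExceeded(
                f"Rate limit exceeded, more than {self.max_calls} calls in {self.period}"
            )
        self._calls.append(now)
```

With `SlidingWindowLimiter(10, timedelta(minutes=30))`, the eleventh `check()` inside half an hour raises. For a watchdog, this means an add-on that keeps crashing stops being restarted until the window clears.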

System Health information

There's no point in copy-pasting the system health info, as there is nothing interesting in it. There is also a lot of spam because I'm working on a separate VM without the specific HA hardware and without the hundreds of restapi-created sensors and more.

Supervisor diagnostics

No response

Additional information

No response

Skuair commented 8 months ago

Could be related? https://github.com/home-assistant/core/issues/113481

agners commented 8 months ago

@szymucha94 we don't have other similar reports that point to Supervisor. Are you sure this was a Supervisor problem?

Are you using the AdGuard add-on?

szymucha94 commented 8 months ago

Yes, it's a Supervisor issue. Updating to 2024.03 causes the issue, while staying on 2024.02.1 works perfectly fine. No, I don't use the AdGuard add-on.

agners commented 8 months ago

In general, 2024.03.0 has been on stable for quite a while, and we don't really have issue reports similar to this. I have it running on multiple systems without issues. Also, from the changelog of 2024.03.0 I can't really see what would influence the system in a way that causes such issues.

Can you share the Supervisor logs from both 2024.02.1 and 2024.03.0? That should give more insight into what differs between the two versions.
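One way to compare the two logs is sketched below, assuming both have been saved as text and that each line starts with a `YY-MM-DD HH:MM:SS` timestamp like the Supervisor excerpt earlier in this issue. The function names and the version labels used as diff headers are illustrative assumptions.

```python
import difflib
import re

# Leading timestamp such as "24-03-14 14:31:00 " (format assumed from the excerpt above).
TIMESTAMP = re.compile(r"^\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def normalize(lines):
    """Strip newlines and leading timestamps so only message content is compared."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def diff_logs(old_lines, new_lines, old_name="2024.02.1", new_name="2024.03.0"):
    """Return unified-diff lines between two logs, ignoring timestamps."""
    return list(
        difflib.unified_diff(
            normalize(old_lines),
            normalize(new_lines),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )
```

Feeding it the result of `open(path).readlines()` for each saved log leaves only the lines that appear in one version and not the other, which is usually easier to scan than two full logs side by side.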

github-actions[bot] commented 7 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

szymucha94 commented 7 months ago

I already gave up on future HA updates due to constant breaking of legacy. Because of this you can ignore this report.