home-assistant / operating-system

Home Assistant OS crashes in Raspberry Pi 4 #3636

Closed tyron closed 1 week ago

tyron commented 1 week ago

Describe the issue you are experiencing

Recently I started seeing many more crashes on my Home Assistant installation running on a Raspberry Pi 4. This wasn't an issue until recent months; I'd say it got worse around the 2024.08 release.

My hunch is that it happens more often when I'm interacting with the Visual Studio Code (Studio Code Server) add-on.

Additional details:

What operating system image do you use?

rpi4-64 (Raspberry Pi 4/400 64-bit OS)

What version of Home Assistant Operating System is installed?

13.2

Did the problem occur after upgrading the Operating System?

No

Hardware details

Raspberry Pi 4 with 2 GB RAM and a 256 GB SSD, plus the Increase Swap add-on (which adds 4 GB of swap on the SSD: https://github.com/TazzerMAN/increase_swap_addon)

Steps to reproduce the issue

  1. Use HA regularly
  2. Wait until it crashes. The timing is random, but a crash is very likely when I interact with the Visual Studio Code add-on.

Anything in the Supervisor logs that might be useful for us?

No, logs rotate after the crash/reboot

Anything in the Host logs that might be useful for us?

No, logs rotate after the crash/reboot

System information

System Information

version core-2024.10.2
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.4
os_name Linux
os_version 6.6.31-haos-raspi
arch aarch64
timezone America/Toronto
config_dir /config
Home Assistant Community Store
GitHub API | ok
GitHub Content | ok
GitHub Web | ok
HACS Data | ok
GitHub API Calls Remaining | 5000
Installed Version | 2.0.1
Stage | running
Available Repositories | 1444
Downloaded Repositories | 18

Home Assistant Cloud
logged_in | false
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor
host_os | Home Assistant OS 13.2
update_channel | stable
supervisor_version | supervisor-2024.10.2
agent_version | 1.6.0
docker_version | 27.2.0
disk_total | 219.4 GB
disk_used | 120.9 GB
healthy | true
supported | true
host_connectivity | true
supervisor_connectivity | true
ntp_synchronized | true
virtualization |
board | rpi4-64
supervisor_api | ok
version_api | ok
installed_addons | Home Assistant Google Drive Backup (0.112.1), Terminal & SSH (9.15.0), Studio Code Server (5.17.1), Duck DNS (1.18.0), NGINX Home Assistant SSL proxy (3.11.0), Node-RED (18.1.1), File editor (5.8.0), Z-Wave JS (0.8.0), Matter Server (6.6.0), Mosquitto broker (6.4.1), SQLite Web (4.2.2), eufy-security-ws (1.9.1), Scrypted (v0.118.0-jammy-full), Cloudflared (5.1.21), go2rtc (1.9.4), Increase Swap (1.1.3), Z-Wave.Me Add-on (v4.1.4), ESPHome (2024.9.2)

Dashboards
dashboards | 8
resources | 7
views | 19
mode | storage

Recorder
oldest_recorder_run | September 18, 2024 at 1:22 PM
current_recorder_run | October 17, 2024 at 8:49 PM
estimated_db_size | 642.70 MiB
database_engine | sqlite
database_version | 3.45.3

Spotify
api_endpoint_reachable | ok

Additional information

I left a journalctl -f session running while I interacted with my installation, and these were the last relevant logs captured before it hung (I had just asked the Matter add-on to restart, and had switched to the Visual Studio Code add-on to configure it):

Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service legacy-services: stopping
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service legacy-services successfully stopped
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service legacy-cont-init: stopping
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service matter-server: stopping
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.690 (MainThread) DEBUG [aiorun] Entering shutdown handler
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.691 (MainThread) WARNING [aiorun] Stopping the loop
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.692 (MainThread) INFO [aiorun] Entering shutdown phase.
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.693 (MainThread) INFO [aiorun] Executing provided shutdown_callback.
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.693 (MainThread) INFO [matter_server.server.server] Stopping the Matter Server...
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.696 (MainThread) INFO [matter_server.server.client_handler] [547633667408] Connection closed by client
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service legacy-cont-init successfully stopped
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service fix-attrs: stopping
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: s6-rc: info: service fix-attrs successfully stopped
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.711 (MainThread) DEBUG [matter_server.server.client_handler] [547633667408] Disconnected
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.728 (MainThread) DEBUG [matter_server.server.device_controller] Stopped.
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.728 (MainThread) INFO [matter_server.server.stack] Shutting down the Matter stack...
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.732 (MainThread) CHIP_ERROR [chip.native.CTL] Shutting down the stack...
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.743 (MainThread) CHIP_ERROR [chip.native.DIS] Failed to advertise records: src/inet/UDPEndPointImplSockets.cpp:416: OS Error 0x02000065: Network is unreachable
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.765 (MainThread) CHIP_ERROR [chip.native.DL] Inet Layer shutdown
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.765 (MainThread) CHIP_ERROR [chip.native.DL] BLE shutdown
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.771 (MainThread) CHIP_ERROR [chip.native.DL] System Layer shutdown
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.775 (MainThread) DEBUG [matter_server.server.server] Cleanup complete
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.775 (MainThread) INFO [aiorun] Waiting for executor shutdown.
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.777 (MainThread) INFO [aiorun] Shutting down async generators
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.778 (MainThread) INFO [aiorun] Closing the loop.
Oct 18 00:36:36 homeassistant addon_core_matter_server[592]: 2024-10-17 20:36:36.780 (MainThread) INFO [aiorun] Leaving. Bye!
Oct 18 00:36:37 homeassistant homeassistant[592]: 2024-10-17 20:36:37.273 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01
Oct 18 00:36:37 homeassistant homeassistant[592]: 2024-10-17 20:36:37.867 ERROR (MainThread) [homeassistant.components.google_assistant.http] Request for https://homegraph.googleapis.com/v1/devices:reportStateAndNotification failed: 404
Oct 18 00:36:38 homeassistant homeassistant[592]: 2024-10-17 20:36:38.276 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01
Oct 18 00:36:38 homeassistant addon_core_matter_server[592]: [00:36:38] INFO: matter-server service exited with code 0 (by signal 0).
Oct 18 00:36:38 homeassistant addon_core_matter_server[592]: s6-rc: info: service matter-server successfully stopped
Oct 18 00:36:38 homeassistant addon_core_matter_server[592]: s6-rc: info: service banner: stopping
Oct 18 00:36:38 homeassistant addon_core_matter_server[592]: s6-rc: info: service banner successfully stopped
Oct 18 00:36:38 homeassistant addon_core_matter_server[592]: s6-rc: info: service s6rc-oneshot-runner: stopping
Oct 18 00:36:38 homeassistant addon_core_matter_server[592]: s6-rc: info: service s6rc-oneshot-runner successfully stopped
Oct 18 00:36:38 homeassistant homeassistant[592]: 2024-10-17 20:36:38.393 WARNING (MainThread) [custom_components.moonraker] connection to moonraker down, restarting
Oct 18 00:36:38 homeassistant homeassistant[592]: 2024-10-17 20:36:38.421 ERROR (MainThread) [homeassistant.components.rest.data] Error fetching data: http://crealityv3ke.internal:7125/printer/objects/query?heater_bed&extruder&print_stats&toolhead&display_status&virtual_sdcard&gcode_move&webhooks&temperature_sensor mcu_temp&filament_switch_sensor filament_sensor&output_pin fan0&output_pin MainBoardFan failed with All connection attempts failed
Oct 18 00:36:38 homeassistant homeassistant[592]: 2024-10-17 20:36:38.422 WARNING (MainThread) [homeassistant.components.rest.util] Empty reply found when expecting JSON data
Oct 18 00:36:38 homeassistant homeassistant[592]: 2024-10-17 20:36:38.471 ERROR (MainThread) [homeassistant.components.rest.data] Error fetching data: http://crealityv3ke.internal:7125/server/files/metadata?filename=unavailable failed with All connection attempts failed
Oct 18 00:36:38 homeassistant homeassistant[592]: 2024-10-17 20:36:38.471 WARNING (MainThread) [homeassistant.components.rest.util] Empty reply found when expecting JSON data
Oct 18 00:36:40 homeassistant homeassistant[592]: 2024-10-17 20:36:40.277 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01
Oct 18 00:36:41 homeassistant udisksd[137]: Error statting /_swap.swap: No such file or directory
Oct 18 00:36:41 homeassistant homeassistant[592]: 2024-10-17 20:36:41.281 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01
Oct 18 00:36:41 homeassistant addon_core_configurator[592]: INFO:2024-10-17 20:36:41,281:hass_configurator.configurator:127.0.0.1 - "GET / HTTP/1.1" 200 -
Oct 18 00:36:41 homeassistant addon_a0d7b954_vscode[592]: File not found: /usr/local/lib/code-server/lib/vscode/out/vsda_bg.wasm
Oct 18 00:36:41 homeassistant addon_a0d7b954_vscode[592]: [20:36:41] [172.30.32.2][abcb3741][ManagementConnection] New connection established.
Oct 18 00:36:42 homeassistant addon_a0d7b954_vscode[592]: [20:36:42] [172.30.32.2][694a7696][ExtensionHostConnection] New connection established.
Oct 18 00:36:42 homeassistant homeassistant[592]: 2024-10-17 20:36:42.285 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01
Oct 18 00:36:42 homeassistant addon_a0d7b954_vscode[592]: [20:36:42] [172.30.32.2][694a7696][ExtensionHostConnection] <49901> Launched Extension Host Process.
Oct 18 00:36:42 homeassistant homeassistant[592]: 2024-10-17 20:36:42.421 ERROR (MainThread) [homeassistant.components.rest.data] Error fetching data: http://crealityv3ke.internal:7125/printer/objects/query?heater_bed&extruder&print_stats&toolhead&display_status&virtual_sdcard&gcode_move&webhooks&temperature_sensor mcu_temp&filament_switch_sensor filament_sensor&output_pin fan0&output_pin MainBoardFan failed with All connection attempts failed
Oct 18 00:36:42 homeassistant systemd[1]: docker-4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22.scope: Deactivated successfully.
Oct 18 00:36:42 homeassistant homeassistant[592]: 2024-10-17 20:36:42.422 WARNING (MainThread) [homeassistant.components.rest.util] Empty reply found when expecting JSON data
Oct 18 00:36:42 homeassistant kernel: audit: type=1334 audit(1729211802.473:1188): prog-id=276 op=UNLOAD
Oct 18 00:36:42 homeassistant audit: BPF prog-id=276 op=UNLOAD
Oct 18 00:36:42 homeassistant systemd[1]: docker-4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22.scope: Consumed 31.591s CPU time.
Oct 18 00:36:42 homeassistant audit: BPF prog-id=279 op=UNLOAD
Oct 18 00:36:42 homeassistant kernel: audit: type=1334 audit(1729211802.493:1189): prog-id=279 op=UNLOAD
Oct 18 00:36:42 homeassistant homeassistant[592]: 2024-10-17 20:36:42.549 ERROR (MainThread) [moonraker_api.websockets.websocketclient] Websocket connection error: Cannot connect to host crealityv3ke.internal:7125 ssl:default [Connect call failed ('192.168.20.121', 7125)]
Oct 18 00:36:42 homeassistant dockerd[592]: time="2024-10-18T00:36:42.612374879Z" level=info msg="ignoring event" container=4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 18 00:36:42 homeassistant containerd[549]: time="2024-10-18T00:36:42.637676235Z" level=info msg="shim disconnected" id=4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22 namespace=moby
Oct 18 00:36:42 homeassistant containerd[549]: time="2024-10-18T00:36:42.652806286Z" level=warning msg="cleaning up after shim disconnected" id=4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22 namespace=moby
Oct 18 00:36:42 homeassistant containerd[549]: time="2024-10-18T00:36:42.652947358Z" level=info msg="cleaning up dead shim" namespace=moby
Oct 18 00:36:44 homeassistant homeassistant[592]: 2024-10-17 20:36:44.286 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01
Oct 18 00:36:46 homeassistant dockerd[592]: time="2024-10-18T00:36:46.649498025Z" level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22

Oct 18 00:37:20 homeassistant kernel: brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
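
For anyone sifting through a capture like the one above, here is a minimal sketch (not part of the original report) that parses journal lines in this short format and reports the longest silence between two consecutive entries, which is often where a hang begins. The function names `parse_line` and `largest_gap` are hypothetical, and the year is an assumption passed in by the caller because the short journal timestamp omits it.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Oct 18 00:36:41 homeassistant udisksd[137]: Error statting ...
#   Oct 18 00:37:20 homeassistant kernel: brcmfmac: ...
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<unit>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<msg>.*)$"
)


def parse_line(line, year=2024):
    """Return (timestamp, unit, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # The year is not in the log line, so it is supplied by the caller.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["unit"], m["msg"]


def largest_gap(lines, year=2024):
    """Return (seconds, entry_before, entry_after) for the longest silence."""
    entries = [e for e in (parse_line(l, year) for l in lines) if e]
    best = None
    for prev, cur in zip(entries, entries[1:]):
        gap = (cur[0] - prev[0]).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, prev, cur)
    return best


# The last three entries of the capture above.
TAIL = [
    'Oct 18 00:36:44 homeassistant homeassistant[592]: 2024-10-17 20:36:44.286 WARNING (MainThread) [homeassistant.components.sensor] Updating rest sensor took longer than the scheduled update interval 0:00:01',
    'Oct 18 00:36:46 homeassistant dockerd[592]: time="2024-10-18T00:36:46.649498025Z" level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=4423f9809e45d89bd4cc1a35296fe507a3646f7c386da63720dd6514aadbfe22',
    'Oct 18 00:37:20 homeassistant kernel: brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout',
]

gap_seconds, before, after = largest_gap(TAIL)
print(f"{gap_seconds:.0f}s of silence between {before[1]} and {after[1]}")
```

On these three lines the longest silence falls between the dockerd entry and the final kernel entry, which matches the point where the session stopped responding.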

ZorroBoyMan commented 1 week ago

I can confirm the problems with the Studio Code Server add-on. Its RAM usage went above 15% (on a 4 GB RPi 4). After stopping the add-on, everything runs smoothly again. I'm now using the File editor add-on.
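
To put that figure next to the 2 GB board from the original report, here is a rough back-of-the-envelope sketch. It assumes the 15% refers to a share of 4 GB of total RAM and that the add-on would need the same absolute amount on a smaller board; both are assumptions, not measurements, and the helper names are hypothetical.

```python
def share_to_mib(total_mib, percent):
    """Convert a percentage of total RAM into MiB."""
    return total_mib * percent / 100


def mib_to_share(total_mib, used_mib):
    """Convert an absolute amount in MiB into a percentage of total RAM."""
    return used_mib / total_mib * 100


# 15% of a 4 GB board (assumed to be 4096 MiB).
used_mib = share_to_mib(4096, 15)

# The same absolute usage expressed as a share of a 2 GB board (2048 MiB).
share_on_2gb = mib_to_share(2048, used_mib)

print(f"about {used_mib:.0f} MiB, or about {share_on_2gb:.0f}% of a 2 GB board")
```

Under those assumptions the same footprint takes roughly twice the share of memory on a 2 GB board, before counting the other installed add-ons.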

tyron commented 1 week ago

I found this bug report in the VSCode add-on, and I'm in the process of migrating all my automations to the new syntax (based on this comment). I'll update here if that solves my issue.

tyron commented 1 week ago

Initial testing suggests this was indeed the problem. The syntax check is very resource-hungry, and the new syntaxes introduced in 2024.08 and expanded in 2024.10 seem to have caused the crashes on the RPi 4. I'll file a bug report in the home-assistant/core repo to raise attention and to ask whether there are plans for an automated migration of these things.
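
Until something like that exists, a small script can at least list the lines that may need attention. The sketch below is a heuristic, not an official migration tool. It assumes, based on my reading of the release notes, that 2024.8 renamed `service:` to `action:` in action steps and that 2024.10 renamed `platform:` to `trigger:` inside trigger definitions. It only matches keys line by line, so `platform:` entries elsewhere in a configuration (for example under a sensor integration) will show up as false positives and need manual review. The name `find_legacy_keys` and the sample automation are hypothetical.

```python
import re

# Assumed mapping of legacy keys to their newer names (see the caveats above).
LEGACY_KEYS = {
    "service": "action",    # action steps, assumed renamed in 2024.8
    "platform": "trigger",  # trigger definitions, assumed renamed in 2024.10
}

# A YAML mapping key at the start of a line, optionally preceded by "- ".
KEY_RE = re.compile(r"^\s*(?:-\s+)?(?P<key>[A-Za-z_]+):")


def find_legacy_keys(yaml_text):
    """Return (line_number, old_key, suggested_key) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        m = KEY_RE.match(line)
        if m and m["key"] in LEGACY_KEYS:
            hits.append((lineno, m["key"], LEGACY_KEYS[m["key"]]))
    return hits


# Hypothetical automation written in the older style.
SAMPLE = """\
- alias: Hypothetical porch light
  trigger:
    - platform: state
      entity_id: binary_sensor.porch_motion
      to: "on"
  action:
    - service: light.turn_on
      target:
        entity_id: light.porch
"""

for lineno, old, new in find_legacy_keys(SAMPLE):
    print(f"line {lineno}: '{old}:' may need to become '{new}:'")
```

The script only reports candidates and never rewrites files, so each hit can be checked by hand before anything is changed.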