home-assistant / supervisor

:house_with_garden: Home Assistant Supervisor
https://home-assistant.io/hassio/
Apache License 2.0

Supervisor doesn't start Core in certain cases #5050

Closed kuba2k2 closed 1 month ago

kuba2k2 commented 5 months ago

Describe the issue you are experiencing

Disclaimer: I am running an unsupported installation - HA Supervised on Alpine Linux - however, I'm reporting the bug because of its nature. It also applies to supported installations in some cases.

Today my Supervisor container restarted (I don't know why, but hopefully that is irrelevant). It started back up, then proceeded to remove the HA Core container and image, and downloaded the image again. However, Supervisor didn't start Core after doing that. It just assumed everything was okay and continued its usual setup, reporting the system as Healthy while Core wasn't running at all.

Here are some relevant log lines from the Supervisor:

2024-04-30 17:35:14.305 INFO (MainThread) [__main__] Initializing Supervisor setup
[...]
2024-04-30 19:35:18.069 INFO (MainThread) [__main__] Setting up Supervisor
[...]
2024-04-30 19:35:22.891 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/qemuarm-homeassistant with version 2024.4.4
2024-04-30 19:35:22.971 INFO (SyncWorker_1) [supervisor.docker.manager] Stopping homeassistant application
2024-04-30 19:35:48.899 INFO (SyncWorker_1) [supervisor.docker.manager] Cleaning homeassistant application
2024-04-30 19:35:52.037 INFO (SyncWorker_0) [supervisor.docker.manager] Removing image ghcr.io/home-assistant/qemuarm-homeassistant with latest
2024-04-30 19:35:52.053 INFO (SyncWorker_0) [supervisor.docker.manager] Removing image ghcr.io/home-assistant/qemuarm-homeassistant with 2024.4.4
2024-04-30 19:37:03.334 INFO (MainThread) [supervisor.docker.interface] Downloading docker image ghcr.io/home-assistant/qemuarm-homeassistant with tag 2024.4.4.
2024-04-30 19:49:24.952 INFO (MainThread) [supervisor.os.manager] No Home Assistant Operating System found
[...]
2024-04-30 19:49:32.030 INFO (MainThread) [__main__] Running Supervisor
2024-04-30 19:49:32.119 WARNING (MainThread) [supervisor.core] System running in a unsupported environment!
[...]
2024-04-30 19:49:32.128 INFO (MainThread) [supervisor.core] Supervisor reboot detected
2024-04-30 19:49:32.133 INFO (MainThread) [supervisor.misc.tasks] All core tasks are scheduled
2024-04-30 19:49:32.139 INFO (MainThread) [supervisor.core] Supervisor is up and running

After around 2 hours of digging in the Supervisor source code I found two issues causing this behavior:

  1. The Core image was stopped and removed in supervisor.docker.interface.DockerInterface.check_image(). It is worth noting that my /etc/hassio.json looks as follows:
    {
      "supervisor": "ghcr.io/home-assistant/armv7-hassio-supervisor",
      "machine": "qemuarm",
      "data": "/usr/share/hassio"
    }

    As you can see, the architecture of Supervisor (armv7) doesn't match the architecture of Core (armhf). I've used it like that from the start and it wasn't an issue - check_image() was recently introduced in #4991. This causes Supervisor to download a new Core image every time it starts. I am aware that this configuration is caused by the unsupported OS, but since armv7 code is compatible with armhf it causes no runtime issues.

  2. Supervisor blindly assumes that Core is running when it detects it has been restarted (without the OS being restarted). As you can see, that's not the case when a new Core image is downloaded for whatever reason (like the arch mismatch).
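To illustrate the first point, here is a rough sketch of how a strict architecture check could lead to the stop/remove/download sequence seen in the log on every start. The function names are hypothetical and are not taken from the Supervisor source; only the armv7 and armhf values come from this report.

```python
# Hypothetical sketch, not the actual Supervisor implementation.


def image_needs_replacement(expected_arch: str, image_arch: str) -> bool:
    """Return True when a strict architecture check would discard the image."""
    return expected_arch != image_arch


def startup_actions(expected_arch: str, image_arch: str) -> list[str]:
    """List the steps a strict check would trigger on each Supervisor start."""
    if not image_needs_replacement(expected_arch, image_arch):
        return ["attach"]
    # The freshly pulled image has the same arch as before, so the
    # mismatch (and the download) repeats on the next start.
    return ["attach", "stop", "remove", "download"]


# Supervisor is armv7, while the qemuarm Core image is armhf.
print(startup_actions("armv7", "armhf"))
print(startup_actions("armv7", "armv7"))
```

Note that the mismatch path contains no start step, which matches the order of the log lines above.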

I agree that the likelihood of that issue appearing in any of the supported setups is small, however this still looks like a bug worth reporting.

What type of installation are you running?

Home Assistant Supervised

Which operating system are you running on?

Other (e.g., Raspbian/Raspberry Pi OS/Fedora)

Steps to reproduce the issue

  1. Start HAOS
  2. Stop the Core container
  3. Reboot the Supervisor container
  4. Observe Supervisor not starting Core at all

Anything in the Supervisor logs that might be useful for us?

```txt
[17:35:03] INFO: Starting local supervisor watchdog...
2024-04-30 17:35:14.305 INFO (MainThread) [__main__] Initializing Supervisor setup
2024-04-30 19:35:14.656 INFO (MainThread) [supervisor.bootstrap] Setting up coresys for machine: qemuarm
2024-04-30 19:35:14.679 INFO (MainThread) [supervisor.docker.supervisor] Attaching to Supervisor ghcr.io/home-assistant/armv7-hassio-supervisor with version 2024.04.4
2024-04-30 19:35:14.680 INFO (MainThread) [supervisor.docker.supervisor] Connecting Supervisor to hassio-network
2024-04-30 19:35:15.264 INFO (SyncWorker_0) [supervisor.docker.manager] Cleanup images: ['ghcr.io/home-assistant/armv7-hassio-supervisor:2024.04.0']
2024-04-30 19:35:18.044 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state initialize
2024-04-30 19:35:18.055 WARNING (MainThread) [supervisor.resolution.evaluations.docker_configuration] Docker logging driver json-file is not supported!
2024-04-30 19:35:18.056 WARNING (MainThread) [supervisor.resolution.evaluations.base] The configuration of Docker is not supported (more-info: https://www.home-assistant.io/more-info/unsupported/docker_configuration)
2024-04-30 19:35:18.056 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-30 19:35:18.069 INFO (MainThread) [__main__] Setting up Supervisor
2024-04-30 19:35:18.551 INFO (MainThread) [supervisor.api] Starting API on 172.30.32.2
2024-04-30 19:35:19.043 INFO (MainThread) [supervisor.hardware.monitor] Started Supervisor hardware monitor
2024-04-30 19:35:19.063 INFO (MainThread) [supervisor.dbus.manager] Connected to system D-Bus.
2024-04-30 19:35:19.064 INFO (MainThread) [supervisor.dbus.agent] Load dbus interface io.hass.os
2024-04-30 19:35:19.070 INFO (MainThread) [supervisor.dbus.hostname] Load dbus interface org.freedesktop.hostname1
2024-04-30 19:35:19.073 INFO (MainThread) [supervisor.dbus.logind] Load dbus interface org.freedesktop.login1
2024-04-30 19:35:19.074 INFO (MainThread) [supervisor.dbus.network] Load dbus interface org.freedesktop.NetworkManager
2024-04-30 19:35:19.076 INFO (MainThread) [supervisor.dbus.rauc] Load dbus interface de.pengutronix.rauc
2024-04-30 19:35:19.077 INFO (MainThread) [supervisor.dbus.resolved] Load dbus interface org.freedesktop.resolve1
2024-04-30 19:35:19.079 INFO (MainThread) [supervisor.dbus.systemd] Load dbus interface org.freedesktop.systemd1
2024-04-30 19:35:19.081 INFO (MainThread) [supervisor.dbus.timedate] Load dbus interface org.freedesktop.timedate1
2024-04-30 19:35:19.113 WARNING (MainThread) [supervisor.dbus.rauc] Host has no rauc support. OTA updates have been disabled.
2024-04-30 19:35:19.114 WARNING (MainThread) [supervisor.dbus.resolved] Host has no systemd-resolved support. DNS will not work correctly.
2024-04-30 19:35:19.361 WARNING (MainThread) [supervisor.dbus.timedate] Can't connect to systemd-timedate
2024-04-30 19:35:19.991 INFO (MainThread) [supervisor.host.services] Updating service information
2024-04-30 19:35:20.361 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
2024-04-30 19:35:21.220 INFO (MainThread) [supervisor.host.network] Updating local network information
2024-04-30 19:35:21.726 INFO (MainThread) [supervisor.host.apparmor] Loading AppArmor Profiles: {'hassio-supervisor'}
2024-04-30 19:35:22.998 INFO (MainThread) [supervisor.docker.monitor] Started docker events monitor
2024-04-30 19:35:22.000 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
2024-04-30 19:35:22.176 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/armv7-hassio-cli with version 2024.04.1
2024-04-30 19:35:22.395 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/armv7-hassio-dns with version 2024.04.0
2024-04-30 19:35:22.506 INFO (MainThread) [supervisor.plugins.dns] Updated /etc/resolv.conf
2024-04-30 19:35:22.583 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/armv7-hassio-audio with version 2023.12.0
2024-04-30 19:35:22.669 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/armv7-hassio-observer with version 2023.06.0
2024-04-30 19:35:22.763 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/armv7-hassio-multicast with version 2024.03.0
2024-04-30 19:35:22.853 INFO (MainThread) [supervisor.homeassistant.secrets] Loaded 1 Home Assistant secrets
2024-04-30 19:35:22.891 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/qemuarm-homeassistant with version 2024.4.4
2024-04-30 19:35:22.971 INFO (SyncWorker_1) [supervisor.docker.manager] Stopping homeassistant application
2024-04-30 19:35:48.899 INFO (SyncWorker_1) [supervisor.docker.manager] Cleaning homeassistant application
2024-04-30 19:35:52.037 INFO (SyncWorker_0) [supervisor.docker.manager] Removing image ghcr.io/home-assistant/qemuarm-homeassistant with latest
2024-04-30 19:35:52.053 INFO (SyncWorker_0) [supervisor.docker.manager] Removing image ghcr.io/home-assistant/qemuarm-homeassistant with 2024.4.4
2024-04-30 19:37:03.334 INFO (MainThread) [supervisor.docker.interface] Downloading docker image ghcr.io/home-assistant/qemuarm-homeassistant with tag 2024.4.4.
2024-04-30 19:49:24.952 INFO (MainThread) [supervisor.os.manager] No Home Assistant Operating System found
2024-04-30 19:49:27.830 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/a0d7b954 repository
2024-04-30 19:49:27.833 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/243ffc37 repository
2024-04-30 19:49:27.850 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/core repository
2024-04-30 19:49:27.866 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/5c53de3b repository
2024-04-30 19:49:30.720 INFO (MainThread) [supervisor.store] Loading add-ons from store: 96 all - 96 new - 0 remove
2024-04-30 19:49:31.079 INFO (MainThread) [supervisor.addons.manager] Found 6 installed add-ons
2024-04-30 19:49:31.263 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/armhf-addon-ssh with version 9.9.0
2024-04-30 19:49:31.380 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/poeschl/ha-syncthing-armhf with version 1.18.2
2024-04-30 19:49:31.385 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/armhf-addon-samba with version 12.3.0
2024-04-30 19:49:31.396 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/hassio-addons/adguard/armv7 with version 5.0.3
2024-04-30 19:49:31.424 INFO (MainThread) [supervisor.docker.interface] Attaching to ghcr.io/esphome/esphome-hassio with version 2024.4.1
2024-04-30 19:49:31.429 INFO (MainThread) [supervisor.docker.interface] Attaching to homeassistant/armhf-addon-mosquitto with version 6.4.0
2024-04-30 19:49:31.928 INFO (MainThread) [supervisor.backups.manager] Found 4 backup files
2024-04-30 19:49:31.988 INFO (MainThread) [supervisor.discovery] Loaded 3 messages
2024-04-30 19:49:32.989 INFO (MainThread) [supervisor.ingress] Loaded 0 ingress sessions
2024-04-30 19:49:32.990 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state setup
2024-04-30 19:49:32.999 INFO (MainThread) [supervisor.resolution.checks.base] Run check for disabled_data_disk/system
2024-04-30 19:49:32.000 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system
2024-04-30 19:49:32.000 INFO (MainThread) [supervisor.resolution.check] System checks complete
2024-04-30 19:49:32.001 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state setup
2024-04-30 19:49:32.002 WARNING (MainThread) [supervisor.resolution.evaluations.base] Detected unsupported OS: Alpine Linux v3.19 (more-info: https://www.home-assistant.io/more-info/unsupported/os)
2024-04-30 19:49:32.003 WARNING (MainThread) [supervisor.resolution.evaluations.base] Systemd-Resolved is required for DNS in Home Assistant. (more-info: https://www.home-assistant.io/more-info/unsupported/systemd_resolved)
2024-04-30 19:49:32.018 WARNING (MainThread) [supervisor.resolution.evaluations.base] Docker cgroup version 2 is not supported! {'1'} (more-info: https://www.home-assistant.io/more-info/unsupported/cgroup_version)
2024-04-30 19:49:32.019 WARNING (MainThread) [supervisor.resolution.evaluations.base] Systemd journal is not working correctly or inaccessible (more-info: https://www.home-assistant.io/more-info/unsupported/systemd_journal)
2024-04-30 19:49:32.019 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-30 19:49:32.020 INFO (MainThread) [supervisor.jobs] 'ResolutionFixup.run_autofix' blocked from execution, system is not running - setup
2024-04-30 19:49:32.021 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state setup
2024-04-30 19:49:32.022 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-30 19:49:32.030 INFO (MainThread) [__main__] Running Supervisor
2024-04-30 19:49:32.119 WARNING (MainThread) [supervisor.core] System running in a unsupported environment!
2024-04-30 19:49:32.120 INFO (MainThread) [supervisor.jobs] 'OSManager.mark_healthy' blocked from execution, no Home Assistant OS available
2024-04-30 19:49:32.122 INFO (MainThread) [supervisor.addons.manager] Phase 'initialize' starting 0 add-ons
2024-04-30 19:49:32.128 INFO (MainThread) [supervisor.core] Supervisor reboot detected
2024-04-30 19:49:32.133 INFO (MainThread) [supervisor.misc.tasks] All core tasks are scheduled
2024-04-30 19:49:32.139 INFO (MainThread) [supervisor.core] Supervisor is up and running
2024-04-30 19:49:32.140 INFO (MainThread) [supervisor.host.info] Updating local host information
2024-04-30 19:49:32.160 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state running
2024-04-30 19:49:32.168 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system
2024-04-30 19:49:32.169 INFO (MainThread) [supervisor.resolution.checks.base] Run check for disabled_data_disk/system
2024-04-30 19:49:32.170 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_ipv6_error/dns_server
2024-04-30 19:49:32.283 INFO (MainThread) [supervisor.resolution.checks.base] Run check for pwned/addon
2024-04-30 19:49:32.610 WARNING (MainThread) [supervisor.host.info] Can't update host system information!
2024-04-30 19:49:32.612 INFO (MainThread) [supervisor.jobs] 'OSManager.reload' blocked from execution, no Home Assistant OS available
2024-04-30 19:49:32.613 INFO (MainThread) [supervisor.host.services] Updating service information
2024-04-30 19:49:32.943 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
2024-04-30 19:49:33.115 INFO (MainThread) [supervisor.host.network] Updating local network information
2024-04-30 19:49:33.168 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system
2024-04-30 19:49:33.169 INFO (MainThread) [supervisor.resolution.checks.base] Run check for docker_config/system
2024-04-30 19:49:33.170 INFO (MainThread) [supervisor.resolution.checks.base] Run check for ipv4_connection_problem/system
2024-04-30 19:49:33.170 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_failed/dns_server
2024-04-30 19:49:33.261 INFO (MainThread) [supervisor.resolution.checks.base] Run check for security/core
2024-04-30 19:49:33.272 INFO (MainThread) [supervisor.resolution.checks.base] Run check for no_current_backup/system
2024-04-30 19:49:33.273 INFO (MainThread) [supervisor.resolution.module] Create new suggestion create_full_backup - system / None
2024-04-30 19:49:33.273 INFO (MainThread) [supervisor.resolution.module] Create new issue no_current_backup - system / None
2024-04-30 19:49:33.274 INFO (MainThread) [supervisor.resolution.checks.base] Run check for trust/supervisor
2024-04-30 19:49:33.427 INFO (MainThread) [supervisor.resolution.check] System checks complete
2024-04-30 19:49:33.428 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running
2024-04-30 19:49:34.421 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
2024-04-30 19:49:34.768 INFO (MainThread) [supervisor.host.manager] Host information reload completed
2024-04-30 19:49:36.031 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-30 19:49:36.033 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running
2024-04-30 19:49:36.034 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
2024-04-30 20:05:41.912 INFO (SyncWorker_1) [supervisor.docker.manager] Runing command 'python3 -m homeassistant -c /config --script check_config' on ghcr.io/home-assistant/qemuarm-homeassistant
2024-04-30 20:06:21.594 ERROR (MainThread) [supervisor.host.logs] No systemd-journal-gatewayd Unix socket available
2024-04-30 20:18:02.746 INFO (MainThread) [supervisor.homeassistant.core] Home Assistant config is valid
2024-04-30 20:49:32.022 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state running
2024-04-30 20:49:32.028 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system
2024-04-30 20:49:32.033 INFO (MainThread) [supervisor.resolution.checks.base] Run check for disabled_data_disk/system
2024-04-30 20:49:32.034 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_ipv6_error/dns_server
2024-04-30 20:49:32.246 INFO (MainThread) [supervisor.resolution.checks.base] Run check for pwned/addon
2024-04-30 20:49:32.248 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system
2024-04-30 20:49:32.248 INFO (MainThread) [supervisor.resolution.checks.base] Run check for docker_config/system
2024-04-30 20:49:32.250 INFO (MainThread) [supervisor.resolution.checks.base] Run check for ipv4_connection_problem/system
2024-04-30 20:49:32.250 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_failed/dns_server
2024-04-30 20:49:32.252 INFO (MainThread) [supervisor.resolution.checks.base] Run check for security/core
2024-04-30 20:49:32.258 INFO (MainThread) [supervisor.resolution.checks.base] Run check for trust/supervisor
2024-04-30 20:49:32.425 INFO (MainThread) [supervisor.resolution.check] System checks complete
2024-04-30 20:49:32.426 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running
2024-04-30 20:49:35.811 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-30 20:49:35.813 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running
2024-04-30 20:49:35.814 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
2024-04-30 21:49:32.233 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
2024-04-30 21:49:35.831 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state running
2024-04-30 21:49:35.832 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system
2024-04-30 21:49:35.835 INFO (MainThread) [supervisor.resolution.checks.base] Run check for disabled_data_disk/system
2024-04-30 21:49:35.835 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_ipv6_error/dns_server
2024-04-30 21:49:35.845 INFO (MainThread) [supervisor.resolution.checks.base] Run check for pwned/addon
2024-04-30 21:49:35.847 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system
2024-04-30 21:49:35.855 INFO (MainThread) [supervisor.resolution.checks.base] Run check for docker_config/system
2024-04-30 21:49:35.856 INFO (MainThread) [supervisor.resolution.checks.base] Run check for ipv4_connection_problem/system
2024-04-30 21:49:35.857 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_failed/dns_server
2024-04-30 21:49:35.859 INFO (MainThread) [supervisor.resolution.checks.base] Run check for security/core
2024-04-30 21:49:35.865 INFO (MainThread) [supervisor.resolution.checks.base] Run check for trust/supervisor
2024-04-30 21:49:35.914 INFO (MainThread) [supervisor.resolution.check] System checks complete
2024-04-30 21:49:35.914 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running
2024-04-30 21:49:36.676 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
2024-04-30 21:49:36.678 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running
2024-04-30 21:49:36.686 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
```

System Health information

nothing

Supervisor diagnostics

No response

Additional information

No response

agners commented 5 months ago

Thanks for the report and the extensive investigation!

As you can see, the architecture of Supervisor (armv7) doesn't match the architecture of Core (armhf). I've used it like that from the start and it wasn't an issue - check_image() was recently introduced in https://github.com/home-assistant/supervisor/pull/4991. This causes Supervisor to download a new Core image every time it starts. I am aware that this configuration is caused by the unsupported OS, but since armv7 code is compatible with armhf it causes no runtime issues.

Hm, yeah, so this change essentially assumes Supervisor arch == Core arch.. :thinking: . For add-ons we explicitly want to allow the use of a compatible arch, but for Core there is actually not much reason to use a different arch. You lose out on potential performance improvements of the native arch over the compatible arch. I guess in your case the main reason you picked an armhf arch was because there is no generic armv7 machine. Maybe we should introduce one :thinking:

As a workaround, you can pick any of the machines that are based on armv7 instead (see https://github.com/home-assistant/builder/blob/2024.03.5/builder.sh#L37-L59).

IMHO, we could actually solidify the Supervisor arch == Core arch requirement; I don't see a reason not to use the same arch. But Supervisor should inform the user more gracefully if there is a mismatch.

  1. Supervisor blindly assumes that Core is running when it detects it has been restarted (without the OS being restarted). As you can see, that's not the case when a new Core image is downloaded for whatever reason (like the arch mismatch).

Without judging whether that behavior is "correct" or can be improved, I think the reason for it is so that ha core stop works across Supervisor restarts. E.g. a user who explicitly wants Core to be stopped (which is probably mostly a development use case) doesn't get surprised by "automatic" restarts.

kuba2k2 commented 5 months ago

I found out that there are "generic" images for all architectures - they're called armv7-homeassistant, armhf-homeassistant, etc. I switched to the armv7 one and it works just fine - however, it now says that all my add-ons are incompatible and won't let me install any new ones. Additionally, the update JSON that Supervisor checks doesn't have the armv7 machine, which effectively prevents getting HA updates.

I'm not sure what the armv7 machine image is, it doesn't seem to be documented anywhere. Would it be possible to add the missing version numbers for the generic machine images? EDIT: Actually, a better solution would be to add generic machine types for all architectures.

I guess I could just use armhf for both Supervisor and Core. Is there a reason why supervised-installer chose armv7 instead of armhf? My CPU is an Allwinner H3, and I'm pretty sure it supports armhf. EDIT: I understand now, armhf uses compatibility layers for older CPUs, while armv7 is better suited for armv7 CPUs. I'm worried that using a Raspberry Pi machine might introduce some incompatibility, e.g. Core/Supervisor expecting some features only present on an actual Pi.


I agree that the ha core stop behavior might be expected. However, I assumed that the role of Supervisor is to... well, supervise the Core, i.e. start it back up when it exits unexpectedly. I think that if Core is not running for any reason other than ha core stop, it should be started again by Supervisor. Right now it isn't, even if the container exits or crashes due to an unexpected event.
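A minimal sketch of the behavior I have in mind, assuming Supervisor could persist a flag whenever the user runs ha core stop. The `CoreState` class and `should_start_core` function are hypothetical and do not exist in Supervisor:

```python
# Hypothetical sketch; names are illustrative, not Supervisor APIs.
from dataclasses import dataclass


@dataclass
class CoreState:
    container_running: bool
    stopped_by_user: bool  # assumed to be persisted when `ha core stop` runs


def should_start_core(state: CoreState) -> bool:
    """Start Core unless it is already running or the user stopped it."""
    if state.container_running:
        return False
    return not state.stopped_by_user


# Core is missing after an image re-download: start it.
print(should_start_core(CoreState(container_running=False, stopped_by_user=False)))
# Core was stopped with `ha core stop`: leave it stopped.
print(should_start_core(CoreState(container_running=False, stopped_by_user=True)))
```

This would keep ha core stop working across Supervisor restarts while still recovering from crashes or a re-downloaded image.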

kuba2k2 commented 5 months ago

however, it now says that all my add-ons are incompatible and won't let me install any new ones

That seems to be another bug in the Supervisor code - supervisor/arch.py:

        # Evaluate current CPU/Platform
        if not self.sys_machine or self.sys_machine not in arch_data:
            _LOGGER.warning("Can't detect the machine type!")
            self._default_arch = native_support
            self._supported_arch.append(self.default)
            return

        # Use configs from arch.json
        self._supported_arch.extend(arch_data[self.sys_machine])
        self._default_arch = self.supported[0]

        # Make sure native support is in supported list
        if native_support not in self._supported_arch:
            self._supported_arch.append(native_support)

        self._supported_set = set(self._supported_arch)

Notice how it doesn't populate self._supported_set if the machine type is not present in data/arch.json. This makes def is_supported() always return False, even for the native_support arch.
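The effect can be reproduced with a simplified stand-in for the logic quoted above. This is not the real supervisor/arch.py: the `ArchSketch` class, its `populate_set_on_fallback` switch, the overlap check in `is_supported()` and the sample `arch_data` mapping are assumptions made for illustration.

```python
# Simplified stand-in for the quoted logic; not the real supervisor/arch.py.


class ArchSketch:
    def __init__(self, populate_set_on_fallback: bool) -> None:
        self._supported_arch: list[str] = []
        self._supported_set: set[str] = set()
        self._populate_set_on_fallback = populate_set_on_fallback

    def load(self, machine: str, native: str, arch_data: dict[str, list[str]]) -> None:
        if not machine or machine not in arch_data:
            # Fallback path for unknown machine types.
            self._supported_arch.append(native)
            if self._populate_set_on_fallback:
                # Possible fix: keep the set in sync before returning.
                self._supported_set = set(self._supported_arch)
            return
        self._supported_arch.extend(arch_data[machine])
        if native not in self._supported_arch:
            self._supported_arch.append(native)
        self._supported_set = set(self._supported_arch)

    def is_supported(self, arch_list: list[str]) -> bool:
        # Assumed check: any overlap with the supported set.
        return not self._supported_set.isdisjoint(arch_list)


# Illustrative mapping; "armv7" is deliberately missing as a machine key.
arch_data = {"raspberrypi2": ["armv7", "armhf"]}

buggy = ArchSketch(populate_set_on_fallback=False)
buggy.load("armv7", "armv7", arch_data)
print(buggy.is_supported(["armv7"]))

fixed = ArchSketch(populate_set_on_fallback=True)
fixed.load("armv7", "armv7", arch_data)
print(fixed.is_supported(["armv7"]))
```

With the set left empty on the fallback path, every add-on looks incompatible, which would explain what I'm seeing.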

gmshiwoge commented 5 months ago

Hello, my friend. My CPU has an aarch64 architecture, and since May, I've been experiencing the same issue as you. Every time I reboot and download a new Core image, or enter recovery mode, it tells me to wait for 20 minutes.

github-actions[bot] commented 4 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

sunshine-hass commented 4 months ago

I have also been running into the same problem since May; my CPU is aarch64.

sunshine-hass commented 4 months ago

Does the same problem still occur for you now? How did you solve it?

kuba2k2 commented 4 months ago

I have changed my architecture choice to armv7 and machine to raspberrypi2 (even though it's not really a Raspberry Pi), since there's no generic machine type.

As for the Core not being started, it hasn't happened to me since then. I didn't change anything else in the setup.

zerychao commented 3 months ago

I'm having similar problems (at least partially) after updating HA from 2023.11.0 to 2024.6.4 and Supervisor from 2023.10.1 to 2024.06.2 on my unsupported installation (HA Supervised on Raspberry Pi OS). Every time the Supervisor restarts, all other Docker images are "cleaned" (according to the logs, same as in the original post) and re-pulled from ghcr.io. If pulling the full ghcr.io/home-assistant/raspberrypi4-homeassistant:2024.6.4 image fails due to poor network connectivity, a ghcr.io/home-assistant/raspberrypi4-homeassistant:landing_page image is pulled instead and the whole setup process gets stuck there, with the Supervisor logs showing in the browser, which is very annoying and also wastes network bandwidth. The images that get re-pulled every time include:

This is the only place I can find where people are having similar problems. According to the link provided above, my machine (raspberrypi4) matches the arch (armv7). If anyone knows how I can solve, work around, or at least debug this, please let me know.

github-actions[bot] commented 2 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.