Closed kuba2k2 closed 1 month ago
Thanks for the report and the extensive investigation!
As you can see, the architecture of Supervisor (armv7) doesn't match the architecture of Core (armhf). I've used it like that from the start and it wasn't an issue - check_image() was recently introduced in https://github.com/home-assistant/supervisor/pull/4991. This causes Supervisor to download a new Core image every time it starts. I am aware that this configuration is caused by the unsupported OS, but since armv7 code is compatible with armhf it causes no runtime issues.
Hm, yeah, so this change essentially assumes Supervisor arch == Core arch... :thinking: For add-ons we explicitly want to allow the use of a compatible arch, but for Core there is actually not much reason to use a different arch. You lose out on potential performance improvements of the native arch over the compatible arch. I guess in your case the main reason you picked an armhf arch was because there is no generic armv7 machine. Maybe we should introduce one :thinking:
As a workaround you can pick any of the machines that are based on armv7 instead (see https://github.com/home-assistant/builder/blob/2024.03.5/builder.sh#L37-L59).
IMHO, we could actually solidify the Supervisor arch == Core arch requirement; I don't see a reason not to use the same arch. But Supervisor should inform the user more gracefully if there is a mismatch.
- Supervisor blindly assumes that Core is running when it detects it has been restarted (without the OS being restarted). As you can see, that's not the case when a new Core image is downloaded for whatever reason (like the arch mismatch).
Without judging whether that behavior is "correct" or can be improved, I think the reason for it is so that ha core stop works across Supervisor restarts. E.g. a user who explicitly wants Core to be stopped (which is probably mostly a development use case) doesn't get surprised by "automatic" restarts.
I found out that there are "generic" images for all architectures - they're called armv7-homeassistant, armhf-homeassistant, etc. I switched to the armv7 one and it works just fine - however, it now says that all my add-ons are incompatible and won't let me install any new ones. Additionally, the update JSON that Supervisor checks doesn't have the armv7 machine, which effectively prevents getting HA updates.
I'm not sure what the armv7 machine image is, it doesn't seem to be documented anywhere. Would it be possible to add the missing version numbers for the generic machine images? EDIT: Actually, a better solution would be to add generic machine types for all architectures.
I guess I could just use armhf for both Supervisor and Core. Is there a reason why supervised-installer chose armv7 instead of armhf? My CPU is an Allwinner H3, pretty sure it supports armhf. EDIT: I understand now, armhf uses compatibility layers for older CPUs, armv7 is better suited for armv7 CPUs.
I'm afraid that using a Raspberry Pi machine would introduce some incompatibility, e.g. Core/Supervisor expecting some features only present on an actual Pi.
I agree that the ha core stop behavior might be expected. However, I assumed that the role of Supervisor is to... well, supervise the Core, so start it up when it exits unexpectedly. I think that if Core is not running for any reason other than ha core stop, it should be started back up by Supervisor. Right now it won't, even if the container exits/crashes due to any unexpected events.
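The restart policy proposed above can be sketched as a small decision function. This is only an illustration of the idea, not Supervisor code: `StopReason` and `should_start_core` are hypothetical names, and it assumes Supervisor could persist whether the last stop was requested by the user.

```python
from enum import Enum


class StopReason(Enum):
    """Why Core was last seen stopped (hypothetical, for illustration only)."""

    USER_REQUESTED = "user_requested"  # e.g. an explicit `ha core stop`
    UNKNOWN = "unknown"  # crash, image re-pull, anything else


def should_start_core(is_running: bool, last_stop: StopReason) -> bool:
    """Return True if a supervising process should start Core again."""
    if is_running:
        # Nothing to do, Core is already up.
        return False
    # Respect an explicit user stop, recover from everything else.
    return last_stop is not StopReason.USER_REQUESTED
```

With this policy, a Core container that disappeared because its image was re-pulled (`StopReason.UNKNOWN`) would be started again, while a deliberate stop would survive a Supervisor restart.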
however, it now says that all my add-ons are incompatible and won't let me install any new ones
That seems to be another bug in the Supervisor code - supervisor/arch.py:
    # Evaluate current CPU/Platform
    if not self.sys_machine or self.sys_machine not in arch_data:
        _LOGGER.warning("Can't detect the machine type!")
        self._default_arch = native_support
        self._supported_arch.append(self.default)
        return

    # Use configs from arch.json
    self._supported_arch.extend(arch_data[self.sys_machine])
    self._default_arch = self.supported[0]

    # Make sure native support is in supported list
    if native_support not in self._supported_arch:
        self._supported_arch.append(native_support)

    self._supported_set = set(self._supported_arch)
Notice how it doesn't populate self._supported_set if the machine type is not present in data/arch.json.
This makes is_supported() always return False, even for the native_support arch.
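To make the effect easier to see, here is a minimal, self-contained sketch that mirrors the early-return path described above. `ArchSketch` is a hypothetical stand-in, not the real Supervisor class, and it assumes that `is_supported()` checks the requested arch list against `_supported_set`. The `fixed` flag shows one possible fix: filling the set before returning early.

```python
class ArchSketch:
    """Simplified, hypothetical stand-in for the arch loading logic."""

    def __init__(self, machine, native_support, arch_data):
        self.machine = machine
        self.native_support = native_support
        self.arch_data = arch_data
        self._default_arch = None
        self._supported_arch = []
        self._supported_set = set()

    def load(self, fixed=False):
        # Unknown machine type: fall back to the native arch.
        if not self.machine or self.machine not in self.arch_data:
            self._default_arch = self.native_support
            self._supported_arch.append(self._default_arch)
            if fixed:
                # Possible fix: also populate the set on this path.
                self._supported_set = set(self._supported_arch)
            return

        # Known machine type: use the configured arch list.
        self._supported_arch.extend(self.arch_data[self.machine])
        self._default_arch = self._supported_arch[0]
        if self.native_support not in self._supported_arch:
            self._supported_arch.append(self.native_support)
        self._supported_set = set(self._supported_arch)

    def is_supported(self, arch_list):
        # Assumed check: any overlap with the supported set counts.
        return not self._supported_set.isdisjoint(arch_list)


# "armv7" is used as a machine name that is missing from the arch data.
arch_data = {"raspberrypi2": ["armv7", "armhf"]}

buggy = ArchSketch("armv7", "armv7", arch_data)
buggy.load()
print(buggy.is_supported(["armv7"]))  # False: the set was never filled

patched = ArchSketch("armv7", "armv7", arch_data)
patched.load(fixed=True)
print(patched.is_supported(["armv7"]))  # True
```

In the buggy path the supported list contains the native arch, but the set used for the lookup stays empty, which would explain why every add-on is reported as incompatible.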
Hello, my friend. My CPU has an aarch64 architecture, and since May, I've been experiencing the same issue as you. Every time I reboot and download a new Core image, or enter recovery mode, it tells me to wait for 20 minutes.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
I have also been encountering the same problem since May; my CPU is aarch64.
Does the same problem still occur for you now? How did you solve it?
I have changed my architecture choice to armv7 and machine to raspberrypi2 (even though it's not really a Raspberry Pi), since there's no generic machine type.
As for the Core reboot, it hasn't happened to me since then. I didn't change anything else in the setup.
I'm having similar problems (at least partially) after updating HA from 2023.11.0 to 2024.6.4 and Supervisor from 2023.10.1 to 2024.06.2 on my unsupported installation (HA Supervised on Raspberry Pi OS). Every time the Supervisor restarts, all other Docker images are "cleaned" (according to the logs, same as the original post) and re-pulled from ghcr.io. If pulling the full ghcr.io/home-assistant/raspberrypi4-homeassistant:2024.6.4 image fails due to poor network connectivity, a ghcr.io/home-assistant/raspberrypi4-homeassistant:landing_page image will be pulled instead and the whole setup process just gets stuck there with the Supervisor logs showing in the browser, which is super annoying and also wastes network resources. The images that get re-pulled every time include:
ghcr.io/home-assistant/armv7-hassio-cli:2024.05.0
ghcr.io/home-assistant/armv7-hassio-dns:2024.04.0
ghcr.io/home-assistant/armv7-hassio-audio:2023.12.0
ghcr.io/home-assistant/armv7-hassio-observer:2023.06.0
ghcr.io/home-assistant/armv7-hassio-multicast:2022.02.0
ghcr.io/home-assistant/raspberrypi4-homeassistant:2024.6.4
homeassistant/armv7-addon-configurator:5.3.3
This is the only place I can find with people having similar problems. According to the link provided above, my machine (raspberrypi4) does not mismatch with the arch (armv7). If anyone knows how I can solve, work around, or at least debug this, please let me know.
Describe the issue you are experiencing
Disclaimer: I am running an unsupported installation - HA Supervised on Alpine Linux - however, I'm reporting the bug because of its nature. It also applies to supported installations in some cases.
Today my Supervisor container restarted (for no reason, I don't know why, but it's hopefully irrelevant). It started back up, then proceeded to remove the HA Core container & image, followed by downloading it again. However, Supervisor didn't start Core after doing that. It just assumed everything was okay and continued its usual setup, showing that the system is Healthy, while the Core wasn't even running at all.
Here are some relevant log lines from the Supervisor:
After around 2 hours of digging in the Supervisor source code I found two issues causing this behavior:
supervisor.docker.interface.DockerInterface.check_image(). It is worth noting that my /etc/hassio.json looks as follows:
As you can see, the architecture of Supervisor (armv7) doesn't match the architecture of Core (armhf). I've used it like that from the start and it wasn't an issue - check_image() was recently introduced in #4991. This causes Supervisor to download a new Core image every time it starts. I am aware that this configuration is caused by the unsupported OS, but since armv7 code is compatible with armhf it causes no runtime issues.
I agree that the likelihood of that issue appearing in any of the supported setups is small, however this still looks like a bug worth reporting.
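The kind of arch comparison described above can be illustrated with a short sketch. It is not the actual check_image() implementation: `ARCH_TO_PLATFORM`, `image_platform` and `needs_repull` are hypothetical names, the arch-to-platform mapping is an assumption, and the image attributes are assumed to follow the `Os` / `Architecture` / `Variant` fields of Docker's image inspect output.

```python
from typing import Optional

# Assumed mapping from Home Assistant arch names to Docker platform strings.
ARCH_TO_PLATFORM = {
    "armhf": "linux/arm/v6",
    "armv7": "linux/arm/v7",
    "aarch64": "linux/arm64",
    "i386": "linux/386",
    "amd64": "linux/amd64",
}


def image_platform(attrs: dict) -> str:
    """Build a platform string from image inspect attributes."""
    platform = f"{attrs['Os']}/{attrs['Architecture']}"
    if attrs.get("Variant"):
        platform = f"{platform}/{attrs['Variant']}"
    return platform


def needs_repull(image_attrs: Optional[dict], expected_arch: str) -> bool:
    """Return True if the local image is missing or has another platform."""
    if image_attrs is None:
        return True
    return image_platform(image_attrs) != ARCH_TO_PLATFORM[expected_arch]


# Assumed attributes of an armhf Core image, checked against an armv7 Supervisor.
armhf_core = {"Os": "linux", "Architecture": "arm", "Variant": "v6"}
print(needs_repull(armhf_core, "armv7"))  # True
print(needs_repull(armhf_core, "armhf"))  # False
```

Under these assumptions, a configuration that keeps pointing at the armhf Core image never satisfies a check against armv7, so the image would be removed and pulled again on every Supervisor start.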
What type of installation are you running?
Home Assistant Supervised
Which operating system are you running on?
Other (e.g., Raspbian/Raspberry Pi OS/Fedora)
Steps to reproduce the issue
Anything in the Supervisor logs that might be useful for us?
System Health information
nothing
Supervisor diagnostics
No response
Additional information
No response