home-assistant / operating-system

:beginner: Home Assistant Operating System
Apache License 2.0

Update to 12.3 prevents boot on Fujitsu Esprimo Q920 #3348

Closed asciinaut closed 4 months ago

asciinaut commented 6 months ago

Describe the issue you are experiencing

Updated Home Assistant OS from 12.2 to 12.3.

After the reboot, the system doesn't boot properly.

Sometimes it boots into rescue mode automatically if no option is chosen manually after reboot. Sometimes it shows:

Booting "Slot B (rescue shell)"

Trying to terminate EFI services again
error: couldn't retrieve memory map.

Failed to boot both default and fallback entries.

When choosing the slots manually, none of them (Slot A, Slot B, or the respective rescue slots) boots.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

inaccessible

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

  1. Start the announced update from 12.2 to 12.3 in the web interface
  2. Wait for the reboot

Anything in the Supervisor logs that might be useful for us?

inaccessible

Anything in the Host logs that might be useful for us?

inaccessible

System information

inaccessible

Additional information

No response

asciinaut commented 6 months ago

probably related to #3347 ?

agners commented 6 months ago

probably related to #3347 ?

Unlikely, as that issue is on Home Assistant Yellow.

This is probably related to the GRUB revert in #3324.

What x86-64 machine are you running on?

Sometimes it boots into rescue mode automatically

What rescue mode exactly? Is ha os info working in that shell?

Probably your best way forward here is to replace the GRUB bootloader on the first partition of your boot disk (e.g. using an Ubuntu Live USB flash drive). You can find older versions of the GRUB bootloader capable of booting HAOS in this comment https://github.com/home-assistant/operating-system/issues/3305#issuecomment-2055918375.

asciinaut commented 6 months ago

Thank you very much for your answer.

What x86-64 machine are you running on?

Fujitsu Esprimo Q920 - Intel Core i5 4590T

What rescue mode exactly? Is ha os info working in that shell?

This does not work in the rescue shell, but I can exit the rescue shell, which makes the supervisor start. In that case it is HAOS 12.2, so I assume it starts from Slot B rescue. From that point on I have access to the system and can provide additional information.

➜  ~ ha os info
board: generic-x86-64
boot: B
boot_slots:
  A:
    state: inactive
    status: bad
    version: "12.3"
  B:
    state: booted
    status: good
    version: "12.2"
data_disk: MicroFrom-256GB-SATA3-SSD-07042223E0108
update_available: true
version: "12.2"
version_latest: "12.3"

Booting into this rescue mode succeeds in about 1 out of 10 boot attempts. Otherwise I only see the error mentioned above.
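The slot states in the `ha os info` output above can also be pulled out mechanically, for example to compare several boot attempts. This is a minimal sketch, assuming the output keeps the indentation shown above (two spaces before the slot letter, four before `status` and `version`). The function name `slot_summary` is made up for this illustration and is not part of the `ha` CLI.

```shell
# Sketch only: summarise boot slots from `ha os info` text read on stdin.
# Prints one line per slot: "<slot> <status> <version>".
# Assumes `status:` appears before `version:` within each slot, as above.
slot_summary() {
  awk '
    # A slot header looks like "  A:" (two leading spaces, one letter).
    /^  [A-Z]:$/ { slot = substr($1, 1, length($1) - 1); next }
    # Remember the status reported for the current slot.
    /^    status:/ { status[slot] = $2 }
    # The version line closes the slot entry; strip the quotes and print.
    /^    version:/ { gsub(/"/, "", $2); printf "%s %s %s\n", slot, status[slot], $2 }
  '
}

# Usage on the HAOS console or in an SSH add-on:
#   ha os info | slot_summary
```

A slot reported as `bad` is the one the bootloader gave up on; in the output above that is slot A with 12.3.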

asciinaut commented 6 months ago

Adding system information of 12.2 started from Slot B rescue:

System Information

version core-2024.5.2
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.6.25-haos
arch x86_64
timezone Europe/Berlin
config_dir /config
Home Assistant Community Store
GitHub API | ok
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.34.0
Stage | running
Available Repositories | 1467
Downloaded Repositories | 14
HACS Data | ok

Home Assistant Cloud
logged_in | false
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor
host_os | Home Assistant OS 12.2
update_channel | stable
supervisor_version | supervisor-2024.05.1
agent_version | 1.6.0
docker_version | 25.0.5
disk_total | 234.0 GB
disk_used | 13.5 GB
healthy | true
supported | true
board | generic-x86-64
supervisor_api | ok
version_api | ok
installed_addons | Z-Wave JS (0.5.0), File editor (5.8.0), Z-Wave JS UI (3.6.0), Network UPS Tools (0.13.0), Advanced SSH & Web Terminal (17.2.0), Piper (1.5.0), Whisper (2.0.0), openWakeWord (1.10.0), RaspberryMatic CCU (3.75.7.20240420), Assist Microphone (1.2.0), Mosquitto broker (6.4.0)

Dashboards
dashboards | 8
resources | 3
views | 18
mode | storage

Recorder
oldest_recorder_run | May 1, 2024 at 06:00
current_recorder_run | May 8, 2024 at 20:45
estimated_db_size | 278.83 MiB
database_engine | sqlite
database_version | 3.44.2

agners commented 6 months ago

Hm, weird, so the new GRUB is able to boot HAOS 12.2, but not 12.3? :thinking: I wonder if boot slot A (or the kernel partition thereof) is somehow corrupted. When in boot slot B, can you just try to install HAOS 12.3 again? It should try to install it to the boot slot A again:

ha os update --version 12.3

asciinaut commented 6 months ago

➜  ~ ha os update --version 12.3
Processing... Done.

Command completed successfully.

After the reboot the problem persists.

I will transfer an image of the SSD to another identical Esprimo and downgrade the production one to 12.2.

If I can reproduce the problem on the other esprimo, I will try older grub images as suggested in https://github.com/home-assistant/operating-system/issues/3305#issuecomment-2055918375 starting with the 32-bit EFI files and then 64-bit ones to check if this is the same behaviour.

sairon commented 6 months ago

@asciinaut Can you also check if you're using the latest BIOS (as it sometimes resolves some weird UEFI boot issues) and try booting a fresh 12.3 install, e.g. from a USB thumb drive?

asciinaut commented 6 months ago

@sairon Booting from a fresh 12.3 USB thumb drive has the same issues on two identical Esprimos.

However both have a BIOS update pending. Will update one to see if the issue persists.

asciinaut commented 6 months ago

A BIOS update did not fix the problem. I'm waiting for the images to finish and then continue with the older GRUB images.

For reference the Q920 BIOS versions tested:

kimzeuner commented 6 months ago

Just wanted to report that I had exactly the same behaviour with my Q920. BIOS updates didn't work for me either. The only solution was to replace the GRUB files via Ubuntu. Unfortunately I have already replaced both and cannot report whether changing only one of them would help.

Botschafter commented 6 months ago

Just wanted to report that I had exactly the same behaviour with my Q920. BIOS updates didn't work for me either. The only solution was to replace the GRUB files via Ubuntu. Unfortunately I have already replaced both and cannot report whether changing only one of them would help.

Can you please describe how to replace the GRUB files via Ubuntu? I have had the same problems with my Q920 since updating HAOS.

kimzeuner commented 6 months ago

Sure.

  1. Download the latest Ubuntu version and flash it onto a USB stick with balenaEtcher.
  2. Put the stick into your Esprimo and start it. Press F12 to open the boot menu and select the USB stick as the boot device. After some time Ubuntu will start.
  3. Open the Firefox browser and download the "old" GRUB files from here.
  4. Open a terminal by right-clicking on the desktop and choosing "Open in Terminal".
  5. Create a new folder with sudo mkdir /mount01.
  6. Mount the /EFI/BOOT folder to the folder you created with sudo mount /dev/sda1 /mount01 (/dev/sda1 may be different on your system, but I think it will be the same since you also have an Esprimo).
  7. Copy the downloaded files with the sudo cp command. In my case it was something like sudo cp /home/ubuntu/downloads/grub/xxx.efi /mount01. Repeat that for both files (you will not see a message that it was successful).

That way is also described here

Shut down Ubuntu and restart the Esprimo. It will take a few minutes until your HA is available again.
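The copy step described above can be sketched as a small shell function that also keeps a backup of whatever it overwrites. This is an illustration under stated assumptions, not the procedure from the linked comment: the function name `replace_efi_files` is hypothetical, the download location and the destination directory in the usage comment are guesses, and the actual `.efi` file names are whatever the linked comment provides.

```shell
# Sketch only: copy every .efi file from a source directory into a
# destination directory, keeping a .bak copy of each file it replaces.
replace_efi_files() {
  src_dir=$1
  dest_dir=$2
  for f in "$src_dir"/*.efi; do
    [ -e "$f" ] || continue            # no .efi files found: nothing to do
    name=$(basename "$f")
    # Back up the file currently in the destination, if there is one.
    if [ -e "$dest_dir/$name" ]; then
      cp "$dest_dir/$name" "$dest_dir/$name.bak"
    fi
    cp "$f" "$dest_dir/$name"
  done
}

# Possible use from an Ubuntu live session, run as root. All paths are
# assumptions; check where the existing .efi files live after mounting.
#   mkdir -p /mount01
#   mount /dev/sda1 /mount01
#   replace_efi_files /home/ubuntu/Downloads/grub /mount01/EFI/BOOT
#   umount /mount01
```

Keeping the `.bak` copies makes it possible to put the newer GRUB back later without downloading anything.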

Botschafter commented 6 months ago

....

Shut down Ubuntu and restart the Esprimo. It will take a few minutes until your HA is available again.

Thank you very much. Got control back again.

asciinaut commented 6 months ago

@agners I have tested the 12.1 GRUB files and can confirm that the 64-bit GRUB image fixed the problem. The 32-bit version has the same problem as before.

For that reason I think it is indeed related to https://github.com/home-assistant/operating-system/issues/3305#issuecomment-2055918375.

What puzzles me is that, unlike what is described in the other issue, the GRUB loader from version 12.2 started without any problems. So it doesn't seem to be exactly the same problem. @Botschafter, @kimzeuner can you confirm that the GRUB from version 12.2 also booted smoothly on your Q920?

kimzeuner commented 6 months ago

Yes, I can confirm that. I'm currently running 12.2 without any problems.

xmancz commented 6 months ago

I tried it, but it doesn't work.

I don't know if I'm using the right /dev/sda...

How do I know which one is the right one? (Screenshot: 20240512_180720)

kimzeuner commented 6 months ago

I'm not an expert, but from what I have read in the other issue it should always be the first partition, with 32M, so /dev/sda1 should be the right one for you. I think on my system it looked similar to yours in the screenshot.
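Instead of going by size alone, the partition can be identified by its filesystem label. This is a minimal sketch, assuming the HAOS boot partition is labelled `hassos-boot` (verify this on your own disk) and that `lsblk` is run in its key="value" pairs mode with exactly the columns shown. The function name `find_boot_partition` is made up for this example.

```shell
# Sketch only: read `lsblk -P -o NAME,SIZE,FSTYPE,LABEL` output on stdin
# and print the name of each partition labelled "hassos-boot".
find_boot_partition() {
  grep ' LABEL="hassos-boot"' | sed -n 's/^NAME="\([^"]*\)".*/\1/p'
}

# Usage in the Ubuntu live session:
#   lsblk -P -o NAME,SIZE,FSTYPE,LABEL | find_boot_partition
# A result of sda1 would correspond to /dev/sda1.
```

If nothing is printed, the label assumption does not hold for that disk, and the plain `lsblk` listing has to be read by hand.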

clmnsllwrmn commented 5 months ago

Maybe relevant - maybe not: On my Fujitsu q920 i5 I now also had the described problem and it seems to have something to do with the connected USB devices:

coc commented 5 months ago

I have the same problem on a ThinkCentre m93 p; I already wrote about it here: https://github.com/home-assistant/operating-system/issues/3376. If I install 12.3, then go back to 12.2, and then install 12.3 again from the HA UI, HA works. If I then restore my backup and restart the thin client, it also works. But if I restart a second time, I cannot boot HAOS from slot A or B. I also installed Proxmox 8.2 with HA as a VM, and I could not restart Proxmox twice either.

Now I hope I have a solution for me, but not for HAOS itself: I found that Proxmox can be rebooted multiple times if I use legacy BIOS mode, but not in UEFI mode. And now I can also restart the Home Assistant VM multiple times without problems. We will see what happens when the next OS update is out and the boot slots are swapped!
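To confirm which firmware mode a running Linux system (for example the Proxmox host) actually booted in, one can check for the directory the kernel exposes only on UEFI boots. This is a minimal sketch; the function name `boot_mode` and its optional argument are introduced here purely for illustration.

```shell
# Sketch only: report whether the running Linux system was booted via UEFI.
# The kernel creates /sys/firmware/efi only on UEFI boots. The optional
# argument exists only so the check can be pointed at another directory.
boot_mode() {
  efi_dir=${1:-/sys/firmware/efi}
  if [ -d "$efi_dir" ]; then
    echo "uefi"
  else
    echo "legacy"
  fi
}

# Usage:
#   boot_mode
```

Inside a VM this reflects the firmware of the virtual machine, not that of the physical host.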

RenEdi commented 5 months ago

I had the same problem. What helped me was running "ha os update --version 12.1" and "ha core update --version 2024.6.0" in the terminal.

Fujitsu Esprimo Q920 works again, as before

MarvinMynx commented 5 months ago

Having the same GRUB boot freeze error again with the current update...

mag-sruehl commented 5 months ago

Maybe relevant - maybe not: On my Fujitsu q920 i5 I now also had the described problem and it seems to have something to do with the connected USB devices:

...

* no connected USB device: boot
  Regards

I can confirm this observation. Removing all USB devices made it boot for me, which was very useful for reverting the upgrade.

MarvinMynx commented 5 months ago

How is this still not fixed? How could two updates in a row from the GUI break the system so completely that it is no longer bootable?! I don't get it. How could you guys break things with an update, we tell you, and the next update breaks the exact same thing?! How?!

clmnsllwrmn commented 5 months ago

How is this still not fixed? How could two updates in a row from the GUI break the system so completely that it is no longer bootable?! I don't get it. How could you guys break things with an update, we tell you, and the next update breaks the exact same thing?! How?!

As far as I remember, the q920 already required some intervention during the initial installation, as the BIOS is probably a bit "special". Is this a mistake on the part of the HAOS developers? As of today and with today's knowledge, I would no longer go for the q920 and not blame the developers of HAOS. And: if I already have a problem with a version, then I'll have a look at the release notes before the next update: "Fujitsu Esprimo Q920 fails to boot with GRUB bootloader distributed in this OS release, updating to 12.4.rc1 may cause a boot failure and manual intervention might be needed, see #3348 for details." What did you expect?

MarvinMynx commented 5 months ago

How is this still not fixed? How could two updates in a row from the GUI break the system so completely that it is no longer bootable?! I don't get it. How could you guys break things with an update, we tell you, and the next update breaks the exact same thing?! How?!

As far as I remember, the q920 already required some intervention during the initial installation, as the BIOS is probably a bit "special". Is this a mistake on the part of the HAOS developers? As of today and with today's knowledge, I would no longer go for the q920 and not blame the developers of HAOS. And: if I already have a problem with a version, then I'll have a look at the release notes before the next update: "Fujitsu Esprimo Q920 fails to boot with GRUB bootloader distributed in this OS release, updating to 12.4.rc1 may cause a boot failure and manual intervention might be needed, see #3348 for details." What did you expect?

There was no intervention needed during the initial installation. Never. Ever. And: what manual intervention might be needed exactly, and when? With every update from now on?! Because the manual intervention was this, AFTER it broke:

Sure. I downloaded the latest Ubuntu version and flashed it onto a USB stick with balenaEtcher. Put the stick into your Esprimo and start it. Press F12 to open the boot menu and select the USB stick as the boot device. After some time Ubuntu will start. Open the Firefox browser and download the "old" GRUB files from here. Open a terminal by right-clicking on the desktop and choosing "Open in Terminal". Create a new folder with sudo mkdir /mount01. Then mount the /EFI/BOOT folder to the folder you created with sudo mount /dev/sda1 /mount01 (/dev/sda1 may be different on your system, but I think it will be the same since you also have an Esprimo). Copy the downloaded files with the sudo cp command. In my case it was something like sudo cp /home/ubuntu/downloads/grub/xxx.efi /mount01. Repeat that for both files (you will not see a message that it was successful).

That way is also described here

Shut down Ubuntu and restart the Esprimo. It will take a few minutes until your HA is available again.

How is it possible that even this is not clear by now?!

What did you expect?

I expect it not to break when updating from the GUI. How about that?! As was the case for years before...

Is this a mistake on the part of the HAOS developers?

Yes, it is. And it was with the last update.

I hope that was clear enough now... In general you guys do great stuff, but this here is shit...

clmnsllwrmn commented 5 months ago

Well, a lot of users of the q920 (me included) ran into trouble during installation / first boot of HAOS, as described here: https://github.com/home-assistant/operating-system/issues/1760 (similar: https://community.simon42.com/t/haos-x86-bootet-nicht-von-fujitsu-esprimo-q920/534 and, I assume, in principle the same problem here: https://github.com/AlmaLinux/almalinux-deploy/issues/31 )

I don't understand the details and the source of the problem but it is obvious that the BIOS of the q920 is somehow more problematic than the BIOSes of lots of other x86-64 computer models.

Yes, it would be nice to get a fix. But the q920 in my opinion is not the ideal hardware to run HAOS on it - it causes too much trouble.

And once again: If your q920 does not boot, try to pull out all USB devices. Maybe you're in luck and it boots again. Makes troubleshooting a lot easier.

MarvinMynx commented 5 months ago

I bought the hardware refurbished in 2021 from Amazon Marketplace and never had any issues or needed any manual intervention. I never made any BIOS updates or settings changes or opened the case... I don't know what you are talking about, but I installed HAOS on this hardware without any issues, it ran for years, and I did every update, with rebooting and everything. And out of nowhere the hardware is to blame, when updating from the GUI was never a problem?! Because an update breaks GRUB?!

What is wrong with you?!

Botschafter commented 5 months ago

That's why I'm skipping these OS updates until the problem has been resolved. Can anyone tell me what new features have been implemented in OS 12.4? For now, I'm sticking with the working version 12.3 and, when I have a lot of time, I'll do some sandbox testing with my mirrored system hard drive before updating. I don't think it's worth wasting time getting the system up and running again after updates that prevent it from booting. I'm sure there will be a solution sooner or later. I will not replace the well-functioning and inexpensive Q920 hardware just because of this.

mag-sruehl commented 5 months ago

What is wrong with you?!

Frankly, arguing on such a personal level will not help us Esprimo users get the issue fixed. We did not pay anything for really great software. We agreed that the software is provided "as is" when we started using it. On the other hand, developers invest their free time into this project. The least they can expect from users is a friendly tone in discussions. I would get myself a new hobby if people talked to me like this.

saschachina commented 5 months ago

What is wrong with you?!

Frankly, arguing on such a personal level will not help us Esprimo users get the issue fixed. We did not pay anything for really great software. We agreed that the software is provided "as is" when we started using it. On the other hand, developers invest their free time into this project. The least they can expect from users is a friendly tone in discussions. I would get myself a new hobby if people talked to me like this.

Are we now done criticising someone who has rightly criticised? He explained it clearly: why do they release an update and get some users into massive trouble? We are not toys or guinea pigs. We are grateful to have this platform and play our part in making it better; we all test, try, and optimise, and share our knowledge in the community. But an update that completely brings something down like this is unacceptable and I condemn it in the strongest possible terms. Find a solution and do it immediately!

Matze2208 commented 4 months ago

The problem still seems to persist :-1:, even with version 12.7. I hope something happens soon; it would be really sad if this hardware were no longer supported, especially since it runs so reliably.

xmancz commented 4 months ago

However, OS version 12.4 is the one that is just out, not 12.7.

countrr commented 4 months ago

I am reluctant to update because I am not sure if this issue applies to my situation: I am running Home Assistant inside Oracle VM VirtualBox, which runs on a Fujitsu Esprimo Q920 Intel Core i5-4590T with Windows 10 Pro 22H2. Inside Oracle VM VirtualBox it says: Name: VM4HA Operating System: Linux 2.6 / 3.x / 4.x / 5.x (64 bit)

Should I wait until this issue has been fixed, or is it irrelevant for my situation? Your help is appreciated, thanks!

sairon commented 4 months ago

@countrr No, this bug only affects the generic-x86-64 OS version. In your case the hardware is fully virtualized by the VirtualBox hypervisor, so it mostly doesn't matter what machine it is running on. In this case it should be safe to upgrade.

countrr commented 4 months ago

@sairon Thanks for clarifying!

Flori123456789 commented 3 months ago

Is the problem solved in OS 13.0?

TK2020git commented 3 months ago

Is the problem solved in OS 13.0?

For me the update to 13.0 worked.

saschachina commented 3 months ago

Is the problem solved in OS 13.0?

I updated and there seems to be no problem.

xmancz commented 3 months ago

I updated my Fujitsu Q920 and had no problems.

Botschafter commented 3 months ago

Working on my Q920 too.

sairon commented 3 months ago

@Flori123456789 Yes, solved since this revert which is included in 13.0:

So it's not just a coincidence :) Thanks for the feedback, rest of you!