home-assistant / core

Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Raspberry Pi Reboot error - V #129714

Open Dri878 opened 3 weeks ago

Dri878 commented 3 weeks ago

The problem

When rebooting Home Assistant on a Raspberry Pi 5 with an NVMe HAT, the device fails to detect the NVMe drive on startup. This only occurs on reboot; when the device is powered off and then started again by pressing the power button, it starts up as normal. [Screenshot from 2024-11-03 07-58-04]

For details on troubleshooting, please see the Home Assistant community thread below: https://community.home-assistant.io/t/reboot-error-nvme-not-detected/781485

What version of Home Assistant Core has the issue?

core-2024.10.4

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

No response

Link to integration documentation on our website

No response

Diagnostics information

home-assistant_2024-11-03T08-17-41.453Z.log

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

user86000 commented 3 weeks ago

Same problem: on a Pi 5 with an SSD HAT, reboot sometimes fails. Normally the fan runs for 2 seconds on boot or reboot; if the reboot fails, the fan stays on. I see the same problem on 2 different installations (original SSD HAT and Pimoroni SSD HAT). With an SD card, reboot works with the same image. 47W power supply.

Janverhu commented 3 weeks ago

Same issue for me on the pi5, tried with 2 different nvme boards:

The issue only occurs after a reboot, if I disconnect/reconnect power it boots up fine.

incrediblehorst commented 3 weeks ago

Same problem for me with a clean install of HA via Raspberry Pi Imager, using an SD card only. Sometimes it works, sometimes it ends up in the boot loop.

dietmar68 commented 3 weeks ago

Same problem here

incrediblehorst commented 3 weeks ago

One thing to add: the issue only occurs if I restart the system via the web GUI. If I reboot it via the "login" shell (port 22222), the Raspberry Pi boots up normally.

jackchoui commented 2 weeks ago

Same issue with rpi 5 and Argon one v3 nvme case

incrediblehorst commented 2 weeks ago

I've done 50x ssh root@IP -p 22222 "reboot" and 50x ssh root@IP -p 22222 "ha host reboot": every reboot was successful and the system booted up normally. Only the system reboot function in the web GUI causes the boot problem, roughly every third time.
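For anyone who wants to repeat this kind of test, the loop can be sketched in Python as below. This is only a rough sketch under assumptions: `stress_test`, `wait_until_up`, `ssh_argv`, the settle delay, and the timeouts are hypothetical names and values chosen for illustration, and it assumes key-based SSH access to the host shell on port 22222 and that a no-op `true` command is available there.

```python
import subprocess
import time


def ssh_argv(host, command):
    # Port 22222 is the host shell used in the commands quoted above.
    return ["ssh", "-p", "22222", "-o", "ConnectTimeout=5", f"root@{host}", command]


def wait_until_up(host, timeout_s, interval_s, run, sleep):
    """Poll with a no-op SSH command until the host answers or time runs out."""
    waited = 0
    while waited < timeout_s:
        if run(ssh_argv(host, "true")).returncode == 0:
            return True
        sleep(interval_s)
        waited += interval_s
    return False


def stress_test(host, command="ha host reboot", attempts=50, settle_s=30,
                timeout_s=300, interval_s=5, run=subprocess.run, sleep=time.sleep):
    """Reboot repeatedly; return the 1-based attempts where the host stayed down."""
    failures = []
    for attempt in range(1, attempts + 1):
        # ssh often exits non-zero here because the connection drops mid-reboot,
        # so the return code of the reboot command itself is ignored.
        run(ssh_argv(host, command))
        # Give the host time to actually go down before polling for it.
        sleep(settle_s)
        if not wait_until_up(host, timeout_s, interval_s, run, sleep):
            failures.append(attempt)
            break  # a hung boot needs a manual power cycle, so stop here
    return failures
```

Calling `stress_test("IP", command="reboot")` would run up to 50 reboots and stop at the first one that does not come back, since a failed boot here needs a manual power cycle anyway.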

servnas commented 2 weeks ago

Same issue with an RPi 5 and an Argon Neo 5 NVMe case.

Power supply: 5V 5A
NVMe: PM991a 128GB (cross-tested with P31 1TB, P41 1TB and PM9A1 1TB)
USB 2.0 ports (2): ZBDongle-E, ZBDongle-P
USB 3.0 port (1): KVM-USB

You need to manually power off and turn it back on for a normal boot.

Ittosch commented 2 weeks ago

Hey there, I’m experiencing the same issue with both SD and SSD storage. Sometimes it works, sometimes it doesn’t.

Attempt 1:

Raspberry Pi 5 (8GB), official power supply, 64GB microSD, HA Core at least since 2024.10

Attempt 2:

Raspberry Pi 5 (8GB), official power supply, Argon One V3 NVMe case, SSD: Samsung 990 Pro, HA Core up to date

This issue has been happening with both setups for the past 1-2 months. However, if I unplug the Raspberry Pi 5 or press the power button, the next boot has a very high chance of succeeding.

[Images: IMG_3355, IMG_3315]

user86000 commented 2 weeks ago

> I've done 50x ssh root@IP -p 22222 "reboot" and 50x ssh root@IP -p 22222 "ha host reboot": every reboot was successful and the system booted up normally. Only the system reboot function in the web GUI causes the boot problem, roughly every third time.

The automatic reboot after updates seems to work too; only a manual reboot in the web UI has problems.

Janverhu commented 2 weeks ago

> I've done 50x ssh root@IP -p 22222 "reboot" and 50x ssh root@IP -p 22222 "ha host reboot": every reboot was successful and the system booted up normally. Only the system reboot function in the web GUI causes the boot problem, roughly every third time.

> The automatic reboot after updates seems to work too; only a manual reboot in the web UI has problems.

I first noticed the issue after a nightly reboot from an automation that uses the hassio.host_reboot action. It could be that the web UI uses that same method, of course (I don't know).
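To trigger that same action from a script while trying to reproduce the failure, something like the following might work. It is a minimal sketch, assuming the standard Home Assistant REST endpoint `/api/services/<domain>/<service>` and a long-lived access token; `host_reboot_request`, the base URL, and the token are hypothetical placeholders, and the function only builds the request without sending it.

```python
import json
import urllib.request


def host_reboot_request(base_url, token):
    """Build (but do not send) a POST that calls the hassio.host_reboot action."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/hassio/host_reboot",
        data=json.dumps({}).encode("utf-8"),  # empty JSON body
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be `urllib.request.urlopen(host_reboot_request("http://homeassistant.local:8123", "TOKEN"), timeout=10)`, which reboots the host, so it should only be run against a test system.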

carlostico commented 2 weeks ago

Same problem here. Any update? The NVMe is sometimes off, with: nvme: error 8 Failed to open device: 'nvme'

Dri878 commented 2 weeks ago

> I've done 50x ssh root@IP -p 22222 "reboot" and 50x ssh root@IP -p 22222 "ha host reboot": every reboot was successful and the system booted up normally. Only the system reboot function in the web GUI causes the boot problem, roughly every third time.

I've tried running the "ha host reboot" command in the SSH add-on and experienced the same issue. Have you tried this in the SSH add-on as well?

incrediblehorst commented 2 weeks ago

> I've done 50x ssh root@IP -p 22222 "reboot" and 50x ssh root@IP -p 22222 "ha host reboot": every reboot was successful and the system booted up normally. Only the system reboot function in the web GUI causes the boot problem, roughly every third time.

> I've tried running the "ha host reboot" command in the SSH add-on and experienced the same issue. Have you tried this in the SSH add-on as well?

Nope, I've done this directly on the host. Maybe the problem only occurs if the reboot is triggered from inside the Docker container.

chriopter commented 2 weeks ago

I share the issue as well, every 2-3 reboots initiated from the UI lead to this

jonyrh commented 2 weeks ago

I have this problem too. Case: Argon Neo 5 NVMe. Connected via USB 2.0: SkyConnect and Bluetooth (CSR8510 A10 0a12:0001), plus an RTC battery. Every 2nd or 3rd reboot, the Pi 5 does not see 'nvme'. I tried setting only BOOT_ORDER=0xf6, but the problem did not go away. No SD card or other storage is installed, only NVMe. EEPROM config:

BOOT_UART=1
WAKE_ON_GPIO=0
POWER_OFF_ON_HALT=1
BOOT_ORDER=0xf416
PCIE_PROBE=1
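For readers comparing BOOT_ORDER values, the setting is read one hex digit at a time, starting from the rightmost digit. The sketch below decodes a value into the sequence of boot modes. The digit-to-mode table is written from memory of the Raspberry Pi bootloader documentation and should be checked against it; `decode_boot_order` and `BOOT_MODES` are hypothetical names for illustration.

```python
# Digit meanings as recalled from the Raspberry Pi bootloader documentation
# (an assumption to verify, not taken from this thread).
BOOT_MODES = {
    0x0: "SD card detect",
    0x1: "SD card",
    0x2: "network",
    0x3: "RPIBOOT",
    0x4: "USB mass storage",
    0x5: "BCM-USB-MSD",
    0x6: "NVMe",
    0x7: "HTTP",
    0xE: "stop",
    0xF: "restart",
}


def decode_boot_order(value):
    """Return boot modes in the order the bootloader tries them."""
    number = int(value, 16)  # accepts strings such as "0xf416"
    modes = []
    while number:
        digit = number & 0xF  # the lowest hex digit is tried first
        modes.append(BOOT_MODES.get(digit, f"unknown (0x{digit:x})"))
        number >>= 4
    return modes


print(decode_boot_order("0xf416"))
print(decode_boot_order("0xf6"))
```

Under that table, 0xf416 means NVMe first, then SD card, then USB mass storage, then restart the sequence, while 0xf6 means NVMe only and then restart.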

boot config:

...
dtparam=nvme
dtparam=pciex1_gen=3
...


UPDATE: rebooting from the terminal with ha host reboot does not make the problem go away. Every 2nd or 3rd reboot still does not see 'nvme'.

user86000 commented 1 week ago

On both Pi 5s with the SSD boot problem, I have seen that the status of boot slot B is "bad". Could this be a problem with RAUC? Maybe the Pi wants to reboot into slot B?

[Screenshot]

jonyrh commented 1 week ago

> On both Pi 5s with the SSD boot problem, I have seen that the status of boot slot B is "bad". Could this be a problem with RAUC? Maybe the Pi wants to reboot into slot B?

[Image: IMG_20241113_122100_676.jpg]

incrediblehorst commented 1 week ago

I have the same result as @user86000:

    board: rpi5-64
    boot: A
    boot_slots:
      A:
        state: booted
        status: good
        version: "13.2"
      B:
        state: inactive
        status: bad
        version: null
    data_disk: SK32G-0x92339f62
    update_available: false
    version: "13.2"
    version_latest: "13.2"
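To compare this across several machines, the slot check can be sketched in Python. This assumes the output above has already been parsed into a dictionary (for example from a YAML or JSON dump); `bad_boot_slots` is a hypothetical helper, and the sample data is copied from the output reported in this thread.

```python
def bad_boot_slots(os_info):
    """Return the names of boot slots whose status is anything other than 'good'."""
    slots = os_info.get("boot_slots", {})
    return sorted(
        name for name, slot in slots.items() if slot.get("status") != "good"
    )


# Values as reported in this thread.
reported = {
    "board": "rpi5-64",
    "boot": "A",
    "boot_slots": {
        "A": {"state": "booted", "status": "good", "version": "13.2"},
        "B": {"state": "inactive", "status": "bad", "version": None},
    },
}

print(bad_boot_slots(reported))
```

Whether a "bad" inactive slot is actually related to the failed reboots is not established here; the helper only makes the state easy to compare.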

jonyrh commented 1 week ago

So, I got tired of this. I installed the native operating system with a GUI, deployed Proxmox on it, and disabled all updates. Then I installed the Home Assistant virtual machine (haos-aarch64.qcow2) and restored my backup, and all the problems disappeared. Now I have native access to SSH, VNC, the EEPROM and the config.txt file. A million reboots and everything is fine! Now I'm happy, and everything runs very fast on NVMe!

[Images: ha-hard, ha-soft]