home-assistant / operating-system

Home Assistant Operating System
Apache License 2.0

System not available after update Home Assistant OS 10.1 #2517

Closed bvanbokhoven closed 1 year ago

bvanbokhoven commented 1 year ago

Describe the issue you are experiencing

After updating to Home Assistant OS 10.1, the system is not available. I'm running Home Assistant OS on an HP T620 (8GB, 120 GB SSD). On the CLI no errors are shown. The system should be available at https://homeassistant.local:8123. Scanning the device with a port scanner shows that port 8123 is not active (other ports are). Any ideas what's wrong and how to solve this? I previously had the same problem with OS 9.5. OS 10.0 gave no problems.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

10.1

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

After updating the system to OS 10.1, I rebooted the system. After startup, the system is not available on port 8123 (web interface not available).

Anything in the Supervisor logs that might be useful for us?

-

Anything in the Host logs that might be useful for us?

-

System information

-

Additional information

-

andyp05 commented 1 year ago

Having the same problem on generic intel-64. OS 9.5 works fine; after upgrading, an IP address is no longer assigned. The console comes up with "IPv4 address for enp1s0:" and no address is assigned. The machine has a built-in Realtek Ethernet card. I downgraded and it works again.

bvanbokhoven commented 1 year ago

How can I downgrade the Home Assistant OS from 10.1 to 10.0? I have access to the HA CLI

When I connect to the observer, everything seems okay:

Home Assistant observer

Supervisor: Connected
Supported: Supported
Healthy: Healthy

The following ports are active: 53, 1883, 1884, 4357, 5355, 5580, 8763
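
To confirm from another machine whether the web interface port is listening, a small check along these lines may help. This is only a sketch: `port_open` is a hypothetical helper name (not part of Home Assistant), and it assumes `bash` and the coreutils `timeout` command are installed on the machine running the check.

```shell
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT
# can be opened within 3 seconds, fails otherwise.
port_open() {
    host="$1"
    port="$2"
    # /dev/tcp/HOST/PORT is a bash feature, so the attempt runs inside bash.
    # Error messages from a refused or timed-out connection are discarded.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For the setup described here, it could be called as `port_open homeassistant.local 8123 && echo open || echo closed`.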

kauthmbt commented 1 year ago

@bvanbokhoven If I am not mistaken this should do the trick: ha os update --version 10.0

bvanbokhoven commented 1 year ago

It looks like Core is crashing and restarting. The UI is available only very briefly (about 1 sec), and then the server is not found. I downgraded HA OS to 10.0 and the problem is still there. Could this issue be caused by an add-on or integration?

CZX6 commented 1 year ago

Glad I found this... I'm having a very similar issue with a new install on a dedicated Intel-based mini PC with a Realtek Ethernet card. I tried installing haos_generic-x86-64-10.1.img.xz with Balena Etcher and it appeared to install correctly; however, the GUI would only be available for a few seconds or minutes until Core crashed and restarted, after which the GUI was no longer available. There were tons of Realtek-related errors/messages.

Post in HA Community is here.

I reinstalled using 9.5 and everything is working as expected

agners commented 1 year ago

@bvanbokhoven is your system also using a Realtek Ethernet chip?

@CZX6 these are interesting screenshots. Searching for these error messages led me to: https://lore.kernel.org/netdev/37b1001d-688c-fa35-0d8a-cbbbae5e6fa8@gmail.com/T/

It seems that this is power management related. Can you try disabling ASPM in BIOS, and/or check the sysfs files from the system console?

# ls /sys/class/net/enp2s0/device/link/
# cat /sys/class/net/enp2s0/device/link/*

Then disable them to see if that helps:

echo 0 > /sys/class/net/enp2s0/device/link/<filename from ls>
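
Because `cat` on a glob prints only the values, it can be hard to tell which value belongs to which file. A minimal sketch that prints each file name next to its value, assuming the installed `grep` supports `-H`; `show_link_pm` is a hypothetical helper name, and `enp2s0` is the interface name from the commands above and may differ on other systems:

```shell
# Hypothetical helper: print "path:value" for every file in a link directory.
show_link_pm() {
    link_dir="$1"
    # -H prefixes each matching line with its file name;
    # the pattern "." matches any non-empty line.
    grep -H . "$link_dir"/*
}
```

On the system console this could be run as `show_link_pm /sys/class/net/enp2s0/device/link`.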

agners commented 1 year ago

@andyp05

No address is being assigned. Has a Realtek ethernet card built in. Downgraded and it works again.

What is the name of the card on HAOS 9.5? Is it enp1s0 too?

andyp05 commented 1 year ago

Not sure. I won't have the time to downgrade until this weekend. Since there is no internet connection, I have to flash a new OS image with Etcher after I repartition. I will let you know.

CZX6 commented 1 year ago

@agners Kindly appreciate your response!

After downgrading to 9.5 I have not had any issues, or messages like what I posted when running 10 & 10.1. Maybe a kernel related issue with 10+? Referring to notes for the OS10 release - "...Linux kernel 6.1. This means that the major version of most packages got updated."

I'm happy to do some testing and report back, however if this was a hardware power management related issue, would that not be present in 9.5 as well?

I should have some time this evening to Etcher 10.1 back on to the dedicated PC and tweak some settings to see if that resolves the issues. I also have some concerns about the increased CPU usage with OS10 mentioned in some other posts, as I'm experiencing this with my VirtualBox VM instance.

Will report back soon, and thanks again for your reply.

andyp05 commented 1 year ago

Did the downgrade to 9.5. It now says: IPv4 addresses of enp1s0: 192.168.1.205/24

CZX6 commented 1 year ago

@agners

Upgraded from 9.5 (which was stable) to OS10.1 and immediately started receiving the errors again.

I do not have an ASPM option available in the AMI BIOS on the Geekom Mini PC.

ls returned 6 files: clkpm, l1_1_aspm, l1_1_pcipm, l1_2_aspm, l1_2_pcipm, l1_aspm

cat returned the values of the files, respectively: 1 1 1 0 1 1

I ran the echo 0 > on all of the files, and the errors stopped scrolling on the screen. However, when I did a ha> host restart, the values of those files returned to their original state after the reboot. The Home Assistant GUI was available for about 30 seconds and then became unresponsive, with errors rapidly scrolling on the screen.

I ran the echo 0 > on the 6 files again, and Home Assistant GUI is now available again, and the errors stopped.
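
The manual step described above, writing 0 to every file in the link directory, can be sketched as a loop. This is a sketch under assumptions rather than a fix: `disable_link_pm` is a hypothetical helper name, the link directory has to be passed in, it needs to run as root on the host console, and as noted above the values revert after a reboot.

```shell
# Hypothetical helper: write 0 to every regular file in a link directory.
disable_link_pm() {
    link_dir="$1"
    for f in "$link_dir"/*; do
        # Skip anything that is not a regular file (or an unmatched glob).
        [ -f "$f" ] || continue
        # Report files that cannot be written instead of stopping the loop.
        echo 0 > "$f" || echo "could not write $f" >&2
    done
}
```

With the interface name used earlier in this thread, the call would be `disable_link_pm /sys/class/net/enp2s0/device/link`; the interface name on other systems may differ.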

Why is this an issue in OS10+ and not in 9.5? Again, assuming the kernel updates to 6.1 broke something with the Realtek drivers?

Is there a way to make these changes permanent so if I reboot again I won't have to go to the console to make the changes every time? I'm concerned about the stability.

Looking for a solid fix if you are aware of one :-)

Screenshots in the HA Community post.

Again, thank you so much for the help and I hope this helps others in the same boat.

StSaens commented 1 year ago

For other T620 users: I had no issues (so far) updating Home Assistant OS from 9.5 to 10.1.

bvanbokhoven commented 1 year ago

Found the issue. The M.2 SSD card had a bad block, causing several unexpected errors. Card replaced... issues solved.

PlayFaster commented 1 year ago

Just in case anyone else is experiencing this issue and (like myself) has spent some time searching: there are three (3) very similar issues on this matter at present (Home Assistant OS ver 10.x not able to obtain an IP address, particularly impacting NiPro NUCs with Realtek Ethernet).

See #2545
See #2630. It is open as of now; I've posted my experiences there, hoping it may help to progress this issue.

Thanks