home-assistant / operating-system

:beginner: Home Assistant Operating System
Apache License 2.0
4.72k stars 952 forks source link

Update from 11.0 to 11.1 does nothing #2870

Closed BebeMischa closed 2 weeks ago

BebeMischa commented 10 months ago

Describe the issue you are experiencing

Got a message, new version is available. Click install, everything looks like it is installing. After the automatic reboot HA comes back alive with still OS 11.0 installed and a message about new version.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

11.0

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

  1. Try installing 11.1 by clicking on install
  2. ...

Anything in the Supervisor logs that might be useful for us?

All green, no errors to be seen

Anything in the Host logs that might be useful for us?

Nope

System information

## System Information

version | core-2023.10.5
-- | --
installation_type | Home Assistant OS
dev | false
hassio | true
docker | true
user | root
virtualenv | false
python_version | 3.11.5
os_name | Linux
os_version | 6.1.56
arch | x86_64
timezone | Europe/Amsterdam
config_dir | /config

<details><summary>Home Assistant Community Store</summary>

GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.33.0
Stage | running
Available Repositories | 1331
Downloaded Repositories | 32

</details>

<details><summary>Home Assistant Cloud</summary>

logged_in | true
-- | --
subscription_expiration | 20 november 2023 om 01:00
relayer_connected | true
relayer_region | eu-central-1
remote_enabled | true
remote_connected | true
alexa_enabled | false
google_enabled | true
remote_server | eu-central-1-10.ui.nabu.casa
certificate_status | ready
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

</details>

<details><summary>Home Assistant Supervisor</summary>

host_os | Home Assistant OS 11.0
-- | --
update_channel | stable
supervisor_version | supervisor-2023.10.1
agent_version | 1.6.0
docker_version | 24.0.6
disk_total | 219.4 GB
disk_used | 43.0 GB
healthy | true
supported | true
board | generic-x86-64
supervisor_api | ok
version_api | ok
installed_addons | Z-Wave JS (0.2.1), Mosquitto broker (6.3.1), Terminal & SSH (9.7.1), File editor (5.6.0), VLC (0.2.0), MariaDB (2.6.1), Studio Code Server (5.13.0), Samba Backup (5.2.0), TimescaleDB (3.0.2), DSMR Reader (1.10.2), Music Assistant BETA (2.0.0b73), Samba share (10.0.2), motionEye (0.19.1), Dutch gas prices (2023.10.15.1)

</details>

<details><summary>Dashboards</summary>

dashboards | 2
-- | --
resources | 21
views | 56
mode | storage

</details>

<details><summary>Recorder</summary>

oldest_recorder_run | 26 augustus 2023 om 21:33
-- | --
current_recorder_run | 26 oktober 2023 om 23:04
database_engine | mysql
database_version | 10.6.12

</details>

<details><summary>Sonoff</summary>

version | 3.5.3 (a8c6d45)
-- | --
cloud_online | 27 / 27
local_online | 27 / 27

</details>

<details><summary>Spotify</summary>

api_endpoint_reachable | ok
-- | --

</details>

Additional information

No response

aflecha87 commented 10 months ago

I've got the same exact issue. It eventually went through, was just very slow.

francescopeloi commented 10 months ago

I've waited 20 mins and the app hasn't come up yet... should I be worried or this update takes long? Never took so long, and all other tens of previous updates never had a minimal issue.

francescopeloi commented 10 months ago

After ~1h of waiting I decided to force a restart, and now it came up, and with the new version installed. First time I had this issue, not sure what went wrong as I can't find anything useful in the logs.

agners commented 10 months ago

@BebeMischa After an update attempt, can you share the logs of the previous boot? you can get the logs of the previous boot using the following command in the Web terminal (the -b -1 instructs to return the logs of the previous boot):

ha host logs -b -1 -n 10000 > /config/last-boot.log

@francescopeloi can you share the same log in your current situation (where the previous boot was the one which got stuck)?

francescopeloi commented 10 months ago

Here it is. I ran the updated at ~8 CET, I guess the log timestamp are UTC, so the update starts at ~6am in the logs

Oct 27 05:31:53 homeassistant kernel: audit: type=1334 audit(1698384713.656:637): prog-id=262 op=UNLOAD
Oct 27 05:31:53 homeassistant kernel: audit: type=1334 audit(1698384713.656:638): prog-id=261 op=UNLOAD
Oct 27 05:31:53 homeassistant kernel: audit: type=1334 audit(1698384713.656:639): prog-id=260 op=UNLOAD
Oct 27 06:04:27 homeassistant rauc[455]: input bundle: /mnt/data/supervisor/tmp/hassos-11.1.raucb
Oct 27 06:04:30 homeassistant kernel: loop0: detected capacity change from 0 to 248570
Oct 27 06:04:30 homeassistant kernel: loop0: detected capacity change from 248570 to 248568
Oct 27 06:04:30 homeassistant kernel: loop1: detected capacity change from 0 to 65536
Oct 27 06:05:19 homeassistant systemd[1]: docker-0a62a12670a6263f87b27bf310bd0af82e90c5e8e99bd942760c18fe0390243d.scope: Deactivated successfully.
Oct 27 06:05:19 homeassistant systemd[1]: docker-0a62a12670a6263f87b27bf310bd0af82e90c5e8e99bd942760c18fe0390243d.scope: Consumed 4min 4.909s CPU time.
Oct 27 06:05:19 homeassistant dockerd[556]: time="2023-10-27T06:05:19.649963954Z" level=info msg="ignoring event" container=0a62a12670a6263f87b27bf310bd0af82e90c5e8e99bd942760c18fe0390243d module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 27 06:05:19 homeassistant kernel: docker0: port 1(veth4497a78) entered disabled state
Oct 27 06:05:19 homeassistant kernel: vetha5a98ab: renamed from eth0
Oct 27 06:05:19 homeassistant NetworkManager[450]: <info>  [1698386719.7948] manager: (vetha5a98ab): new Veth device (/org/freedesktop/NetworkManager/Devices/27)
Oct 27 06:05:19 homeassistant kernel: docker0: port 1(veth4497a78) entered disabled state
Oct 27 06:05:19 homeassistant kernel: device veth4497a78 left promiscuous mode
Oct 27 06:05:19 homeassistant kernel: docker0: port 1(veth4497a78) entered disabled state
Oct 27 06:05:19 homeassistant kernel: audit: type=1700 audit(1698386719.821:640): dev=veth4497a78 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
Oct 27 06:05:19 homeassistant kernel: audit: type=1300 audit(1698386719.821:640): arch=c00000b7 syscall=206 success=yes exit=32 a0=c a1=4001591f20 a2=20 a3=0 items=0 ppid=1 pid=556 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=unconfined key=(null)
Oct 27 06:05:19 homeassistant kernel: audit: type=1327 audit(1698386719.821:640): proctitle=2F7573722F62696E2F646F636B657264002D480066643A2F2F002D2D636F6E7461696E6572643D2F72756E2F636F6E7461696E6572642F636F6E7461696E6572642E736F636B
Oct 27 06:05:19 homeassistant kernel: hassio: port 2(vethf84b4fe) entered disabled state
Oct 27 06:05:19 homeassistant kernel: veth1a96227: renamed from eth1
Oct 27 06:05:19 homeassistant NetworkManager[450]: <info>  [1698386719.9559] manager: (veth1a96227): new Veth device (/org/freedesktop/NetworkManager/Devices/28)
Oct 27 06:05:19 homeassistant kernel: hassio: port 2(vethf84b4fe) entered disabled state
Oct 27 06:05:19 homeassistant kernel: device vethf84b4fe left promiscuous mode
Oct 27 06:05:19 homeassistant kernel: hassio: port 2(vethf84b4fe) entered disabled state
Oct 27 06:05:19 homeassistant kernel: audit: type=1700 audit(1698386719.969:641): dev=vethf84b4fe prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
Oct 27 06:05:20 homeassistant kernel: audit: type=1300 audit(1698386719.969:641): arch=c00000b7 syscall=206 success=yes exit=32 a0=c a1=400138c2c0 a2=20 a3=0 items=0 ppid=1 pid=556 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=unconfined key=(null)
Oct 27 06:05:20 homeassistant kernel: audit: type=1327 audit(1698386719.969:641): proctitle=2F7573722F62696E2F646F636B657264002D480066643A2F2F002D2D636F6E7461696E6572643D2F72756E2F636F6E7461696E6572642F636F6E7461696E6572642E736F636B
Oct 27 06:05:20 homeassistant systemd[1]: run-docker-netns-eb06fb53b0f4.mount: Deactivated successfully.
Oct 27 06:05:20 homeassistant systemd[1]: var-lib-docker-overlay2-9be9ce732c56f6a63d39ec3587aa813d246211e5925c827275dad958b1701739-merged.mount: Deactivated successfully.
Oct 27 06:05:20 homeassistant systemd[1]: mnt-data-docker-overlay2-9be9ce732c56f6a63d39ec3587aa813d246211e5925c827275dad958b1701739-merged.mount: Deactivated successfully.
Oct 27 06:05:20 homeassistant hassos-supervisor[22328]: 0
Oct 27 06:05:20 homeassistant systemd[1]: hassos-supervisor.service: Deactivated successfully.
Oct 27 06:05:20 homeassistant systemd[1]: hassos-supervisor.service: Consumed 2.691s CPU time.
Oct 27 06:05:25 homeassistant systemd[1]: hassos-supervisor.service: Scheduled restart job, restart counter is at 2.
Oct 27 06:05:25 homeassistant systemd[1]: Stopped HassOS supervisor.
Oct 27 06:05:25 homeassistant systemd[1]: hassos-supervisor.service: Consumed 2.691s CPU time.
Oct 27 06:05:25 homeassistant systemd[1]: Starting HassOS supervisor...
Oct 27 06:05:25 homeassistant systemd[1]: Started HassOS supervisor.
Oct 27 06:05:25 homeassistant hassos-supervisor[25419]: [INFO] Starting the Supervisor...
Oct 27 06:05:25 homeassistant systemd[1]: docker-738c6cac3b3600721d9169ff95a87ae5caa61cf2fbb8eaafee722b13d4a6651d.scope: Deactivated successfully.
Oct 27 06:05:25 homeassistant systemd[1]: docker-738c6cac3b3600721d9169ff95a87ae5caa61cf2fbb8eaafee722b13d4a6651d.scope: Consumed 2h 6min 10.164s CPU time.
Oct 27 06:05:26 homeassistant kernel: docker0: port 1(veth31da1f1) entered blocking state
Oct 27 06:05:26 homeassistant kernel: docker0: port 1(veth31da1f1) entered disabled state
Oct 27 06:05:26 homeassistant kernel: device veth31da1f1 entered promiscuous mode
Oct 27 06:05:26 homeassistant kernel: audit: type=1700 audit(1698386726.044:642): dev=veth31da1f1 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Oct 27 06:05:26 homeassistant kernel: audit: type=1300 audit(1698386726.044:642): arch=c00000b7 syscall=206 success=yes exit=40 a0=c a1=4000b916b0 a2=28 a3=0 items=0 ppid=1 pid=556 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=unconfined key=(null)
Oct 27 06:05:26 homeassistant kernel: audit: type=1327 audit(1698386726.044:642): proctitle=2F7573722F62696E2F646F636B657264002D480066643A2F2F002D2D636F6E7461696E6572643D2F72756E2F636F6E7461696E6572642F636F6E7461696E6572642E736F636B
Oct 27 06:05:26 homeassistant kernel: audit: type=1334 audit(1698386726.064:643): prog-id=53 op=UNLOAD
Oct 27 06:05:26 homeassistant dockerd[556]: time="2023-10-27T06:05:26.037420863Z" level=info msg="ignoring event" container=738c6cac3b3600721d9169ff95a87ae5caa61cf2fbb8eaafee722b13d4a6651d module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.0529] manager: (vethfc199ad): new Veth device (/org/freedesktop/NetworkManager/Devices/29)
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.0601] manager: (veth31da1f1): new Veth device (/org/freedesktop/NetworkManager/Devices/30)
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered blocking state
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered disabled state
Oct 27 06:05:26 homeassistant kernel: device veth240f3cd entered promiscuous mode
Oct 27 06:05:26 homeassistant kernel: audit: type=1700 audit(1698386726.116:644): dev=veth240f3cd prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Oct 27 06:05:26 homeassistant kernel: audit: type=1300 audit(1698386726.116:644): arch=c00000b7 syscall=206 success=yes exit=40 a0=c a1=40017a7a70 a2=28 a3=0 items=0 ppid=1 pid=556 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=unconfined key=(null)
Oct 27 06:05:26 homeassistant kernel: audit: type=1327 audit(1698386726.116:644): proctitle=2F7573722F62696E2F646F636B657264002D480066643A2F2F002D2D636F6E7461696E6572643D2F72756E2F636F6E7461696E6572642F636F6E7461696E6572642E736F636B
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered blocking state
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered forwarding state
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.1230] manager: (vethc8a24cc): new Veth device (/org/freedesktop/NetworkManager/Devices/31)
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.1282] manager: (veth240f3cd): new Veth device (/org/freedesktop/NetworkManager/Devices/32)
Oct 27 06:05:26 homeassistant systemd[1]: var-lib-docker-overlay2-f43d5283a8fdd304c082a252263d2bb7bb6ad7ca310ccb9dfc31b0b01b86bec8-merged.mount: Deactivated successfully.
Oct 27 06:05:26 homeassistant systemd[1]: mnt-data-docker-overlay2-f43d5283a8fdd304c082a252263d2bb7bb6ad7ca310ccb9dfc31b0b01b86bec8-merged.mount: Deactivated successfully.
Oct 27 06:05:26 homeassistant systemd[1]: run-docker-runtime\x2drunc-moby-0a62a12670a6263f87b27bf310bd0af82e90c5e8e99bd942760c18fe0390243d-runc.Z00Oc0.mount: Deactivated successfully.
Oct 27 06:05:26 homeassistant systemd[1]: Started libcontainer container 0a62a12670a6263f87b27bf310bd0af82e90c5e8e99bd942760c18fe0390243d.
Oct 27 06:05:26 homeassistant kernel: audit: type=1334 audit(1698386726.444:645): prog-id=263 op=LOAD
Oct 27 06:05:26 homeassistant kernel: audit: type=1300 audit(1698386726.444:645): arch=c00000b7 syscall=280 success=yes exit=15 a0=5 a1=4000135700 a2=78 a3=0 items=0 ppid=25493 pid=25503 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runc" exe="/usr/bin/runc" subj=unconfined key=(null)
Oct 27 06:05:26 homeassistant kernel: audit: type=1327 audit(1698386726.444:645): proctitle=72756E63002D2D726F6F74002F7661722F72756E2F646F636B65722F72756E74696D652D72756E632F6D6F6279002D2D6C6F67002F72756E2F636F6E7461696E6572642F696F2E636F6E7461696E6572642E72756E74696D652E76322E7461736B2F6D6F62792F30613632613132363730613632363366383762323762663331
Oct 27 06:05:26 homeassistant kernel: eth0: renamed from vethfc199ad
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered disabled state
Oct 27 06:05:26 homeassistant kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth31da1f1: link becomes ready
Oct 27 06:05:26 homeassistant kernel: docker0: port 1(veth31da1f1) entered blocking state
Oct 27 06:05:26 homeassistant kernel: docker0: port 1(veth31da1f1) entered forwarding state
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.6861] device (veth31da1f1): carrier: link connected
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.6869] device (docker0): carrier: link connected
Oct 27 06:05:26 homeassistant kernel: eth1: renamed from vethc8a24cc
Oct 27 06:05:26 homeassistant NetworkManager[450]: <info>  [1698386726.8190] device (veth240f3cd): carrier: link connected
Oct 27 06:05:26 homeassistant kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth240f3cd: link becomes ready
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered blocking state
Oct 27 06:05:26 homeassistant kernel: hassio: port 2(veth240f3cd) entered forwarding state
Oct 27 06:05:26 homeassistant hassos-supervisor[25446]: hassio_supervisor
Oct 27 06:05:30 homeassistant systemd[1]: Starting Hostname Service...
Oct 27 06:05:30 homeassistant systemd[1]: Starting Time & Date Service...
Oct 27 06:05:30 homeassistant systemd[1]: Started Hostname Service.
Oct 27 06:05:30 homeassistant systemd[1]: Started Time & Date Service.
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.6619] audit: op="connection-update" uuid="5d3d5b83-6f50-3e42-b0c5-3892150a3c25" name="Supervisor eth0" args="connection.timestamp" pid=25621 uid=0 result="success"
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.6674] device (end0): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.6683] manager: NetworkManager state is now DISCONNECTING
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.6720] device (end0): disconnecting for new activation request.
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.6722] audit: op="connection-activate" uuid="5d3d5b83-6f50-3e42-b0c5-3892150a3c25" name="Supervisor eth0" pid=25621 uid=0 result="success"
Oct 27 06:05:30 homeassistant systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 27 06:05:30 homeassistant systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7497] device (end0): state change: deactivating -> disconnected (reason 'new-activation', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7755] dhcp4 (end0): canceled DHCP transaction
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7763] dhcp4 (end0): activation: beginning transaction (timeout in 45 seconds)
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7767] dhcp4 (end0): state changed no lease
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7779] dhcp6 (end0): canceled DHCP transaction
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7785] dhcp6 (end0): activation: beginning transaction (timeout in 45 seconds)
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.7790] dhcp6 (end0): state changed no lease
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client reset search domain list.
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client set default route setting: no
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client set MulticastDNS setting: no
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client reset DNS server list.
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8154] manager: NetworkManager state is now DISCONNECTED
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8168] device (end0): Activation: starting connection 'Supervisor eth0' (5d3d5b83-6f50-3e42-b0c5-3892150a3c25)
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8214] device (end0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8222] manager: NetworkManager state is now CONNECTING
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8228] device (end0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8252] device (end0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8268] dhcp4 (end0): activation: beginning transaction (timeout in 45 seconds)
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8310] dhcp4 (end0): state changed new lease, address=192.168.178.250
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8322] policy: set 'Supervisor eth0' (end0) as default for IPv4 routing and DNS
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client set MulticastDNS setting: yes
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client set search domain list to: fritz.box
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client set default route setting: yes
Oct 27 06:05:30 homeassistant systemd-resolved[448]: end0: Bus client set DNS server list to: 192.168.178.1
Oct 27 06:05:30 homeassistant systemd[1]: Stopping Network Time Synchronization...
Oct 27 06:05:30 homeassistant systemd[1]: systemd-timesyncd.service: Deactivated successfully.
Oct 27 06:05:30 homeassistant systemd[1]: Stopped Network Time Synchronization.
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8666] device (end0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8723] device (end0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8729] device (end0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8749] manager: NetworkManager state is now CONNECTED_SITE
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.8770] device (end0): Activation: successful, device activated.
Oct 27 06:05:30 homeassistant systemd[1]: Starting Network Time Synchronization...
Oct 27 06:05:30 homeassistant NetworkManager[450]: <info>  [1698386730.9336] manager: NetworkManager state is now CONNECTED_GLOBAL
Oct 27 06:05:31 homeassistant os-agent[259]: INFO: 2023/10/27 06:05:31 apparmor.go:45: Load AppArmor profile '/mnt/data/supervisor/apparmor/hassio-supervisor'.
Oct 27 06:05:31 homeassistant os-agent[259]: INFO: 2023/10/27 06:05:31 apparmor.go:53: Load profile '/mnt/data/supervisor/apparmor/hassio-supervisor':
Oct 27 06:05:31 homeassistant systemd[1]: Started Network Time Synchronization.
Oct 27 06:05:31 homeassistant systemd-timesyncd[25683]: Contacted time server 162.159.200.123:123 (time.cloudflare.com).
Oct 27 06:05:31 homeassistant systemd-timesyncd[25683]: Initial clock synchronization to Fri 2023-10-27 06:05:31.265875 UTC.
Oct 27 06:05:31 homeassistant systemd[1]: Stopping Network Time Synchronization...
Oct 27 06:05:31 homeassistant systemd[1]: systemd-timesyncd.service: Deactivated successfully.
Oct 27 06:05:31 homeassistant systemd[1]: Stopped Network Time Synchronization.
Oct 27 06:05:31 homeassistant kernel: kauditd_printk_skb: 46 callbacks suppressed
Oct 27 06:05:31 homeassistant kernel: audit: type=1334 audit(1698386731.324:668): prog-id=272 op=UNLOAD
Oct 27 06:05:31 homeassistant kernel: audit: type=1334 audit(1698386731.328:669): prog-id=273 op=LOAD
Oct 27 06:05:31 homeassistant systemd[1]: Starting Network Time Synchronization...
Oct 27 06:05:31 homeassistant systemd[1]: Started Network Time Synchronization.
Oct 27 06:05:31 homeassistant systemd-timesyncd[25699]: Contacted time server 192.168.178.1:123 (192.168.178.1).
Oct 27 06:05:31 homeassistant systemd-timesyncd[25699]: Initial clock synchronization to Fri 2023-10-27 06:05:31.660150 UTC.
Oct 27 06:05:32 homeassistant NetworkManager[450]: <info>  [1698386732.5612] dhcp6 (end0): activation: beginning transaction (timeout in 45 seconds)
Oct 27 06:05:32 homeassistant NetworkManager[450]: <info>  [1698386732.5659] dhcp6 (end0): state changed new lease
Oct 27 06:05:42 homeassistant systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Oct 27 06:06:02 homeassistant systemd[1]: systemd-timedated.service: Deactivated successfully.
Oct 27 06:06:02 homeassistant systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Oct 27 06:06:02 homeassistant kernel: audit: type=1334 audit(1698386762.728:670): prog-id=271 op=UNLOAD
Oct 27 06:06:02 homeassistant kernel: audit: type=1334 audit(1698386762.728:671): prog-id=270 op=UNLOAD
Oct 27 06:06:02 homeassistant kernel: audit: type=1334 audit(1698386762.728:672): prog-id=269 op=UNLOAD
Oct 27 06:06:02 homeassistant kernel: audit: type=1334 audit(1698386762.804:673): prog-id=268 op=UNLOAD
Oct 27 06:06:02 homeassistant kernel: audit: type=1334 audit(1698386762.804:674): prog-id=267 op=UNLOAD
Oct 27 06:06:02 homeassistant kernel: audit: type=1334 audit(1698386762.804:675): prog-id=266 op=UNLOAD
Oct 27 06:17:42 homeassistant systemd-journald[136]: Data hash table of /var/log/journal/c891ea21a1624b92b2f47959a86805cc/system.journal has a fill level at 75.0 (85335 of 113777 items, 25165824 file size, 294 bytes per hash table item), suggesting rotation.
Oct 27 06:17:42 homeassistant systemd-journald[136]: /var/log/journal/c891ea21a1624b92b2f47959a86805cc/system.journal: Journal header limits reached or header out-of-date, rotating.
agners commented 10 months ago

Hm, it looks like shutdown somehow failed. First the update is run, then the restart is triggered by the Supervisor, but somehow this did not lead to a full reboot:

Oct 27 06:04:27 homeassistant rauc[455]: input bundle: /mnt/data/supervisor/tmp/hassos-11.1.raucb
...
Oct 27 06:05:20 homeassistant systemd[1]: hassos-supervisor.service: Deactivated successfully.
Oct 27 06:05:20 homeassistant systemd[1]: hassos-supervisor.service: Consumed 2.691s CPU time.
Oct 27 06:05:25 homeassistant systemd[1]: hassos-supervisor.service: Scheduled restart job, restart counter is at 2.
Oct 27 06:05:25 homeassistant systemd[1]: Stopped HassOS supervisor.
Oct 27 06:05:25 homeassistant systemd[1]: hassos-supervisor.service: Consumed 2.691s CPU time.
Oct 27 06:05:25 homeassistant systemd[1]: Starting HassOS supervisor...
Oct 27 06:05:25 homeassistant systemd[1]: Started HassOS supervisor.
Oct 27 06:05:25 homeassistant hassos-supervisor[25419]: [INFO] Starting the Supervisor...
...

Since you seem to be on a virtual machine, can you login to the OS shell and get the complete logs using:

ha > login # < to get to the OS shell

then

journalctl -b -1 > /mnt/data/supervisor/homeassistant/last-boot-full.log

With that you should be able to find the full log in the Home Assistant config folder as last-boot-full.log.

francescopeloi commented 10 months ago

what is the exact command to get the OS shell sorry? It seems that is not doing anything for me. fyi to run commands on the machine, I am using the Terminal & SSH addon.

agners commented 10 months ago

@francescopeloi you can access the OS shell using keyboard/mouse connected to your system. What type of hardware are you using?

The Terminal & SSH add-on doesn't give access to the underlying OS shell, so you can't use that.

francescopeloi commented 10 months ago

I am on a raspberry pi 4, unfortunately I don't have a mini hdmi cable with me to do this. There's no other way I guess?

agners commented 10 months ago

There is: You can access the operating system console using SSH on port 22222. But you have to deploy a public key to the device first via a USB flash drive. The process is documented here.

francescopeloi commented 10 months ago

Magic! It worked! Thanks.

Here's the output of

journalctl -b -1 > /mnt/data/supervisor/homeassistant/last-boot-full.log

last-boot-full.log

agners commented 10 months ago

The log doesn't seem to contain a bootup :confused: I am bit confused why the log is only so short. It seems as if the log entries got purged.

I can't really tell what happened here, I am sorry. But since update ultimately worked, maybe this is ok?

Do reboots generally work? I suggest to take a full backup and download it, in case the SD card/disk has issues.

francescopeloi commented 10 months ago

Isn't the bootup log starting from this line on the above attached file?

Oct 27 06:05:25 homeassistant systemd[1]: hassos-supervisor.service: Scheduled restart job, restart counter is at 2.

Reboots never had an issue.

Anyway everything is working for me and I am taking back ups daily, so all good for me. It was more for you guys if you think you could have a bug somewhere.

agners commented 10 months ago

Isn't the bootup log starting from this line on the above attached file?

Normally there is much more before that, the hole kernel boot sequence (homeassistant kernel ...).

Anyway everything is working for me and I am taking back ups daily, so all good for me. It was more for you guys if you think you could have a bug somewhere.

I do have Raspberry Pi 4 running here, and the restart went through fine after the update for me. I also guess that for most people it went through fine. It would be interesting what type of hickup your system had on restart, but the logs just seem incomplete.

Thanks again!

francescopeloi commented 10 months ago

thank you for your instance on this investigation! I can now also ssh into the host machine, which I did not know I could do :)

BebeMischa commented 10 months ago

So, I did leave it for night and working day alone. Just arrived from work now, and tried to run the update for maybe the 10th time now.

Still no luck...

last-boot.log

regularguy01 commented 10 months ago

I had some weirdness as well. I am on Proxmox and HAOS.

I had an OS 11.1 and a Supervisor update notice.

I clicked on the install 11.1. Gave an error in the browser. Nothing in the GUI happened. So I refreshed cleared the browser cache. clicked again Nothing.

So I thought maybe the Supervisor needed to go first? So clicked install on that it went through and reported installed.

Then I tried the OS again seemed to go through. After this the system seemed really slow all over. So I rebooted the entire VM on proxmox. It came up and reported updated to 11.1. Seems to be fine now.

BebeMischa commented 10 months ago

@agners Can you, please, have a look at my log above here? Looks, like my case is a bit hidden between all the more or less not related posts :-)

agners commented 10 months ago

@BebeMischa hm, there seems to be very little information around that time when the update happens :thinking:

Oct 27 16:54:35 homeassistant rauc[365]: input bundle: /mnt/data/supervisor/tmp/hassos-11.1.raucb
Oct 27 16:54:37 homeassistant kernel: loop0: detected capacity change from 0 to 413818
Oct 27 16:54:37 homeassistant kernel: loop0: detected capacity change from 413818 to 413816
Oct 27 16:54:37 homeassistant kernel: loop1: detected capacity change from 0 to 65536

Could you maybe try to get the logs via SSH on port 22222 as suggested in https://github.com/home-assistant/operating-system/issues/2870#issuecomment-1782715374 as well?

BebeMischa commented 10 months ago

OMG, I just after 3 years of Supervised switched over to HAOS about a month a go, as widely advised on the HA forums. Just because it should give me a hassle free experience, no more self handling the OS level maintenance. Well, that starts good... With Supervised, I was able to SSH into it like a childplay. Now I need to walk around with USB drives to carry over SSH keys? What a smooth change i did myself... From a self managed system without troubles to a locked system failing on me...

Sorry for the rant, but this is, how I feel about it now.

I'm packing stuff for holidays, I have to work tomorrow and Tuesday and finally on Wednesday I have to leave for a month. I'm not sure, if I will manage this in the last two evenings...

Damn....

BebeMischa commented 10 months ago

Well, shoot me...

Out of frustration I did hit the update button for the 11th time and guess, what happened? It updated.

I even don't try to understand... :-)

pedrofnm commented 10 months ago

Yeah I have the same problem, I click to update, it download the update then it goes offline, reboots a few times then it comes online again after 10 mins with the same version (10.5) and with the update (to 11.1) notification. I'm using raspberry pi 4 4g with an old hard drive. Maybe it's because of the hard drive? Anyway I'm upgrading to an SSD soon so I hope it gets fixed at that time.

BebeMischa commented 10 months ago

This gets even crazier now... So I was already on 11.1 . Today I rebooted the server (no any changes made) and it came back on with 11.0 and a update message... What the hell is going on here?

last-boot.log

agners commented 10 months ago

@BebeMischa sorry for the experience. It is the goal to make it hassle free, unfortunately bugs are a reality in any software :cry: Ultimately I'd like to make it easier to get SSH access still, for exactly this reason.

It seems that the detection failed that the system successfully booted. In that case, after three (presumably) failed attempts, the system falls back to the previous OS.

This is done by the Supervisor, once it reaches the startup state. You should see a log entry similar to this in the Supervisor log:

Oct 30 19:43:51 ha-fujitsu hassio_supervisor[407]: 23-10-30 20:43:51 INFO (MainThread) [supervisor.os.manager] Rauc: B - marked slot kernel.1 as good
BebeMischa commented 10 months ago

But how does a system, which did perform the update successfully, stating that in the system info, get downgraded back just by the next restart? I don't get it...

agners commented 10 months ago

Essentially, we install two operating system in parallel (A/B slot). On each upgrade we install into the other slot, set a maker to try booting from that 3 times, and reboot. Once the reboot was successful, we mark that new slot as good, and from that point onwards the system will always use that new slot (-> upgrade successful).

Now, normally, when the system doesn't mark it as good, its really not good (as in, the system crashes during bootup, retries, crashes again, retries, at crashes a third time, thus causing the boot loader to switch back to the previous slot).

The question here is: Why does your system doesn't mark the new slot as good :thinking: That is unclear to me, the Supervisor logs should have information about that.

It is safe to reinstall OS 11.1 (e.g. when you are in 11.0, just hit install again). This will essentially trigger the installation, reboot the system, and try to boot into 11.1 three times again.

BebeMischa commented 10 months ago

Ok, now I get it a bit better. Well, I actually have seen the system rebooting more than 3x indeed on some of the attempts.

So it looks, like I'm stuck in some vicious circle, constantly switching between the two slots. But Supervisor does not complain about anything, all messages in the log are green...

I'm now attempting update for the 13th time :-(

I never had such problems with Debian'/Supervised combo, it did run seamless for 3 years...

BebeMischa commented 10 months ago

So attempt 13 again succesfull, HA is running, sytem info says 11.1 We will see, what it does on next reboot...

agners commented 10 months ago

@BebeMischa can you check the GRUB bootloader files?

ls -la /mnt/boot/EFI/BOOT/
grub-editenv /mnt/boot/EFI/BOOT/grubenv list

After an update, did you see the above mentioned line in the Supervisor log?

kdelios commented 10 months ago

Just for the record I tried a few times to update my test setup Rpi4 8G booting from SSD from 11rc1 to 11.1 and nothing happened. System rebooted with the 11rc1 each time. So I did a host reboot and updated OS via CLI terminal and worked.

marceldeklerk commented 9 months ago

I have the same problem, I click to update, it download the update then it goes offline, it tries to boot three times (on the console i see errors about containerd not starting) after the 4th boot attempt it finally comes online again (took 10 min) with the same version (10.5) and with the update (to 11.1) notification. I'm running HA on an Intel NUC.

ConsolePicture

Attached picture that I took from the console. I see these errors in every boot attempt, except the 4th attempt that is succesfull.

petervk commented 9 months ago

I'm also having this error. I've tried to install 11.1 a few times and after a much longer than normal period of my Home Assistant not being available it finally boots up and is still running 11.0. Any idea of what can be causing this?

matit97 commented 9 months ago

Same problem, also from 10.1 to 11.1

paipaiv commented 9 months ago

Same problem here, im in 10.5 try update to 11.1 but always return to 10.5

marcbernal commented 9 months ago

Hey, same issue trying to update from 10.5 to 11 in the past weeks, and same with 11.1.

marceldeklerk commented 9 months ago

I captured the bootlog by executing below:

journalctl -b -1 > /mnt/data/supervisor/homeassistant/last-boot-full.log In attached file, search for: Failed to start containerd container runtime

last-boot-new2.log

Nov 02 19:51:38 homeassistant systemd[1]: containerd.service: Scheduled restart job, restart counter is at 1.
Nov 02 19:51:38 homeassistant systemd[1]: Stopped containerd container runtime.
Nov 02 19:51:38 homeassistant systemd[1]: Starting containerd container runtime...
Nov 02 19:51:38 homeassistant systemd[1]: containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 02 19:51:38 homeassistant systemd[1]: containerd.service: Failed with result 'exit-code'.
Nov 02 19:51:38 homeassistant systemd[1]: Failed to start containerd container runtime.
agners commented 9 months ago

@marceldeklerk your errors are very curious: containerd is on a read-only file system, as well as it's service files. I have no idea why it suddenly starts with INVALIDARGUMENT errors. The only thing I could think of is a corrupted root file system.

I guess you already tried, but even after a second installation attempt it doesn't start up?

Maybe the disk/SSD has really issues. Probably worth taking a full backup and download.

agners commented 9 months ago

@marceldeklerk there are also this errors, which indicate disk issues:

Nov 01 20:28:59 homeassistant kernel: EXT4-fs (sda8): resizing filesystem from 29126769 to 29126769 blocks
Nov 01 20:28:59 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 32768
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 98304
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 163840
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 229376
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 294912
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 819200
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 884736
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 1605632
Nov 01 20:29:00 homeassistant systemd-growfs[333]: Successfully resized "/mnt/data" to 111.1G bytes (3584 bytes lost due to blocksize).

I recommend replacing your disk.

marceldeklerk commented 9 months ago

Thanks @agners. That doesn't look healthy indeed.

Still strange that so many others here seem to have the same issue with the same upgrade to 11.1, and that the symptoms appear exactly like mine. @matit97 , @paipaiv , @marcbernal: Based on Agners observation, I am right to assume that your bootlog does not contain the following? containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT

marcbernal commented 9 months ago

In my case:

  1. Update starts
  2. Pi restarts like two times and everything looks fine except for:
[FAILED] Failed to start Docker application Container Engine.
See 'systemctl status docker.service' for details.
[DEPEND] Dependency failed for HassOS supervisor.
  1. After that, it restarts again and:

Waiting for the Home Assistant CLI to be ready... [WARN] Home Assistant CLI not starting! Jump into emergency console...



Trying to figure out if I can find more logs with errors.

Regards, 
picard001 commented 9 months ago

Same here, updated a few times but nothing happened. Finally it appears to reboot but never finishes shutting down all services. I forced a reset and it failed to start containers and rebooted a few times. It finally switched the boot option to slot b (not familiar with how that works?) and it started back on 11.0.

marceldeklerk commented 9 months ago

@marceldeklerk there are also this errors, which indicate disk issues:

Nov 01 20:28:59 homeassistant kernel: EXT4-fs (sda8): resizing filesystem from 29126769 to 29126769 blocks
Nov 01 20:28:59 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 32768
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 98304
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 163840
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 229376
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 294912
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 819200
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 884736
Nov 01 20:29:00 homeassistant kernel: EXT4-fs (sda8): Invalid checksum for backup superblock 1605632
Nov 01 20:29:00 homeassistant systemd-growfs[333]: Successfully resized "/mnt/data" to 111.1G bytes (3584 bytes lost due to blocksize).

I recommend replacing your disk.

I've installed a new SSD in my NUC. Restored it from backup, and all is fine. Was able to upgrade to 11.1 without issues. Thanks @agners

pedrofnm commented 9 months ago

Just to give an update of my case, after I migrated my HA to an SSD drive and it got fixed. After restore, it loaded the 11.1 OS. So probably this gets fixed by just make a fresh install of HA and restore from a backup

asayler commented 9 months ago

I'm seeing this issue as well, and have been going back through at least teh 10.X versions (it's not new to 11.x). Every so many reboots, the system will come up on an old version, presumably because it's booting the backup versions isneatd (in my case, I'm on 11.1, but the backup version is still 10.5 since 11.0 and 11.1 both got installed to the same partition). There's not really any rhyme or reason to what triggers the system to fail over to the older version. Each time it does, I re-run the upgrade and few times and eventually it kicks back over to the correct boot partition.

I'm running on a HA Yellow with a CM4 booting directly off an NVMe drive. I've attached boot logs from both several successful boots and several failed ones:

And the supervisor log from the most recent (successful) boot:

And the host log:

Let me know if I can provide anything else.

Is there a way to force the system to reboot to a specific slot? that might be nicer than just having to retry the update until it decides to use the A slot rather than fail over to the B slot.

Sjasie commented 9 months ago

Hi there, I've had this exact same issue for several days, Today i tried installing it through the HomeAssistat CLI with the ha "os update" command and it works as a charm. no more errors, just updating as expected!

BebeMischa commented 9 months ago

Back from holidays, i did all postponed updates yesterday and did not had any problem anymore. Seems solved here.

JaCoLJcL commented 8 months ago

Someone check the new 11.2 HAOS version? I'm stuck on 10.5 with this error :/

hugheaves commented 7 months ago

FWIW, I had the same problem w/ upgrading to 11.1. So I did a clean install of 11.1 some time ago, and now can't upgrade to 11.2. I don't see anything useful in the logs.

asayler commented 7 months ago

I'm still seeing this issue, and it seems to have gotten more persensit on the 11.3 -> 11.4 update attempt. On previous updates, I could generally get it to eventually take by trying it enough times, but I've attempted the 11.3 -> 11.4 update ~10 times thus far with no luck. As far as I can tell, the system isn't experiencing any failed boots. I've watched the entire process via the serial terminal, and it plays out as follows:

  1. run ha os update from serial console (note: I'm booted into boot slot B when this occurs)
  2. System runs the 11.4 update and says ti was successful
  3. A minute or two later, the system reboots.
  4. The reboot again boots into slot B, which is still running 11.3. AFAICT, it never even attempts to boot slot A.

Perhaps there's an error in the update script or rauc system that isn't properly setting which slot to attempt on the next boot. Is there a way I can manually override that?

I've attached the output from journalctl -b -1 > /mnt/data/supervisor/homeassistant/last-boot-full.log. For some reason, the log appears to go back more than the most recent boot (which was 1/14), and the initial timestamps are wrong until the RTC loads up. But you can see the most recent update attempt in the output at the line that starts with Jan 15 05:34:36 ha rauc[483].

last-boot-full.log

I'm running this on an HA Yellow using a Raspberry Pi CM4. I installed HaOS directly to the NVMe drive about 10 months ago using the directions found in Step 9 of https://yellow.home-assistant.io/power-supply/#installing-home-assistant-software-on-kit.

Happy to test things out as needed.

asayler commented 7 months ago

Adding to the above, I poked at my /mnt/daat/rauc.db file as well, it it shows both slots marked as okay. So I'm not sure what it keeps booting into Slot B, despite a successful 11.4 install in Slot A. File attached.

rauc-20240114.db.txt