home-assistant / operating-system

:beginner: Home Assistant Operating System
Apache License 2.0

CPU usage bumped from 2% to 10% after Operating System 10.0 upgrade (caused by containerd) #2476

Closed elmr91 closed 11 months ago

elmr91 commented 1 year ago

Describe the issue you are experiencing

I have just upgraded my Proxmox HAOS VM to OS 10 and immediately noticed CPU usage rising from around 2% to 10% after the upgrade.

"docker stats" shows a normal container usage / nearly no load.

CONTAINER ID   NAME                      CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O        PIDS
4448c00d8d7e   addon_core_configurator   0.03%     30.43MiB / 1.925GiB   1.54%     892kB / 843B      23.9MB / 307kB   8
01cb0b8343ce   addon_core_ssh            0.00%     27.51MiB / 1.925GiB   1.40%     943kB / 31.7kB    25.4MB / 393kB   12
7920e2a0c17a   hassio_multicast          0.20%     848KiB / 1.925GiB     0.04%     0B / 0B           61.4kB / 102kB   4
f7aefac69874   hassio_audio              0.00%     23.89MiB / 1.925GiB   1.21%     911kB / 0B        21.3MB / 315kB   13
1d6b8aefd115   hassio_dns                0.00%     29.41MiB / 1.925GiB   1.49%     935kB / 26.7kB    24.8MB / 106kB   11
a4872c2f1be6   hassio_cli                0.00%     13.92MiB / 1.925GiB   0.71%     917kB / 4.07kB    13.2MB / 283kB   9
04f375c2f3cd   hassio_supervisor         0.00%     106.5MiB / 1.925GiB   5.40%     1.31MB / 1.07MB   61.5MB / 958kB   24
13b89612a676   homeassistant             0.80%     354.9MiB / 1.925GiB   18.00%    0B / 0B           203MB / 116MB    33
47dda87b99d2   hassio_observer           0.00%     13.68MiB / 1.925GiB   0.69%     927kB / 9.84kB    12MB / 106kB     8 
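
To make the "nearly no load" observation concrete, here is a minimal sketch that sums the CPU % column of the output above. The function name `container_cpu` is made up for illustration, and the parsing assumes columns are separated by two or more spaces, as in this paste.

```python
import re

# Rows copied from the `docker stats` output above.
DOCKER_STATS = """\
CONTAINER ID   NAME                      CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O        PIDS
4448c00d8d7e   addon_core_configurator   0.03%     30.43MiB / 1.925GiB   1.54%     892kB / 843B      23.9MB / 307kB   8
01cb0b8343ce   addon_core_ssh            0.00%     27.51MiB / 1.925GiB   1.40%     943kB / 31.7kB    25.4MB / 393kB   12
7920e2a0c17a   hassio_multicast          0.20%     848KiB / 1.925GiB     0.04%     0B / 0B           61.4kB / 102kB   4
f7aefac69874   hassio_audio              0.00%     23.89MiB / 1.925GiB   1.21%     911kB / 0B        21.3MB / 315kB   13
1d6b8aefd115   hassio_dns                0.00%     29.41MiB / 1.925GiB   1.49%     935kB / 26.7kB    24.8MB / 106kB   11
a4872c2f1be6   hassio_cli                0.00%     13.92MiB / 1.925GiB   0.71%     917kB / 4.07kB    13.2MB / 283kB   9
04f375c2f3cd   hassio_supervisor         0.00%     106.5MiB / 1.925GiB   5.40%     1.31MB / 1.07MB   61.5MB / 958kB   24
13b89612a676   homeassistant             0.80%     354.9MiB / 1.925GiB   18.00%    0B / 0B           203MB / 116MB    33
47dda87b99d2   hassio_observer           0.00%     13.68MiB / 1.925GiB   0.69%     927kB / 9.84kB    12MB / 106kB     8
"""


def container_cpu(stats_text: str) -> dict[str, float]:
    """Map container name -> CPU % parsed from `docker stats` text output."""
    usage = {}
    for line in stats_text.splitlines():
        # Columns are separated by runs of two or more spaces.
        cols = re.split(r"\s{2,}", line.strip())
        # Skip the header and anything without a numeric CPU % column.
        if len(cols) < 3 or not re.fullmatch(r"[\d.]+%", cols[2]):
            continue
        usage[cols[1]] = float(cols[2].rstrip("%"))
    return usage


usage = container_cpu(DOCKER_STATS)
total = sum(usage.values())
print(f"{len(usage)} containers, {total:.2f}% CPU in total")
```

All nine containers together account for roughly 1% CPU in this snapshot, so the load has to come from somewhere outside the containers themselves.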

"top" shows containerd is using a consistent 6-8% CPU (this is the only process using significant CPU load)

   349 root      20   0 1394.2m  43.1m   7.2   2.2   5:04.68 S  `- /usr/bin/containerd
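
For anyone who wants to check this without `top`, the sketch below samples `/proc/<pid>/stat` twice and reports user and system CPU separately. It is a rough illustration, not a tool shipped with HAOS: the function names are made up, it is Linux only, and the percentages are relative to a single core.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> tuple[int, int]:
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so only split what follows the last ')'.
    rest = stat_line[stat_line.rfind(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])


def cpu_percent(before, after, interval_s, clk_tck):
    """Convert two (utime, stime) samples into (user %, system %)."""
    scale = 100.0 / (clk_tck * interval_s)
    return (after[0] - before[0]) * scale, (after[1] - before[1]) * scale


def sample_process(pid: int, interval_s: float = 5.0) -> tuple[float, float]:
    """Measure user/system CPU % of a running process over interval_s."""
    def read():
        with open(f"/proc/{pid}/stat") as f:
            return parse_cpu_ticks(f.read())

    before = read()
    time.sleep(interval_s)
    after = read()
    return cpu_percent(before, after, interval_s, os.sysconf("SC_CLK_TCK"))
```

With the PID from the `top` line above, `sample_process(349)` would return a `(user, system)` pair. A mostly user-space figure would point at work inside containerd rather than in the kernel.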

I rebooted the VM, but CPU load stays the same:
(screenshot: 2023-04-18 18_12_20-Clipboard)

What operating system image do you use?

ova (for Virtual Machines)

What version of Home Assistant Operating System is installed?

10

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

1. Install 9.5 ova image in Proxmox
2. Upgrade to Operating System 10
3. ...

Anything in the Supervisor logs that might be useful for us?

23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.hostname] Load dbus interface org.freedesktop.hostname1
23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.logind] Load dbus interface org.freedesktop.login1
23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.network] Load dbus interface org.freedesktop.NetworkManager
23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.rauc] Load dbus interface de.pengutronix.rauc
23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.resolved] Load dbus interface org.freedesktop.resolve1
23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.systemd] Load dbus interface org.freedesktop.systemd1
23-04-18 18:22:24 INFO (MainThread) [supervisor.dbus.timedate] Load dbus interface org.freedesktop.timedate1
23-04-18 18:22:24 INFO (MainThread) [supervisor.host.services] Updating service information
23-04-18 18:22:24 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
23-04-18 18:22:24 INFO (MainThread) [supervisor.host.network] Updating local network information
23-04-18 18:22:24 INFO (MainThread) [supervisor.host.apparmor] Loading AppArmor Profiles: {'hassio-supervisor'}
23-04-18 18:22:24 INFO (MainThread) [supervisor.docker.monitor] Started docker events monitor
23-04-18 18:22:24 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-cli with version 2022.11.0
23-04-18 18:22:24 INFO (MainThread) [supervisor.plugins.cli] Starting CLI plugin
23-04-18 18:22:24 INFO (SyncWorker_1) [supervisor.docker.interface] Cleaning hassio_cli application
23-04-18 18:22:25 INFO (SyncWorker_1) [supervisor.docker.cli] Starting CLI ghcr.io/home-assistant/amd64-hassio-cli with version 2022.11.0 - 172.30.32.5
23-04-18 18:22:25 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-dns with version 2022.04.1
23-04-18 18:22:25 INFO (MainThread) [supervisor.plugins.dns] Starting CoreDNS plugin
23-04-18 18:22:25 INFO (SyncWorker_0) [supervisor.docker.interface] Cleaning hassio_dns application
23-04-18 18:22:25 INFO (SyncWorker_0) [supervisor.docker.dns] Starting DNS ghcr.io/home-assistant/amd64-hassio-dns with version 2022.04.1 - 172.30.32.3
23-04-18 18:22:25 INFO (MainThread) [supervisor.plugins.dns] Updated /etc/resolv.conf
23-04-18 18:22:25 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-audio with version 2022.07.0
23-04-18 18:22:25 INFO (MainThread) [supervisor.plugins.audio] Starting Audio plugin
23-04-18 18:22:25 INFO (SyncWorker_1) [supervisor.docker.interface] Cleaning hassio_audio application
23-04-18 18:22:26 INFO (SyncWorker_1) [supervisor.docker.audio] Starting Audio ghcr.io/home-assistant/amd64-hassio-audio with version 2022.07.0 - 172.30.32.4
23-04-18 18:22:26 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-observer with version 2021.10.0
23-04-18 18:22:26 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-multicast with version 2022.02.0
23-04-18 18:22:26 INFO (MainThread) [supervisor.plugins.multicast] Starting Multicast plugin
23-04-18 18:22:26 INFO (SyncWorker_1) [supervisor.docker.interface] Cleaning hassio_multicast application
23-04-18 18:22:26 INFO (SyncWorker_1) [supervisor.docker.multicast] Starting Multicast ghcr.io/home-assistant/amd64-hassio-multicast with version 2022.02.0 - Host
23-04-18 18:22:26 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
23-04-18 18:22:26 INFO (MainThread) [supervisor.homeassistant.secrets] Loaded 1 Home Assistant secrets
23-04-18 18:22:26 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/qemux86-64-homeassistant with version 2023.4.5
23-04-18 18:22:26 INFO (MainThread) [supervisor.os.manager] Detect Home Assistant Operating System 10.0 / BootSlot B
23-04-18 18:22:27 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/5c53de3b repository
23-04-18 18:22:27 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/core repository
23-04-18 18:22:27 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/a0d7b954 repository
23-04-18 18:22:27 INFO (MainThread) [supervisor.store] Loading add-ons from store: 69 all - 69 new - 0 remove
23-04-18 18:22:27 INFO (MainThread) [supervisor.addons] Found 2 installed add-ons
23-04-18 18:22:27 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-ssh with version 9.6.1
23-04-18 18:22:27 INFO (SyncWorker_2) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-configurator with version 5.5.0
23-04-18 18:22:27 INFO (MainThread) [supervisor.backups.manager] Found 5 backup files
23-04-18 18:22:27 INFO (MainThread) [supervisor.discovery] Loaded 0 messages
23-04-18 18:22:27 INFO (MainThread) [supervisor.ingress] Loaded 0 ingress sessions
23-04-18 18:22:27 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.SETUP
23-04-18 18:22:27 INFO (MainThread) [supervisor.resolution.check] System checks complete
23-04-18 18:22:27 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.SETUP
23-04-18 18:22:27 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
23-04-18 18:22:27 INFO (MainThread) [supervisor.jobs] 'ResolutionFixup.run_autofix' blocked from execution, system is not running - CoreState.SETUP
23-04-18 18:22:27 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.SETUP
23-04-18 18:22:27 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
23-04-18 18:22:27 INFO (MainThread) [__main__] Running Supervisor
23-04-18 18:22:27 INFO (MainThread) [supervisor.os.manager] Rauc: B - marked slot kernel.1 as good
23-04-18 18:22:27 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.INITIALIZE' starting 0 add-ons
23-04-18 18:22:27 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.SYSTEM' starting 0 add-ons
23-04-18 18:22:27 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.SERVICES' starting 1 add-ons
23-04-18 18:22:27 INFO (SyncWorker_2) [supervisor.docker.interface] Cleaning addon_core_ssh application
23-04-18 18:22:28 INFO (SyncWorker_2) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-ssh with version 9.6.1
23-04-18 18:22:33 INFO (MainThread) [supervisor.core] Start Home Assistant Core
23-04-18 18:22:33 INFO (SyncWorker_3) [supervisor.docker.interface] Starting homeassistant
23-04-18 18:22:33 INFO (MainThread) [supervisor.homeassistant.core] Wait until Home Assistant is ready
23-04-18 18:22:36 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.STARTUP
23-04-18 18:22:36 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
23-04-18 18:22:37 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/home-assistant/addons repository
23-04-18 18:22:37 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/hassio-addons/repository repository
23-04-18 18:22:37 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/esphome/home-assistant-addon repository
23-04-18 18:22:38 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
23-04-18 18:22:38 INFO (MainThread) [supervisor.store] Loading add-ons from store: 69 all - 0 new - 0 remove
23-04-18 18:22:38 INFO (MainThread) [supervisor.store] Loading add-ons from store: 69 all - 0 new - 0 remove
23-04-18 18:23:09 INFO (MainThread) [supervisor.homeassistant.core] Detect a running Home Assistant instance
23-04-18 18:23:09 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.APPLICATION' starting 1 add-ons
23-04-18 18:23:09 INFO (SyncWorker_0) [supervisor.docker.interface] Cleaning addon_core_configurator application
23-04-18 18:23:09 INFO (SyncWorker_0) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-configurator with version 5.5.0
23-04-18 18:23:14 INFO (MainThread) [supervisor.misc.tasks] All core tasks are scheduled
23-04-18 18:23:14 INFO (MainThread) [supervisor.core] Supervisor is up and running
23-04-18 18:23:14 INFO (MainThread) [supervisor.host.info] Updating local host information
23-04-18 18:23:14 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DNS_SERVER_IPV6_ERROR/ContextType.DNS_SERVER
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DNS_SERVER_FAILED/ContextType.DNS_SERVER
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.MULTIPLE_DATA_DISKS/ContextType.SYSTEM
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.TRUST/ContextType.SUPERVISOR
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.NO_CURRENT_BACKUP/ContextType.SYSTEM
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.module] Create new suggestion SuggestionType.CREATE_FULL_BACKUP - ContextType.SYSTEM / None
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.module] Create new issue IssueType.NO_CURRENT_BACKUP - ContextType.SYSTEM / None
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.IPV4_CONNECTION_PROBLEM/ContextType.SYSTEM
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.check] System checks complete
23-04-18 18:23:14 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
23-04-18 18:23:15 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
23-04-18 18:23:15 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
23-04-18 18:23:15 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
23-04-18 18:23:15 INFO (MainThread) [supervisor.host.services] Updating service information
23-04-18 18:23:15 INFO (MainThread) [supervisor.host.network] Updating local network information
23-04-18 18:23:15 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
23-04-18 18:23:15 INFO (MainThread) [supervisor.host.manager] Host information reload completed

Anything in the Host logs that might be useful for us?

Apr 18 16:22:27 homeassistant bluetoothd[2232]: Bluetooth daemon 5.66
Apr 18 16:22:27 homeassistant systemd[1]: Started Bluetooth service.
Apr 18 16:22:27 homeassistant bluetoothd[2232]: Starting SDP server
Apr 18 16:22:27 homeassistant kernel: Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Apr 18 16:22:27 homeassistant kernel: Bluetooth: BNEP filters: protocol multicast
Apr 18 16:22:27 homeassistant kernel: Bluetooth: BNEP socket layer initialized
Apr 18 16:22:27 homeassistant bluetoothd[2232]: Bluetooth management interface 1.22 initialized
Apr 18 16:22:27 homeassistant os-agent[104]: INFO: 2023/04/18 16:22:27 main.go:94: Diagnostics is now true
Apr 18 16:22:27 homeassistant systemd[1]: var-lib-docker-overlay2-2eaf0ad15fa660ee1b2604082b0b49856c12481a72f486c60fa2a9d809f4a7ed\x2dinit-merged.mount: Deactivated successfully.
Apr 18 16:22:27 homeassistant systemd[1]: mnt-data-docker-overlay2-2eaf0ad15fa660ee1b2604082b0b49856c12481a72f486c60fa2a9d809f4a7ed\x2dinit-merged.mount: Deactivated successfully.
Apr 18 16:22:27 homeassistant systemd[1]: var-lib-docker-overlay2-2eaf0ad15fa660ee1b2604082b0b49856c12481a72f486c60fa2a9d809f4a7ed-merged.mount: Deactivated successfully.
Apr 18 16:22:27 homeassistant systemd[1]: mnt-data-docker-overlay2-2eaf0ad15fa660ee1b2604082b0b49856c12481a72f486c60fa2a9d809f4a7ed-merged.mount: Deactivated successfully.
Apr 18 16:22:27 homeassistant kernel: hassio: port 6(vethf8c4077) entered blocking state
Apr 18 16:22:27 homeassistant kernel: hassio: port 6(vethf8c4077) entered disabled state
Apr 18 16:22:27 homeassistant kernel: device vethf8c4077 entered promiscuous mode
Apr 18 16:22:27 homeassistant NetworkManager[319]: <info>  [1681834947.8072] manager: (vethba72fe7): new Veth device (/org/freedesktop/NetworkManager/Devices/17)
Apr 18 16:22:27 homeassistant NetworkManager[319]: <info>  [1681834947.8085] manager: (vethf8c4077): new Veth device (/org/freedesktop/NetworkManager/Devices/18)
Apr 18 16:22:27 homeassistant systemd[1]: Started libcontainer container 585c29bcc7be293d23741540dbb4cc2af1bd3a22b3685bc4f02f04248e9b9f09.
Apr 18 16:22:28 homeassistant kernel: eth0: renamed from vethba72fe7
Apr 18 16:22:28 homeassistant kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethf8c4077: link becomes ready
Apr 18 16:22:28 homeassistant kernel: hassio: port 6(vethf8c4077) entered blocking state
Apr 18 16:22:28 homeassistant kernel: hassio: port 6(vethf8c4077) entered forwarding state
Apr 18 16:22:28 homeassistant NetworkManager[319]: <info>  [1681834948.0849] device (vethf8c4077): carrier: link connected
Apr 18 16:22:33 homeassistant systemd[1]: run-docker-runtime\x2drunc-moby-13b89612a676908649a9c53efe151781e6267309bfde99934415abd8ec969c83-runc.mHMj5m.mount: Deactivated successfully.
Apr 18 16:22:33 homeassistant systemd[1]: Started libcontainer container 13b89612a676908649a9c53efe151781e6267309bfde99934415abd8ec969c83.
Apr 18 16:22:33 homeassistant kernel: kauditd_printk_skb: 128 callbacks suppressed
Apr 18 16:22:33 homeassistant kernel: audit: type=1334 audit(1681834953.354:193): prog-id=49 op=LOAD
Apr 18 16:22:33 homeassistant kernel: audit: type=1300 audit(1681834953.354:193): arch=c000003e syscall=321 success=yes exit=15 a0=5 a1=c00018d7f8 a2=78 a3=0 items=0 ppid=2520 pid=2531 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runc" exe="/usr/bin/runc" subj=unconfined key=(null)
Apr 18 16:22:33 homeassistant kernel: audit: type=1327 audit(1681834953.354:193): proctitle=72756E63002D2D726F6F74002F7661722F72756E2F646F636B65722F72756E74696D652D72756E632F6D6F6279002D2D6C6F67002F72756E2F636F6E7461696E6572642F696F2E636F6E7461696E6572642E72756E74696D652E76322E7461736B2F6D6F62792F31336238393631326136373639303836343961396335336566
Apr 18 16:22:33 homeassistant kernel: audit: type=1334 audit(1681834953.355:194): prog-id=50 op=LOAD
Apr 18 16:22:33 homeassistant kernel: audit: type=1300 audit(1681834953.355:194): arch=c000003e syscall=321 success=yes exit=17 a0=5 a1=c00018d590 a2=78 a3=0 items=0 ppid=2520 pid=2531 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runc" exe="/usr/bin/runc" subj=unconfined key=(null)
Apr 18 16:22:33 homeassistant kernel: audit: type=1327 audit(1681834953.355:194): proctitle=72756E63002D2D726F6F74002F7661722F72756E2F646F636B65722F72756E74696D652D72756E632F6D6F6279002D2D6C6F67002F72756E2F636F6E7461696E6572642F696F2E636F6E7461696E6572642E72756E74696D652E76322E7461736B2F6D6F62792F31336238393631326136373639303836343961396335336566
Apr 18 16:22:33 homeassistant kernel: audit: type=1334 audit(1681834953.355:195): prog-id=50 op=UNLOAD
Apr 18 16:22:33 homeassistant kernel: audit: type=1334 audit(1681834953.356:196): prog-id=49 op=UNLOAD
Apr 18 16:22:33 homeassistant kernel: audit: type=1334 audit(1681834953.356:197): prog-id=51 op=LOAD
Apr 18 16:22:33 homeassistant kernel: audit: type=1300 audit(1681834953.356:197): arch=c000003e syscall=321 success=yes exit=15 a0=5 a1=c00018da50 a2=78 a3=0 items=0 ppid=2520 pid=2531 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runc" exe="/usr/bin/runc" subj=unconfined key=(null)
Apr 18 16:22:37 homeassistant systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Apr 18 16:22:54 homeassistant systemd[1]: systemd-timedated.service: Deactivated successfully.
Apr 18 16:22:54 homeassistant kernel: kauditd_printk_skb: 1 callbacks suppressed
Apr 18 16:22:54 homeassistant kernel: audit: type=1334 audit(1681834974.177:198): prog-id=26 op=UNLOAD
Apr 18 16:22:54 homeassistant kernel: audit: type=1334 audit(1681834974.177:199): prog-id=25 op=UNLOAD
Apr 18 16:22:54 homeassistant kernel: audit: type=1334 audit(1681834974.177:200): prog-id=24 op=UNLOAD
Apr 18 16:22:57 homeassistant systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Apr 18 16:22:57 homeassistant kernel: audit: type=1334 audit(1681834977.264:201): prog-id=14 op=UNLOAD
Apr 18 16:22:57 homeassistant kernel: audit: type=1334 audit(1681834977.264:202): prog-id=13 op=UNLOAD
Apr 18 16:22:57 homeassistant kernel: audit: type=1334 audit(1681834977.264:203): prog-id=12 op=UNLOAD
Apr 18 16:23:09 homeassistant systemd[1]: var-lib-docker-overlay2-7d815f3ad31630b9e27b12657d4e783170bec95b305873ed51056f3844933e1f\x2dinit-merged.mount: Deactivated successfully.
Apr 18 16:23:09 homeassistant systemd[1]: mnt-data-docker-overlay2-7d815f3ad31630b9e27b12657d4e783170bec95b305873ed51056f3844933e1f\x2dinit-merged.mount: Deactivated successfully.
Apr 18 16:23:09 homeassistant systemd[1]: var-lib-docker-overlay2-7d815f3ad31630b9e27b12657d4e783170bec95b305873ed51056f3844933e1f-merged.mount: Deactivated successfully.
Apr 18 16:23:09 homeassistant systemd[1]: mnt-data-docker-overlay2-7d815f3ad31630b9e27b12657d4e783170bec95b305873ed51056f3844933e1f-merged.mount: Deactivated successfully.
Apr 18 16:23:09 homeassistant NetworkManager[319]: <info>  [1681834989.5172] manager: (vethaf7f565): new Veth device (/org/freedesktop/NetworkManager/Devices/19)
Apr 18 16:23:09 homeassistant kernel: hassio: port 7(vethe13a13f) entered blocking state
Apr 18 16:23:09 homeassistant kernel: hassio: port 7(vethe13a13f) entered disabled state
Apr 18 16:23:09 homeassistant kernel: device vethe13a13f entered promiscuous mode
Apr 18 16:23:09 homeassistant kernel: audit: type=1700 audit(1681834989.517:204): dev=vethe13a13f prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Apr 18 16:23:09 homeassistant kernel: audit: type=1300 audit(1681834989.517:204): arch=c000003e syscall=44 success=yes exit=40 a0=c a1=c000a38690 a2=28 a3=0 items=0 ppid=1 pid=397 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=unconfined key=(null)
Apr 18 16:23:09 homeassistant kernel: audit: type=1327 audit(1681834989.517:204): proctitle=2F7573722F62696E2F646F636B657264002D480066643A2F2F002D2D636F6E7461696E6572643D2F72756E2F636F6E7461696E6572642F636F6E7461696E6572642E736F636B
Apr 18 16:23:09 homeassistant NetworkManager[319]: <info>  [1681834989.5216] manager: (vethe13a13f): new Veth device (/org/freedesktop/NetworkManager/Devices/20)
Apr 18 16:23:09 homeassistant systemd[1]: Started libcontainer container 481308628ffcd074969046225f8a9435aea56255ad7b6d1eb690b2e3f270004e.
Apr 18 16:23:09 homeassistant kernel: audit: type=1334 audit(1681834989.625:205): prog-id=52 op=LOAD
Apr 18 16:23:09 homeassistant kernel: audit: type=1334 audit(1681834989.625:206): prog-id=53 op=LOAD
Apr 18 16:23:09 homeassistant kernel: audit: type=1300 audit(1681834989.625:206): arch=c000003e syscall=321 success=yes exit=16 a0=5 a1=c00018d7f8 a2=78 a3=0 items=0 ppid=2881 pid=2890 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runc" exe="/usr/bin/runc" subj=unconfined key=(null)
Apr 18 16:23:09 homeassistant kernel: audit: type=1327 audit(1681834989.625:206): proctitle=72756E63002D2D726F6F74002F7661722F72756E2F646F636B65722F72756E74696D652D72756E632F6D6F6279002D2D6C6F67002F72756E2F636F6E7461696E6572642F696F2E636F6E7461696E6572642E72756E74696D652E76322E7461736B2F6D6F62792F34383133303836323866666364303734393639303436323235
Apr 18 16:23:09 homeassistant kernel: audit: type=1334 audit(1681834989.625:207): prog-id=54 op=LOAD
Apr 18 16:23:09 homeassistant kernel: audit: type=1300 audit(1681834989.625:207): arch=c000003e syscall=321 success=yes exit=18 a0=5 a1=c00018d590 a2=78 a3=0 items=0 ppid=2881 pid=2890 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runc" exe="/usr/bin/runc" subj=unconfined key=(null)
Apr 18 16:23:09 homeassistant kernel: audit: type=1327 audit(1681834989.625:207): proctitle=72756E63002D2D726F6F74002F7661722F72756E2F646F636B65722F72756E74696D652D72756E632F6D6F6279002D2D6C6F67002F72756E2F636F6E7461696E6572642F696F2E636F6E7461696E6572642E72756E74696D652E76322E7461736B2F6D6F62792F34383133303836323866666364303734393639303436323235
Apr 18 16:23:09 homeassistant kernel: eth0: renamed from vethaf7f565
Apr 18 16:23:09 homeassistant kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethe13a13f: link becomes ready
Apr 18 16:23:09 homeassistant kernel: hassio: port 7(vethe13a13f) entered blocking state
Apr 18 16:23:09 homeassistant kernel: hassio: port 7(vethe13a13f) entered forwarding state
Apr 18 16:23:09 homeassistant NetworkManager[319]: <info>  [1681834989.7906] device (vethe13a13f): carrier: link connected
Apr 18 16:23:14 homeassistant kernel: kauditd_printk_skb: 29 callbacks suppressed
Apr 18 16:23:14 homeassistant kernel: audit: type=1334 audit(1681834994.860:219): prog-id=56 op=LOAD
Apr 18 16:23:14 homeassistant kernel: audit: type=1334 audit(1681834994.861:220): prog-id=57 op=LOAD
Apr 18 16:23:14 homeassistant kernel: audit: type=1334 audit(1681834994.861:221): prog-id=58 op=LOAD
Apr 18 16:23:14 homeassistant systemd[1]: Starting Hostname Service...
Apr 18 16:23:14 homeassistant systemd[1]: Started Hostname Service.
Apr 18 16:23:15 homeassistant kernel: audit: type=1334 audit(1681834995.000:222): prog-id=59 op=LOAD
Apr 18 16:23:15 homeassistant kernel: audit: type=1334 audit(1681834995.001:223): prog-id=60 op=LOAD
Apr 18 16:23:15 homeassistant kernel: audit: type=1334 audit(1681834995.001:224): prog-id=61 op=LOAD
Apr 18 16:23:15 homeassistant systemd[1]: Starting Time & Date Service...
Apr 18 16:23:15 homeassistant systemd[1]: Started Time & Date Service.
Apr 18 16:23:45 homeassistant systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Apr 18 16:23:45 homeassistant kernel: audit: type=1334 audit(1681835025.038:225): prog-id=58 op=UNLOAD
Apr 18 16:23:45 homeassistant kernel: audit: type=1334 audit(1681835025.038:226): prog-id=57 op=UNLOAD
Apr 18 16:23:45 homeassistant kernel: audit: type=1334 audit(1681835025.038:227): prog-id=56 op=UNLOAD
Apr 18 16:23:45 homeassistant systemd[1]: systemd-timedated.service: Deactivated successfully.
Apr 18 16:23:45 homeassistant kernel: audit: type=1334 audit(1681835025.161:228): prog-id=61 op=UNLOAD
Apr 18 16:23:45 homeassistant kernel: audit: type=1334 audit(1681835025.161:229): prog-id=60 op=UNLOAD
Apr 18 16:23:45 homeassistant kernel: audit: type=1334 audit(1681835025.161:230): prog-id=59 op=UNLOAD
Apr 18 16:32:42 homeassistant dropbear[3091]: [3091] Apr 18 16:32:42 Child connection from 192.168.0.12:50710
Apr 18 16:32:42 homeassistant dropbear[3091]: [3091] Apr 18 16:32:42 Pubkey auth succeeded for 'root' with ssh-rsa key SHA256:5KG9InxtuSAC5RtRQ/pdqJU5u4duuGFcXq+KRToAnQc from 192.168.0.12:50710
Apr 18 16:37:22 homeassistant systemd[1]: Starting Cleanup of Temporary Directories...
Apr 18 16:37:22 homeassistant systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Apr 18 16:37:22 homeassistant systemd[1]: Finished Cleanup of Temporary Directories.
Apr 18 16:37:22 homeassistant systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Apr 18 16:37:38 homeassistant kernel: audit: type=1334 audit(1681835858.696:231): prog-id=62 op=LOAD
Apr 18 16:37:38 homeassistant systemd-timesyncd[1197]: Network configuration changed, trying to establish connection.
Apr 18 16:37:38 homeassistant systemd-timesyncd[1197]: Contacted time server 192.168.0.254:123 (192.168.0.254).
Apr 18 16:37:38 homeassistant systemd[1]: Started Journal Gateway Service.
Apr 18 16:37:38 homeassistant systemd-journal-gatewayd[3113]: microhttpd: MHD_OPTION_EXTERNAL_LOGGER is not the first option specified for the daemon. Some messages may be printed by the standard MHD logger.

System information

## System Information

version core-2023.4.5
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.10.10
os_name Linux
os_version 6.1.24
arch x86_64
timezone Europe/Paris
config_dir /config

Home Assistant Community Store
GitHub API | ok
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 4975
Installed Version | 1.32.1
Stage | running
Available Repositories | 1274
Downloaded Repositories | 3

Home Assistant Cloud
logged_in | false
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor
host_os | Home Assistant OS 10.0
update_channel | stable
supervisor_version | supervisor-2023.04.0
agent_version | 1.5.1
docker_version | 23.0.3
disk_total | 30.8 GB
disk_used | 3.9 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Terminal & SSH (9.6.1), File editor (5.5.0)

Dashboards
dashboards | 2
resources | 1
views | 5
mode | storage

Recorder
oldest_recorder_run | 12 April 2023 at 19:39
current_recorder_run | 18 April 2023 at 18:22
estimated_db_size | 179.41 MiB
database_engine | sqlite
database_version | 3.38.5

Additional information

No response

gfn256 commented 1 year ago

Not only does the problem persist in 10.1, but I also see CPU activity roughly 2% higher than even 10.0, along with a similar increase in average temperature. This needs addressing.

TheHolyRoger commented 1 year ago

Certainly sir, would you like a caesar, a balsamic or a lemon and herb dressing?

agners commented 1 year ago

FWIW, it is unlikely that 10.1 addresses this as there is no Docker update. It seems that this is related to containerd.

What would be interesting is if a Supervised installation with Debian 11 using Docker 23.0 (compared to Docker 20.10) sees the same increase.

The latest dev build upgrades to Docker 23.0.5. If someone could test it, that would be interesting:

https://os-builds.home-assistant.io/11.0.dev20230427/

Or to update your system directly (please create and download a backup, since this updates to development builds):

ha su options --channel=dev
ha su reload
ha os update
ha su options --channel=stable
ArtJames commented 1 year ago

> Or to update your system directly (please create and download a backup, since this updates to development builds):

Proxmox VE latest, J4105, VM: 2 cores/4GB
9.5: ~2.1%
11 dev: ~6.8%
(screenshot: Screenshot 2023-04-28 at 23-46-49 pve0 - Proxmox Virtual Environment)

Dragonfir3 commented 1 year ago

High CPU consumption with versions 10 and 10.1 (containerd: 6-7% all the time). Downgraded to 9.5. (screenshot: Captura de pantalla 2023-04-30 a las 14 37 08)

Christophe999s commented 1 year ago

Add me to the list of affected users. Downgraded to 9.5 and CPU usage went back to normal. Also running on PVE 7.4-3.

lyricnz commented 1 year ago

This issue doesn't seem to be getting much traction. How can we help? (as competent sysadmin/python devs, but not especially familiar with HAOS)

agners commented 1 year ago

In a quick attempt last week, I monitored containerd using strace to see whether a significant number of syscalls was going on. If that were the case, it could also be a Linux kernel issue. However, I didn't see any syscalls over a longer period of time, even though there was CPU usage. So I am assuming that the problem is within Docker (or rather containerd) itself.

So the problem most likely is related to the Docker 23.0 upgrade. To isolate that, it might make sense to build an image using HAOS 9.5, but just upgrade Docker to 23.0, to verify that the problem indeed is related to Docker 23.0. If that is proven, opening an issue in the Docker GitHub project (moby) is probably the next step.

However, in my experience, just creating an issue is unlikely to trigger a quick fix. To get it resolved, we'd likely have to track down the actual issue ourselves. Since it is the containerd process which is causing the higher CPU load, we'd have to dig into why it uses more CPU. One option is some kind of profiling. I am not familiar with Go profiling, but I am sure there are ways to profile a Go process to figure out which operations use more CPU than they did before.

It could also turn out along the way that feature x of containerd simply requires more CPU and that this is expected, in which case the whole endeavor would have been useless.

I am expecting this to take a significant amount of work to tackle. Since this "only" affects CPU usage, it isn't high on my priority list right now. I also have some hope that some stable package update (e.g. a new Docker patch release) suddenly resolves the issue :crossed_fingers:

elmr91 commented 1 year ago

An old containerd issue provides some hints about debugging/profiling containerd/docker: https://github.com/fnproject/fn/issues/700

WouterJN commented 1 year ago

I'm running on a Proxmox server and after upgrading from 9.x to 10.1 i have the same problem;

image

CPU went from 6% to 15%. Supervisor in both cases is 2023.04.1. I don't use the motionEye add-on.

tyjtyj commented 1 year ago

I'm running 4 HA instances on 3 different kinds of hardware and see the same issue on all of them: an RPi 4B rev1 as standby, an RPi 3B for testing, an RPi 3B in a different home with a different config, and an Intel NUC on Proxmox. All 4 show the same CPU increase. The issue disappears when going back to 9.5. As others have said, containerd seems to be the culprit. No motionEye. All 4 systems have different add-ons; the only common add-on is Terminal.

Impact123 commented 1 year ago

I found this interesting and figured I'd try to replicate it. I used a basic Debian 11 VM, cloned it, and installed docker/containerd 23.0.6/1.6.21 (latest at this time) on one and 20.10.22/1.6.8 (what HAOS 9.5 uses) on the other via the new --version parameter of Docker's installation script. I used apt install containerd.io=1.6.8-1 to downgrade containerd and rebooted. I then used the docker run command from the docs and let the VMs idle for a bit. I didn't complete the onboarding, just visited to check if it was up. I used Proxmox's metric server output to gather the stats. You can also see a snapshot of it here.

Maybe my testing is too shallow and flawed, but I can't reproduce this behavior outside of HAOS. At least it doesn't seem to be a general docker/containerd issue. I can't really find any recent CPU-related complaints in the moby/docker/containerd repos either.

image

gfn256 commented 1 year ago

Having now measured my power consumption - I can confirm increased wattage from HAOS 9.5 to 10.1 of more than 2 watts. (10-11w on 9.5, 13w on 10.1). I know 2w isn't that much (approx. US$2.88 per year locally), but what concerns me more is the added "wear" on hardware of more than 18% in energy. I hope we get a solution, other than going back to "old" system!

NOTE: I believe I saw a further rise in CPU usage after updating recently to latest HA core 2023.5.2. (Above figures are after this update).
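As a rough check of the cost figure above, a constant extra load can be converted to yearly energy and an implied electricity price. This is only a sketch using the two numbers quoted in the comment (2 W and US$2.88 per year); the function names are illustrative, and it assumes the extra load is constant all year.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(extra_watts):
    """Energy used in one year by a constant extra load, in kWh."""
    return extra_watts * HOURS_PER_YEAR / 1000


def implied_price_per_kwh(annual_cost, extra_watts):
    """Electricity price implied by a yearly cost for a constant extra load."""
    return annual_cost / annual_kwh(extra_watts)


extra_kwh = annual_kwh(2)                 # 2 W extra, running all year
price = implied_price_per_kwh(2.88, 2)    # US$2.88 per year, as quoted
print(f"{extra_kwh:.2f} kWh/year, about ${price:.3f} per kWh")
```

So 2 W of extra draw is about 17.5 kWh per year, and the quoted cost corresponds to roughly US$0.16 per kWh.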

elmr91 commented 1 year ago

It may be worth trying a cross-test (if possible):

But I don't know how to proceed.

agners commented 1 year ago

@Impact123 these are very interesting findings! I guess this could mean one of three things:

a) Latest docker/containerd 23.0.6/1.6.21 fixes the problem
b) The docker/containerd build config in HAOS is different
c) It is related to some other environment

I'd say c) is most likely, and there are lots of candidates: the Go version used, the glibc version, or the Linux kernel are the probable ones. To rule out a), I can bump to this latest version relatively easily in HAOS 11 dev builds; I'll tackle that today. Then we can test if that fixes the problem. If not, we know it must be caused by something other than containerd itself.

@elmr91 building HAOS from scratch is documented on the developer website. Of course it requires manually adjusting Docker/containerd package versions. I'll tackle that at one point, but currently the Bluetooth issues have higher priority.

elmr91 commented 1 year ago

@Impact123, could you also run the same test on your VM with the docker/containerd version used by HAOS 10? (It seems you compared the Docker version from HAOS 9.5 to the latest.) I think you should not see any difference, which would indicate that the behavior we observe is linked to containerd's environment.

dsolva commented 1 year ago

NOTE: I believe I saw a further rise in CPU usage after updating recently to latest HA core 2023.5.2. (Above figures are after this update).

I know it might be off topic but I experienced the same. Now I have an extra 10% from 2023.5.2.

elmr91 commented 1 year ago

I didn't experience the same with HAOS 9.5 + 2023.5.2 (no noticeable CPU difference compared to last month, even though I also added a few devices during this time).

image

Impact123 commented 1 year ago

@elmr91 Here's the same test with HA just idling on the onboarding page, but with 23.0.6/1.6.21 and 23.0.3/1.6.20.

image

root@Debian:~# docker ps
CONTAINER ID   IMAGE                                          COMMAND   CREATED             STATUS             PORTS     NAMES
20da05ac337d   ghcr.io/home-assistant/home-assistant:stable   "/init"   About an hour ago   Up About an hour             homeassistant

root@Debiantemp2:~# docker ps
CONTAINER ID   IMAGE                                          COMMAND   CREATED             STATUS             PORTS     NAMES
44de036f8143   ghcr.io/home-assistant/home-assistant:stable   "/init"   About an hour ago   Up About an hour             homeassistant
root@Debian:~# hostname; docker -v; containerd -v
Debian
Docker version 23.0.6, build ef23cbc
containerd containerd.io 1.6.21 3dce8eb055cbb6872793272b4f20ed16117344f8

root@Debiantemp2:~# hostname; docker -v; containerd -v
Debiantemp2
Docker version 23.0.3, build 3e7cbfd
containerd containerd.io 1.6.20 2806fc1057397dbaeefbea0e4e17bddfbd388f38

For reference

# ha os info | grep version:; docker -v; containerd -v
version: "10.0"
Docker version 23.0.3, build 23.0.3
containerd github.com/containerd/containerd 1.6.20

If I find some time I'll check if I'm able to grasp how to build HAOS with specific versions.

lepgithub commented 1 year ago

I'll add that I also noticed a significant increase in cpu usage from ~3% to ~9% when I moved from HAOS 9.5 to 10.0 or 10.1. Running on Proxmox 7.3-6 (i7-6700T). It definitely appears to be an issue with HAOS 10.0/10.1. And a few watts of increase do add up when you are paying $0.82 per kWh summer peak (Thanks SDG&E).

agners commented 1 year ago

The latest dev builds use Docker 23.0.6 along with updated runc/containerd components. I don't expect it to change anything, but maybe worth a try: https://os-builds.home-assistant.io/11.0.dev20230509/

Impact123 commented 1 year ago

Here are the 9.5, 10.1 and 11.0-dev20230509 VMs idling. These are completely fresh installs created/imported from the ova without any OS modifications. I didn't even visit the web interface this time.

15m

image

30m

image

60m

The dip comes from activity on another VM image

Same data but from Proxmox's interface (hour average) in case you don't trust my stats

#### 9.5

![image](https://github.com/home-assistant/operating-system/assets/899193/15d534b6-72f2-422d-a025-b2bc45081b44)

#### 10.1

![image](https://github.com/home-assistant/operating-system/assets/899193/112a80c9-3e31-47f8-97ad-c874eb050390)

#### 11.0-dev20230509

![image](https://github.com/home-assistant/operating-system/assets/899193/4d9aae04-5b75-4455-9fe6-927cfb45cc08)

VM details in case it's important

Imported via `qm disk import VMID haos_ova-OSVERSION.qcow2 vmstore` from `haos_ova-OSVERSION.qcow2.xz` file(s).

```
# 9.5
agent: 1,fstrim_cloned_disks=1
balloon: 1024
bios: ovmf
boot: order=scsi0;ide2
cores: 2
cpu: host
efidisk0: vmstore:vm-113-disk-0,efitype=4m,size=4M
ide2: none,media=cdrom
memory: 2048
meta: creation-qemu=7.2.0,ctime=1683737865
name: haos95
net0: virtio=62:F0:F5:14:28:B7,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
parent: Base
scsi0: vmstore:vm-113-disk-1,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=4ddc9496-9fc7-4ae5-8211-b064dab0dc25
sockets: 1
tablet: 0
vmgenid: ddbecad9-f0cd-4728-a06b-af55d2f464bb

# 10.1
agent: 1,fstrim_cloned_disks=1
balloon: 1024
bios: ovmf
boot: order=scsi0;ide2
cores: 2
cpu: host
efidisk0: vmstore:vm-114-disk-0,efitype=4m,size=4M
ide2: none,media=cdrom
memory: 2048
meta: creation-qemu=7.2.0,ctime=1683737865
name: haos101
net0: virtio=96:4E:DA:38:2E:6C,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
parent: Base
scsi0: vmstore:vm-114-disk-1,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=1fdf6db6-c3ab-4e72-885a-fb22843686ce
sockets: 1
tablet: 0
vmgenid: 8109d26b-8620-4d4e-9a02-af346381e0f7

# 11.0
agent: 1,fstrim_cloned_disks=1
balloon: 1024
bios: ovmf
boot: order=scsi0;ide2
cores: 2
cpu: host
efidisk0: vmstore:vm-115-disk-2,efitype=4m,size=4M
ide2: none,media=cdrom
memory: 2048
meta: creation-qemu=7.2.0,ctime=1683737865
name: haos11
net0: virtio=9E:9F:79:B6:10:2B,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
parent: Base
scsi0: vmstore:vm-115-disk-1,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=8f765d7b-94a3-4891-a46b-ffc41a1e96fc
sockets: 1
tablet: 0
vmgenid: 8d2d465f-cb30-4a35-9efa-ba70826f540a
```

Less than 50% of the host's memory is used so ballooning shouldn't matter here.

gfn256 commented 1 year ago

Have another interesting observation to report:

Since updating my Proxmox VE a number of days ago, which included a kernel-update, I have noted an average temp DECREASE of about 2 degrees Celsius. Good news! I am currently only running an HAOS 10.1 VM.

Can anyone else confirm something like this?

This is my current Proxmox VE details:

Kernel Version Linux 5.15.107-2-pve #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z)
PVE Manager Version pve-manager/7.4-3/9002ab8a

yildiraymeric commented 1 year ago

I have an Intel(R) Atom(TM) x5-Z8500 CPU @ 1.44GHz system and I see approx. 2 watts more power usage after upgrading from 9.5 to 10.1. Here is a screenshot of my power meter readings:

HA-power-problem

dsolva commented 1 year ago

Anyone tested OS 10.2?

evilspoons commented 1 year ago

Going from 10.1 to 10.2 on my Proxmox VE setup, memory usage plummeted (4.5 GB to 1.6 GB) but CPU usage increased another ~36% (2.8% to 3.8%). This is of course on top of the +45% I had on 9.5->10.0 (2.4% cpu average to 3.5% cpu average), which means CPU usage has basically doubled. (1.36*1.45 = 197% of original CPU load).

The white line in the first image is when the upgrade ran from 10.1 to 10.2.

cpu memory usage
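The percentages in the comment above can be re-derived from the quoted averages. The sketch below uses only those four figures; the helper name is illustrative. Note that multiplying the two steps assumes 10.1 was still running at the post-10.0 level (3.5%), while the quoted 10.1 average is 2.8%, so the direct 9.5 to 10.2 ratio comes out lower than the compounded one. The averages may cover different periods and workloads, so neither figure is definitive.

```python
def ratio(before, after):
    """How many times larger `after` is than `before`."""
    return after / before


step_9_5_to_10_0 = ratio(2.4, 3.5)    # about 1.46
step_10_1_to_10_2 = ratio(2.8, 3.8)   # about 1.36
compounded = step_9_5_to_10_0 * step_10_1_to_10_2
direct = ratio(2.4, 3.8)              # 9.5 average vs. 10.2 average

print(f"compounded: {compounded:.2f}x, direct: {direct:.2f}x")
```

With these inputs the compounded figure is about 1.98x and the direct one about 1.58x.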

lyricnz commented 1 year ago

Bump. This is still an issue, and doesn't seem to be getting any attention from HAOS devs. How can we help?

andrazek commented 1 year ago

Low power consumption should always be top priority. Is this issue related to only a few HW configurations? I use a PN41 with an N6000 (passively cooled) and definitely want to keep resources available for other processes, so HA efficiency is very important. Has anyone noticed a similar increase with their RPi?

geobogb commented 1 year ago

Has anyone noticed a similar increase with their RPi?

Not with RPi but with HA in VirtualBox on a Linux/Debian host. Still on 9.5 since 10.0 was more or less unusable due to high CPU usage on the host. Went from single digit CPU to around 30 % CPU.

WouterJN commented 1 year ago

Can we assist in any way to troubleshoot this issue?

andrazek commented 1 year ago

Not with RPi but with HA in VirtualBox on a Linux/Debian host. Still on 9.5 since 10.0 was more or less unusable due to high CPU usage on the host. Went from single digit CPU to around 30 % CPU.

So this has something to do with virtualization? For native installation no CPU increase?

tyjtyj commented 1 year ago

Not with RPi but with HA in VirtualBox on a Linux/Debian host. Still on 9.5 since 10.0 was more or less unusable due to high CPU usage on the host. Went from single digit CPU to around 30 % CPU.

So this has something to do with virtualization? For native installation no CPU increase?

Nope. It's affecting native rpi 3B and 4B.

lyricnz commented 1 year ago

FWIW I reproduced the experiments from @Impact123 https://github.com/home-assistant/operating-system/issues/2476#issuecomment-1542574092 and had exactly the same results. Even with brand-new installs, at the onboarding screen on each:

9.5

Screenshot 2023-06-08 at 11 29 03 am

10.2

Screenshot 2023-06-08 at 11 29 12 am

dev

Screenshot 2023-06-08 at 11 29 20 am
lyricnz commented 1 year ago

FWIW 9.5 is kernel 5.15.90, while 10.0 is 6.1.24 (and current dev HAOS is 6.1.32)

evilspoons commented 1 year ago

Not with RPi but with HA in VirtualBox on a Linux/Debian host. Still on 9.5 since 10.0 was more or less unusable due to high CPU usage on the host. Went from single digit CPU to around 30 % CPU.

So this has something to do with virtualization? For native installation no CPU increase?

No, it's affecting everything. It's just easier to show results with virtualized installs because they give you nice statistics. 'Containerd' inside HAOS is for running the internal service containers that are part of HAOS (most obviously the add-ons, like esphome), whether you're on bare metal or on a virtualized install.

dsolva commented 1 year ago

Is there any way we can help with this? 9.5 is getting old and I can't use the network storage feature 🫠

wokkeltje13 commented 1 year ago

The upcoming 10.3 again makes no difference to the increased CPU usage. I think we need to live with it :-(

andrazek commented 1 year ago

Nope, that should not be the case. It needs to be resolved somehow. From my experience, I can tell that upgrading the Ubuntu kernel from 5.15 also worsened idle performance, leading to +5°C, but at least I was able to resolve this by masking a specific GPE interrupt, and the kworker load finally went back down from 5% to 0% at idle. Is there a way to run the latest HAOS version with the old kernel?

WouterJN commented 1 year ago

The upcoming 10.3 again makes no difference to the increased CPU usage. I think we need to live with it :-(

For me 10.3 also didn't solve it.

elmr91 commented 1 year ago

It may be an incompatibility between the kernel build options in 10.x and what Docker relies on. It is possible that some kernel facility was compiled into kernel 5.15 (in HAOS 9.x) but is missing in the new kernel tree (HAOS 10.x), so Docker may be using another mechanism (eating more CPU) to work around the missing kernel/libc feature.
Only a guess...
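One way to test this guess would be to compare the kernel configs of a 9.5 and a 10.x system and look at which options changed. The sketch below is a minimal, hypothetical helper (function names are made up) that diffs the text of two kernel `.config` files. How the two config texts are obtained is left open: reading `/proc/config.gz` only works if the kernel was built with `CONFIG_IKCONFIG_PROC`, and whether the HAOS kernels enable that is an assumption here.

```python
def parse_kconfig(text):
    """Map option name to value for the text of one kernel .config file."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Disabled options are written as comments.
            options[line[2:-len(" is not set")]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kconfig(old_text, new_text):
    """Return {option: (old, new)} for options whose value changed."""
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    changed = {}
    for name in sorted(set(old) | set(new)):
        # An option missing from a file is treated like "is not set".
        before, after = old.get(name, "n"), new.get(name, "n")
        if before != after:
            changed[name] = (before, after)
    return changed
```

The result will include many unrelated changes between 5.15 and 6.1, so it would still need to be filtered by hand for options that containerd or Docker plausibly depend on.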

Christophe999s commented 1 year ago

I've spun up a Debian 11 VM and installed HA Supervised, and the CPU usage is the same as with OS 9.5. So no increase in CPU usage like in OS 10.

Linux 5.10.0-23-amd64 x86_64
Docker version 24.0.2, build cb74dfc
containerd containerd.io 1.6.21 3dce8eb055cbb6872793272b4f20ed16117344f8

Everything was installed following the instructions on https://github.com/home-assistant/supervised-installer

gfn256 commented 1 year ago

10.3 definitely doesn't fix it! I also just updated Proxmox to version 8, and this also didn't fix it. In fact it may be even worse! (~ +1°C warmer).

N3rdix commented 1 year ago

Interestingly, in contrast to other user reports here, I can't see any CPU increase on a fresh installation on a RPI4-64. I started with a new 9.5 installation and then updated to 10.3 (red line)...

image

I'll restore a backup with all integrations I use and see if this increases the CPU usage that much

dsolva commented 1 year ago

10.3 definitely doesn't fix it! I also just updated Proxmox to version 8, and this also didn't fix it. In fact it may be even worse! (~ +1°C warmer).

I also experienced this. However, it's a different issue. Before updating to OS 10, I moved from VE ~7.1 to 7.4 and saw a huge decrease in power consumption. Then came OS 10, and I saw that improvement vanish with the containerd problem. Now, after updating to VE 8, consumption increased by the same proportion as it had decreased from 7.1 to 7.4. I guess whatever efficiency 7.4 brought was lost with 8.

Off topic, but what I observe is that Proxmox reports higher CPU usage for the VM than what HA says. I suppose that extra CPU comes from virtualization inefficiencies. Something related to the kernel or QEMU in Proxmox just gives me an extra 20% CPU usage on the HA VM.

Both issues are driving me crazy, both accounting for up to 2W (+40%) power consumption. I guess I will start moving my addons to my docker LXC and eventually move to HA container.

For others experiencing a CPU increase: keep in mind that VE 8 might play a role, and use that knowledge to isolate the performance drop that is due to OS 10.x alone. With VE 8, the CPU increase reported by Proxmox has remained for me, while the CPU usage reported inside HA increased for a few days and then returned to normal.

N3rdix commented 1 year ago

In the meantime I updated to 10.3 in an isolated network (meaning all integrations/add-ons are active but there is no communication with network devices) and everything looks normal so far. No CPU increase, running pretty stable at ~4% as before the update.

I am confident enough to update my productive Pi4 as well 👍

elmr91 commented 1 year ago

Staying on HAOS 9.5 in a Proxmox VM, I upgraded the host from Proxmox 7.4 to 8.0. I also noticed (as @dsolva did in their comment) a CPU/power consumption increase from a constant 7W to 8-9W, with frequent spikes to 20W (several per minute).

The CPU spikes were not related to the HAOS VM but to the Proxmox host itself. I managed to tune the Proxmox VE 8.0 kernel to lower CPU usage and bring power consumption down to 5-6W (using powertop / disabling the power tuning on the GPU that was causing the spikes).

image

This is not directly related to the containerd problem on HAOS 10. But it shows how a kernel change may directly impact CPU/power consumption.

Next step is to retry upgrade on VE 8 from HAOS 9.5 to 10.3 and monitor CPU/power consumption.

atv2016 commented 1 year ago

I am also seeing the same issue, not sure when it started but CPU usage went significantly up. I thought it might have something to do with the upgrade of Ubuntu or the VMware player, but after reading this, docker looks to be the culprit.

On a side note, why not just use Podman? Wouldn't that have less overhead and be quicker as well?

aborrajo commented 1 year ago

I have the same problem with HAOS 10.x on a proxmox 7.4 virtual machine. With HAOS 9.5 I have an average CPU usage of 2-2.5%, while if I upgrade to HAOS 10.3 for example, the CPU usage goes up to 5-6%. It seems that none of the HAOS 10.x updates are taking into account this problem that we have been having for some months now. I would like to upgrade to HAOS 10.x someday.

N3rdix commented 1 year ago

This doesn't seem to be a general issue but rather a hardware/virtualization-specific one. I read a lot of "Proxmox" and "VM". I haven't had any issue on two Raspberry Pi 4s with 10.3 since the update a few days back (this thread initially made me skeptical and made me wait that long before I finally upgraded).

mundschenk-at commented 1 year ago

The issue was definitely noticeable on Yellow/CM4 as well.