home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

2021.12.3 causes high CPU and Memory Load #62309

Closed criticallimit closed 2 years ago

criticallimit commented 2 years ago

The problem

After updating to 2021.12.3, my CPU and memory load is extremely high. Normally CPU is at 0.5-3% and memory at 1-5%. After the update, CPU is at least 25% and memory keeps climbing until it reaches 100%. Once the limit is reached, HA is no longer reachable and has to be restarted. It takes 2-10 hours for memory usage to climb that far.

I'm on a Raspberry Pi 4 using ZHA and Z-Wave.

The same happens on 2021.12.4: memory usage grows to 100%, at which point the system is unreachable and does not react to switches or automations.

What version of Home Assistant Core has the issue?

2021.12.3 and 2021.12.4

What was the last working version of Home Assistant Core?

2021.11.?

What type of installation are you running?

Home Assistant OS

Integration causing the issue

?

Link to integration documentation on our website

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

nothing in log

Additional information

No response

fi-sch commented 2 years ago

Same here. I noticed that my instance is restarting itself every day. I have been using HA for about 2 years and nothing like that has happened before. Also, there is nothing in the logs (both HA Core and Supervisor).

pinsdorf commented 2 years ago

Hi @fi-sch (pun intended 😀)

Did you try to increase the size of your swap file as per this article?

I've had the same problem on an HA 2021.11.* installation. After increasing the swap file, my memory consumption is still high but stays under a critical threshold (it maxes out at approx. 85% memory use on a Raspi 3B), and my HA has not crashed since this change.

Before, HA core crashed presumably due to too high memory demand while the OS kept running. I could still ping and ssh into the machine, but not reach it via browser or mobile app. Also automations were not executed, recorder did not write values, etc.

Having said that, I'd love to understand better where the memory is being consumed and how to improve it.
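As a general illustration of how memory growth in a Python process can be attributed to source lines, here is a minimal sketch using the standard-library `tracemalloc` module. It is a standalone script, not something that attaches to a running Home Assistant instance, and `top_allocation_growth` and `leaky_workload` are hypothetical names invented for this example.

```python
import tracemalloc


def top_allocation_growth(workload, limit=5):
    """Run `workload` and return its result plus the lines whose allocations grew most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = workload()  # the result is returned, so its memory stays alive
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() returns StatisticDiff entries, largest absolute size change first.
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]


def leaky_workload():
    # Hypothetical stand-in for code that keeps data around instead of releasing it.
    return [bytes(1024) for _ in range(2000)]


if __name__ == "__main__":
    _, growth = top_allocation_growth(leaky_workload)
    for stat in growth:
        print(stat)
```

Each printed entry names a file and line together with the size and count difference between the two snapshots, which is usually enough to see which code path is holding on to memory.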

fi-sch commented 2 years ago

I did not increase swap, nor did I apply any other hack to HAOS. I don't think that swap or RAM itself should be an issue. I have dedicated 8GB of RAM to the VM with HAOS, which is plenty. Right now 50% is consumed by HAOS and all running containers, so there is still 4GB free. Swap usage is 19% out of 2GB. So I don't see the point of making any adjustments here.

Anyways, thanks for your suggestion :)

bdraco commented 2 years ago

Please post a py-spy recording and callgrind.out.X file from the Profiler integration

fi-sch commented 2 years ago

Should I exec into homeassistant container and install py-spy there?

bdraco commented 2 years ago

Yes

fi-sch commented 2 years ago

Thanks. Here you go: recording.zip

What I need to mention is that in my case the CPU usage is more or less normal right now. It just rises over time to the point where the CPU temperature reaches 100°C under high CPU load. Then the HA container appears to restart itself. That happened today and yesterday. I am running HAOS as a VM under PVE on an Intel NUC8 with an i3 CPU; 2 out of 4 cores and 8GB of RAM are assigned to the VM.

bdraco commented 2 years ago

There wasn't anything too interesting in the data so I suspect you'll have to do it again when the issue is happening

You do have an ONVIF camera that is trying to set up over and over, which is using about 15% of the event loop time. I'm kinda surprised how much CPU time it uses to retry setup. It's not the issue, though.

[Image: Screen Shot 2021-12-30 at 10 06 40]

fi-sch commented 2 years ago

Thanks very much for checking. I will set up an automation that runs the Profiler when the issue arises. Today, for example, I got a notification about high CPU temperature, but I was not quick enough to run glances to check the running processes and containers.
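The trigger condition for such an automation could look roughly like the sketch below: start profiling only when usage is both high and still rising, so a short spike does not fire it. This is plain Python for illustration, not Home Assistant automation syntax; the function names, the 80% threshold and the minimum slope are all assumptions chosen for the example.

```python
from statistics import mean


def growth_per_sample(samples):
    """Least-squares slope of usage (in percent) per sampling interval."""
    n = len(samples)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def should_start_profiler(samples, usage_threshold=80.0, min_slope=0.5):
    """Return True when the latest sample is high and the trend is still upward."""
    if not samples:
        return False
    return samples[-1] >= usage_threshold and growth_per_sample(samples) >= min_slope


if __name__ == "__main__":
    # Hypothetical readings taken at a fixed interval (percent of memory or CPU).
    print(should_start_profiler([50, 60, 70, 85]))
    print(should_start_profiler([85, 85, 85, 85]))
```

Requiring an upward trend in addition to a high reading means the recording is more likely to cover the period in which usage is actually growing.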

Good catch about the camera. I will disable the integration until I fix that device.

criticallimit commented 2 years ago

No matter how big the swap file is, the problem still exists. Memory is completely used up over time, and HA is then neither accessible nor doing anything. After switching it off and on again, everything works for a few hours until memory is full again. It is very frustrating to reset HA every day.

fi-sch commented 2 years ago

If you still need help and are able to catch/reproduce this issue, then please provide some helpful info as instructed above.

criticallimit commented 2 years ago

I would love to add information, but there are no warnings or errors. Memory usage just goes up when I switch a light or when an automation starts. Switching the light off, or the automation finishing, does not release the memory. Over time, memory usage reaches 100% and HA crashes.

The only error I have is "unknown cluster 61184" at startup. I don't know how to find out what causes this warning.

There have been no hardware changes for 4 months, and no automations were changed or integrations added. I can't remember exactly, but since 2021.12.x I have had to restart every few hours.

fi-sch commented 2 years ago

@criticallimit, I meant this: https://github.com/home-assistant/core/issues/62309#issuecomment-1003125176

You can exec into HA container using this command: docker exec -it homeassistant /bin/bash

To use terminal on HAOS, you can install 'SSH & Terminal' add-on.
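Once inside the container with py-spy installed, a recording is started with `py-spy record`. The sketch below only assembles that command line so the options are easy to see; the helper name, the PID value 63, the output file name and the 120-second duration are assumptions for illustration, and the real PID of the Home Assistant process has to be looked up first.

```python
import shlex


def py_spy_record_command(pid, output="recording.svg", duration_s=120):
    """Build the argument list for a py-spy recording of one process."""
    return [
        "py-spy", "record",
        "--pid", str(pid),               # process to sample
        "--duration", str(duration_s),   # how long to record, in seconds
        "--output", output,              # where to write the flame graph
    ]


if __name__ == "__main__":
    # 63 is a made-up PID; replace it with the PID of the Home Assistant process.
    print(shlex.join(py_spy_record_command(63)))
```

The resulting list can be passed to `subprocess.run`, or the printed line can be pasted into the container shell.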

criticallimit commented 2 years ago

I tried to install py-spy without success and got a lot of error messages. I'm not enough of a pro to handle this, so I will wait and hope for a solution. Thanks a lot for spending time on my problem.

[Image: Screenshot 2]

criticallimit commented 2 years ago

Maybe this can help? [Image: Screenshot 2]

fi-sch commented 2 years ago

Did you try running echo 'manylinux1_compatible = True' > /usr/local/lib/python3.9/site-packages/_manylinux.py before running pip install py-spy? Note 'python3.9' instead of the 'python3.8' in the original instructions.

criticallimit commented 2 years ago

[Image: Screenshot]

adrianandreias commented 2 years ago

I've noticed high CPU usage, but low memory usage and no crash; the only symptom is that the noisy NUC fan runs continuously: https://community.home-assistant.io/t/nuc-fan-spinning-continuously-after-jan-2022-updates/386822

github-actions[bot] commented 2 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.