
Home Assistant runs out of memory and is restarted #69695

Closed: fishcharlie closed this issue 2 years ago

fishcharlie commented 2 years ago

The problem

Since upgrading to 2022.4.0, my Home Assistant instance restarts constantly: roughly every few hours, sometimes every hour.

This continues to occur after updating to 2022.4.1.

Normally I'd provide more information or try to debug this myself, but I don't know where to begin. It looks like the logs get wiped every time it restarts, so I can't view the log entries from just before the restart that might show the cause.

What version of Home Assistant Core has the issue?

core-2022.4.1

What was the last working version of Home Assistant Core?

core-2022.3.6???

What type of installation are you running?

Home Assistant OS

Integration causing the issue

No response

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

Screen Shot 2022-04-08 at 2 03 28 PM

Attached here you can see a picture of my uptime history. Right after I upgraded to 2022.4.0, things started to fail.

I'd be happy to provide more logs and stuff, but I need some guidance on what information to provide and how to provide it.

mib1185 commented 2 years ago

The current home-assistant.log is moved to home-assistant.log.1 at restart, so please have a look at, and provide, the last lines of home-assistant.log.1 after the next unexpected restart occurs.

fishcharlie commented 2 years ago

@mib1185 I don't see any relevant information there.

Screen Shot 2022-04-09 at 10 32 28 AM

So that made me think to look in the supervisor logs. And I found this:

Screen Shot 2022-04-09 at 10 43 35 AM

Maybe this is a problem with Home Assistant OS?? I'm not quite sure what the next steps are here.

mib1185 commented 2 years ago

Both screenshots seem not to belong together (different timestamps) 🤔 Please try tail -20 home-assistant.log.1 instead of cat ... Further, please provide the output of dmesg | tail -20. Last but not least, perform both checks after the next unexpected restart occurs, and also provide the supervisor log for that time frame.
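
For reference, the two checks as commands, run from the Home Assistant config directory (/config on a default HAOS install; the exact path is an assumption about your setup):

tail -20 home-assistant.log.1    # last lines of the rotated core log
dmesg | tail -20                 # recent kernel messages; OOM kills show up here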

fishcharlie commented 2 years ago

@mib1185 Just happened again.

Screenshot of the two commands you wanted me to run:

Screen Shot 2022-04-09 at 11 44 13 AM

Supervisor logs:

Screen Shot 2022-04-09 at 11 44 34 AM

This looks to have happened at 11:43am mountain (local) time.

Screen Shot 2022-04-09 at 11 45 02 AM
mib1185 commented 2 years ago

You encountered an out-of-memory (OOM) condition. Could you please provide the full home-assistant.log.1, and the dmesg output for at least the last 100 lines (instead of 20), after this happens again?

fishcharlie commented 2 years ago

@mib1185 Here you go.

Screen Shot 2022-04-09 at 1 49 18 PM

home-assistant.log.1.zip

dmesg | tail -100
[232106.118623] [    360]     0   360     2377       99    45056      107             0 wpa_supplicant
[232106.118629] [    397]     0   397      531        5    40960       23             0 hciattach
[232106.118635] [    399]     0   399     1884       40    53248       63             0 bluetoothd
[232106.118641] [    431]     0   431   559261     5270   499712     1929             0 dockerd
[232106.118647] [    440]     0   440   368570     2131   241664      694             0 containerd
[232106.118653] [   1276]     0  1276   315435     1381   221184      470             0 docker
[232106.118659] [   1277]     0  1277      829        1    36864       31             0 hassos-cli
[232106.118665] [   1329]     0  1329   381432      448   184320      205             1 containerd-shim
[232106.118671] [   1351]     0  1351       49        0    28672        4             0 s6-svscan
[232106.118677] [   1449]     0  1449       49        0    28672        3             0 s6-supervise
[232106.118683] [   1631]     0  1631       49        0    28672        4             0 s6-supervise
[232106.118689] [   1632]     0  1632       49        0    28672        4             0 s6-supervise
[232106.118695] [   1635]     0  1635     1094      147    49152      397             0 bash
[232106.118701] [   1636]     0  1636    52602    14821   430080     4347             0 python3
[232106.118707] [   1774]     0  1774   381080      405   180224      222             1 containerd-shim
[232106.118713] [   1795]     0  1795       49        0    28672        4             0 s6-svscan
[232106.118719] [   1872]     0  1872       45        0    16384        6             0 foreground
[232106.118725] [   1873]     0  1873       49        0    28672        3             0 s6-supervise
[232106.118731] [   1890]     0  1890       44        0    16384        4             0 foreground
[232106.118737] [   1913]     0  1913   370907     1364   241664      473             0 docker
[232106.118743] [   1976]     0  1976      653        1    40960       86             0 cli.sh
[232106.118749] [   2057]     0  2057      411        0    36864       12             0 sleep
[232106.118755] [   2076]     0  2076   381016      418   176128      183             1 containerd-shim
[232106.118761] [   2096]     0  2096       49        0    28672        5             0 s6-svscan
[232106.118767] [   2174]     0  2174       49        0    28672        3             0 s6-supervise
[232106.118773] [   2335]     0  2335   381144      358   184320      215             1 containerd-shim
[232106.118779] [   2360]     0  2360       49        0    28672        4             0 s6-svscan
[232106.118785] [   2397]     0  2397       49        0    28672        3             0 s6-supervise
[232106.118791] [   2401]     0  2401   180002     2341   122880      534             0 coredns
[232106.118797] [   2486]     0  2486       49        0    28672        3             0 s6-supervise
[232106.118803] [   2653]     0  2653   381432      339   184320      202             1 containerd-shim
[232106.118809] [   2685]     0  2685       49        0    28672        5             0 s6-svscan
[232106.118815] [   2760]     0  2760       49        0    28672        4             0 s6-supervise
[232106.118821] [   2933]     0  2933       49        0    28672        3             0 s6-supervise
[232106.118827] [   2939]     0  2939      218       36    32768        2             0 mdns-repeater
[232106.118833] [   3125]     0  3125       49        0    28672        4             0 s6-supervise
[232106.118839] [   3126]     0  3126       49        0    28672        4             0 s6-supervise
[232106.118845] [   3129]     0  3129    23668      134    90112      544             0 pulseaudio
[232106.118850] [   3130]     0  3130     1080        1    40960      504             0 bash
[232106.118856] [   3155]     0  3155     1081        0    40960      503             0 bash
[232106.118862] [   3156]     0  3156     1256        1    40960       79             0 udevadm
[232106.118868] [   3165]     0  3165      501        1    36864      100             0 rlwrap
[232106.118874] [   3166]     0  3166      427        0    28672       12             0 cat
[232106.118880] [   3178]     0  3178   305487        0   126976      191             0 docker-proxy
[232106.118886] [   3184]     0  3184   287039        0   118784      184             0 docker-proxy
[232106.118892] [   3199]     0  3199   269007        0   118784      190             0 docker-proxy
[232106.118898] [   3205]     0  3205   268943        0   110592      165             0 docker-proxy
[232106.118904] [   3217]     0  3217   287039        0   122880      190             0 docker-proxy
[232106.118910] [   3224]     0  3224   305839        0   135168      196             0 docker-proxy
[232106.118916] [   3238]     0  3238   287103        0   122880      184             0 docker-proxy
[232106.118922] [   3244]     0  3244   287103        0   122880      215             0 docker-proxy
[232106.118928] [   3259]     0  3259   399944      487   188416      219             1 containerd-shim
[232106.118934] [   3278]     0  3278      202        4    28672        7             0 docker-init
[232106.118940] [   3330]     0  3330       49        0    28672        4             0 s6-svscan
[232106.118946] [   3359]     0  3359       49        0    28672        4             0 s6-supervise
[232106.118952] [   3597]     0  3597       49        0    28672        4             0 s6-supervise
[232106.118958] [   3598]     0  3598       49        0    28672        3             0 s6-supervise
[232106.118964] [   3601]     0  3601    10204      156   122880     8661             0 mosquitto
[232106.118970] [   3602]     0  3602     1432        9    36864      173             0 nginx
[232106.118976] [   3633]     0  3633     1448       58    36864      143             0 nginx
[232106.118982] [   3683]     0  3683   305551        0   131072      192             0 docker-proxy
[232106.118988] [   3691]     0  3691   287039        0   122880      181             0 docker-proxy
[232106.118995] [   3705]     0  3705   399816      564   188416      223             1 containerd-shim
[232106.119001] [   3728]     0  3728       49        0    28672       12             0 s6-svscan
[232106.119007] [   3807]     0  3807       49        0    28672        3             0 s6-supervise
[232106.119013] [   4182]     0  4182   399944      503   192512      188             1 containerd-shim
[232106.119019] [   4207]     0  4207       49        0    28672        5             0 s6-svscan
[232106.119025] [   4348]     0  4348       49        0    28672        4             0 s6-supervise
[232106.119031] [   4416]     0  4416       49        0    28672        4             0 s6-supervise
[232106.119037] [   4417]     0  4417       49        0    28672        3             0 s6-supervise
[232106.119043] [   4420]     0  4420     1079       11    40960      113             0 sshd
[232106.119049] [   4422]     0  4422     5383       29    73728     4171             0 ttyd
[232106.119055] [   4582]     0  4582   399944      465   192512      206             1 containerd-shim
[232106.119062] [   4629]     0  4629       49        0    28672        4             0 s6-svscan
[232106.119068] [   4722]     0  4722       49        0    28672        3             0 s6-supervise
[232106.119074] [   5187]     0  5187       49        0    28672        3             0 s6-supervise
[232106.119080] [   5190]     0  5190    73965     2674   729088     8089             0 node
[232106.119086] [   5453]     0  5453       49        0    28672        3             0 s6-supervise
[232106.119092] [   5458]     0  5458   172515        0   462848     4851             0 node
[232106.119099] [   5595]     0  5595   156156      143   491520     5195             0 node
[232106.119105] [ 263540]     0 263540   305903        0   139264      191             0 docker-proxy
[232106.119111] [ 263547]     0 263547   323999        0   139264      202             0 docker-proxy
[232106.119117] [ 263564]     0 263564   381432      375   184320      209             1 containerd-shim
[232106.119123] [ 263582]     0 263582       49        0    28672        4             0 s6-svscan
[232106.119129] [ 263707]     0 263707       49        0    28672        3             0 s6-supervise
[232106.119135] [ 263849]     0 263849       49        0    28672        3             0 s6-supervise
[232106.119141] [ 263852]     0 263852   177621      652    86016      212             0 observer
[232106.119148] [ 435164]     0 435164    19341      126   135168       70          -250 systemd-journal
[232106.119154] [ 435181]     0 435181     2907      108    49152       76             0 systemd-logind
[232106.119160] [ 439613]     0 439613     1157       60    36864      162             0 sshd
[232106.119166] [ 439615]     0 439615      976      124    32768      298             0 bash
[232106.119172] [ 449781]     0 449781   381096      449   184320      158             1 containerd-shim
[232106.119178] [ 449801]     0 449801       49        0    28672        6             0 s6-svscan
[232106.119184] [ 449846]     0 449846       49        0    28672        4             0 s6-supervise
[232106.119190] [ 449995]     0 449995       49        0    28672        3             0 s6-supervise
[232106.119196] [ 449998]     0 449998   290696   134506  2146304     7459             0 python3
[232106.119206] [ 456484]     0 456484      411        1    32768        0             0 sleep
[232106.119213] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=ffda2eafd2e191ea4515316f4e1275ad612e927ef82bed1bcc26c5ff50a7d64e,mems_allowed=0,global_oom,task_memcg=/docker/827ab3124f6620d564ebbb1b52ecebd8fb1d8f77901ddfede1e7ab531dbd8cf5,task=python3,pid=449998,uid=0
[232106.119319] Out of memory: Killed process 449998 (python3) total-vm:1162784kB, anon-rss:535256kB, file-rss:2768kB, shmem-rss:0kB, UID:0 pgtables:2096kB oom_score_adj:0
[232106.264600] oom_reaper: reaped process 449998 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
balloob commented 2 years ago

This issue has been confirmed in multiple places. We want to track progress in this issue.

To find out what is causing this issue we need to know what is going on inside the Python process. There are two approaches: Py-Spy is preferred, but our Profiler integration can be set up via the UI.
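
For the Py-Spy route, a minimal sketch; the pgrep pattern is an assumption about how the core process is named, so substitute the real PID of your Home Assistant Python process if it differs:

# Record a two-minute flame graph of the running core process.
HA_PID=$(pgrep -f "python3 -m homeassistant" | head -n 1)
py-spy record --pid "$HA_PID" --duration 120 --output py-spy.svg
# For a one-shot stack trace instead: py-spy dump --pid "$HA_PID"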

This issue is/was also discussed at #69728, on the forums, and on Reddit

Update: please drag files into the comment field so they are attached as files instead of pasting their content

woopsicle commented 2 years ago

Thanks!

I have attached the heap profile log, the profiler.start output, and a ~15 min sample of running start_log_objects below.

Let me know what profile info will be useful. I am having trouble installing py-spy (heaps of pip install errors).

Archive.zip


2022-04-10 14:16:53 CRITICAL (SyncWorker_1) [homeassistant.components.profiler] Memory Growth: [('builtin_function_or_method', 11983, 612), ('weakref', 26478, 480), ('tuple', 89646, 6), ('InstanceState', 977, 3), ('function', 99424, 2), ('set', 7455, 1), ('LogEntry', 6, 1), ('Script', 87, 1), ('States', 440, 1), ('Events', 440, 1), ('StateAttributes', 96, 1)]
2022-04-10 14:17:23 CRITICAL (SyncWorker_1) [homeassistant.components.profiler] Memory Growth: [('weakref', 26583, 105), ('dict', 99281, 10), ('tuple', 89654, 8), ('method', 6194, 7), ('SimpleCookie', 81, 2), ('RequestHandler', 43, 2), ('HttpRequestParser', 39, 2), ('AccessLogger', 43, 2), ('Response', 4, 2), ('deque', 901, 1), ('socket', 126, 1), ('SelectorKey', 103, 1), ('Handle', 106, 1), ('KeyedRef', 163, 1), ('TransportSocket', 101, 1), ('_SelectorSocketTransport', 86, 1), ('TimerContext', 38, 1)]
2022-04-10 14:17:53 CRITICAL (SyncWorker_13) [homeassistant.components.profiler] Memory Growth: [('tuple', 89680, 26), ('builtin_function_or_method', 11990, 7), ('cell', 59111, 6), ('list', 45318, 5), ('function', 99429, 5), ('deque', 904, 3), ('TimerContext', 40, 2), ('StreamReader', 15, 2), ('Handle', 107, 1), ('Condition', 103, 1), ('FileIO', 12, 1), ('BufferedReader', 6, 1), ('ResponseHandler', 40, 1), ('HttpResponseParser', 40, 1), ('PayloadAccessError', 3, 1)]
2022-04-10 14:18:23 CRITICAL (SyncWorker_9) [homeassistant.components.profiler] Memory Growth: [('weakref', 26602, 19), ('DNSNsec', 16, 3), ('DNSAddress', 12, 2), ('State', 776, 1)]
2022-04-10 14:18:53 CRITICAL (SyncWorker_10) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:19:23 CRITICAL (SyncWorker_2) [homeassistant.components.profiler] Memory Growth: [('weakref', 26606, 4), ('function', 99430, 1)]
2022-04-10 14:19:53 CRITICAL (SyncWorker_8) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:20:23 CRITICAL (SyncWorker_8) [homeassistant.components.profiler] Memory Growth: [('function', 99434, 4), ('WeakKeyDictionary', 132, 4), ('DNSPointer', 46, 4), ('set', 7456, 1), ('SessionTransaction', 1, 1)]
2022-04-10 14:20:53 CRITICAL (SyncWorker_14) [homeassistant.components.profiler] Memory Growth: [('cell', 59163, 52), ('function', 99452, 18), ('_FilterableJob', 174, 9), ('HassJob', 1611, 9), ('Queue', 10, 1), ('FlowControlDataQueue', 68, 1), ('WebSocketWriter', 68, 1), ('ActiveConnection', 7, 1), ('WebSocketHandler', 7, 1), ('WebSocketAdapter', 7, 1), ('WebSocketResponse', 37, 1)]
2022-04-10 14:21:23 CRITICAL (SyncWorker_0) [homeassistant.components.profiler] Memory Growth: [('set', 7468, 12), ('DNSQuestion', 16, 11), ('function', 99454, 2), ('cell', 59165, 2)]
2022-04-10 14:21:53 CRITICAL (SyncWorker_1) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:22:23 CRITICAL (SyncWorker_10) [homeassistant.components.profiler] Memory Growth: [('tuple', 89684, 4), ('function', 99456, 2), ('cell', 59167, 2), ('deque', 905, 1), ('DNSAddress', 13, 1), ('DNSNsec', 17, 1)]
2022-04-10 14:22:53 CRITICAL (SyncWorker_0) [homeassistant.components.profiler] Memory Growth: [('tuple', 89689, 5), ('deque', 908, 3), ('Packet', 42, 1), ('ConnectionKey', 36, 1), ('ResponseHandler', 41, 1)]
2022-04-10 14:23:23 CRITICAL (SyncWorker_1) [homeassistant.components.profiler] Memory Growth: [('function', 99458, 2), ('cell', 59169, 2)]
2022-04-10 14:23:53 CRITICAL (SyncWorker_12) [homeassistant.components.profiler] Memory Growth: [('ReadOnlyDict', 1849, 111), ('dict', 99290, 9), ('tuple', 89693, 4), ('ReceiveMessage', 957, 3), ('function', 99460, 2), ('cell', 59171, 2), ('deque', 909, 1), ('socket', 127, 1), ('TransportSocket', 102, 1), ('CIMultiDict', 160, 1), ('_SelectorSocketTransport', 87, 1)]
2022-04-10 14:24:23 CRITICAL (SyncWorker_16) [homeassistant.components.profiler] Memory Growth: [('hamt', 79, 1), ('hamt_bitmap_node', 80, 1)]
2022-04-10 14:24:53 CRITICAL (SyncWorker_4) [homeassistant.components.profiler] Memory Growth: [('function', 99462, 2), ('cell', 59173, 2)]
2022-04-10 14:25:23 CRITICAL (SyncWorker_14) [homeassistant.components.profiler] Memory Growth: [('weakref', 26649, 43)]
2022-04-10 14:25:53 CRITICAL (SyncWorker_16) [homeassistant.components.profiler] Memory Growth: [('function', 99464, 2), ('cell', 59175, 2)]
2022-04-10 14:26:23 CRITICAL (SyncWorker_16) [homeassistant.components.profiler] Memory Growth: [('Part', 248, 4), ('weakref', 26650, 1)]
2022-04-10 14:26:53 CRITICAL (SyncWorker_10) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:27:23 CRITICAL (SyncWorker_1) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:27:53 CRITICAL (SyncWorker_5) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:28:23 CRITICAL (SyncWorker_5) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:28:53 CRITICAL (SyncWorker_13) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:29:23 CRITICAL (SyncWorker_6) [homeassistant.components.profiler] Memory Growth: []
2022-04-10 14:29:53 CRITICAL (SyncWorker_5) [homeassistant.components.profiler] Memory Growth: [('State', 783, 7)]
2022-04-10 14:30:23 CRITICAL (SyncWorker_13) [homeassistant.components.profiler] Memory Growth: [('State', 797, 14)]
balloob commented 2 years ago

@woopsicle could you update your comment by removing the pasted file contents and instead dragging the files in so they become uploads? It will keep the thread readable, thanks

silviudc commented 2 years ago

home-assistant.log Please see the attached log.1 file. My issue is that HA eventually hangs (or maybe just the supervisor does) and I have to power off/on to get everything working again. It happens 1-2 times a day at random times, and it only started in 2022.4.

malosaa commented 2 years ago

It's weird: on my main system, CPU usage is between 4-15% with a lot of add-ons.

But on my test system, with far fewer add-ons, it's between 30-50% and it runs out of memory in the console. I don't know why.

Both run Hass OS. My test system is running 2022.4.0 and my main system is on 2022.4.1.

bdraco commented 2 years ago

If you have trouble installing py-spy, you can download it from here https://github.com/benfred/py-spy/releases/tag/v0.3.11

The .whl files can be extracted with the unzip command.
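
For example (a sketch; the wheel filename below is a placeholder, so pick the asset that matches your CPU architecture from the release page):

# A .whl is just a zip archive, so unzip can unpack it.
wget https://github.com/benfred/py-spy/releases/download/v0.3.11/py_spy-0.3.11-<platform>.whl
unzip py_spy-0.3.11-<platform>.whl -d py-spy
find py-spy -name py-spy -type f    # locate the extracted binary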

bdraco commented 2 years ago

I have attached the heap profile log, the profiler.start output, and a ~15 min sample of running start_log_objects below. @woopsicle

Looks like the two high memory hits in yours are:

.r: 508 23736464 homeassistant.components.stream.core.Part
.r: 8 3298197 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType

There are a lot of calls in async_upnp_client as well, so it would be good to get debug logs for that too. Edit: It looks like this one could be optimized, so I opened https://github.com/StevenLooman/async_upnp_client/pull/132

Screen Shot 2022-04-09 at 22 58 05
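
To capture those async_upnp_client debug logs, something like this in configuration.yaml should do it (a sketch using the standard logger integration):

# Raise log verbosity for async_upnp_client only.
logger:
  default: warning
  logs:
    async_upnp_client: debug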
malosaa commented 2 years ago

Well, I tested it out. I log almost nothing on my main server, only the things I need, and there is no high CPU or memory issue.

But on my other server it was sometimes really high, so I decided to exclude everything and leave just one sensor recording at first, and my issue magically disappeared. I did delete the hsd file before restarting.

image

Now and then there is a little spike, but that's from the scraping by the multiscrape add-on...

So maybe it's an issue related to Recorder, I don't know...

silviudc commented 2 years ago

Maybe some high memory usage, but I'm not sure from where. Running HA OS on a 2 GB Pi4. image image

In red is when I power cycled the pi4

mib1185 commented 2 years ago

@silviudc please have a look at https://github.com/home-assistant/core/issues/69695#issuecomment-1094186060

silviudc commented 2 years ago

@silviudc please have a look at #69695 (comment)

I installed it now and got the .cprof and callgrind.out files. Which one should I upload, or both?

woopsicle commented 2 years ago

There are a lot of calls in async_upnp_client as well, so it would be good to get debug logs for that too. Edit: It looks like this one could be optimized, so I opened StevenLooman/async_upnp_client#132 Screen Shot 2022-04-09 at 22 58 05

Happy to do that, please let me know how!

woopsicle commented 2 years ago

py-spy output attached: py-spy.svg.zip

McGiverGim commented 2 years ago

I have a very similar memory profile to @silviudc since updating my HA. I'm trying to give you the profile files you need.

I've tried to install py-spy on my HAOS server, but without luck; it fails building dependencies:

      Building wheels for collected packages: maturin
        Building wheel for maturin (pyproject.toml): started
        Building wheel for maturin (pyproject.toml): finished with status 'error'
        error: subprocess-exited-with-error

        × Building wheel for maturin (pyproject.toml) did not run successfully.
        │ exit code: 1
        ╰─> [35 lines of output]
            running bdist_wheel
            running build
            running install
            Traceback (most recent call last):
              File "/tmp/tmpw3pvz9mt_in_process.py", line 363, in <module>
                main()
              File "/tmp/tmpw3pvz9mt_in_process.py", line 345, in main
                json_out['return_val'] = hook(**hook_input['kwargs'])
              File "/tmp/tmpw3pvz9mt_in_process.py", line 261, in build_wheel
                return _build_backend().build_wheel(wheel_directory, config_settings,
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 216, in build_wheel
                return self._build_with_temp_dir(['bdist_wheel'], '.whl',
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 202, in _build_with_temp_dir
                self.run_setup()
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 145, in run_setup
                exec(compile(code, __file__, 'exec'), locals())
              File "setup.py", line 106, in <module>
                setup(
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 153, in setup
                return distutils.core.setup(**attrs)
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 148, in setup
                dist.run_commands()
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
                self.run_command(cmd)
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
                cmd_obj.run()
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/wheel/bdist_wheel.py", line 335, in run
                self.run_command('install')
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
                self.distribution.run_command(command)
              File "/tmp/pip-build-env-em_r1q6q/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
                cmd_obj.run()
              File "setup.py", line 58, in run
                raise RuntimeError(
            RuntimeError: cargo not found in PATH. Please install rust (https://www.rust-lang.org/tools/install) and try again
            [end of output]

        note: This error originates from a subprocess, and is likely not a problem with pip.
        ERROR: Failed building wheel for maturin
      Failed to build maturin
      ERROR: Could not build wheels for maturin, which is required to install pyproject.toml-based projects

So I used the Profiler integration instead. I was not sure which service I had to use. This is a 5-minute profiler.start:

profile.zip

And this is a 5 minutes profiler.memory:

heap_profile.1649594769644313.zip

If you need a different or longer log, please tell me. The leak seems to be a slow one, so I don't know if you can see anything in a 5-minute log.
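
For anyone else trying this, the two service calls used above look roughly like the following (a sketch; run them from Developer Tools → Services, and the output files land in the config directory):

# CPU profile for 5 minutes; writes profile.<timestamp>.cprof
# (plus a callgrind.out file) to the config directory when done.
service: profiler.start
data:
  seconds: 300
---
# Heap profile for 5 minutes; writes heap_profile.<timestamp>.hpy.
service: profiler.memory
data:
  seconds: 300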

silviudc commented 2 years ago

Yeah, it for sure runs out of memory. I looked back 2 days and it flatlines at 1.7 GB, at which point the UI doesn't respond anymore and I can't even SSH into it. image

silviudc commented 2 years ago

Here's a 5-minute profiler.memory service dump: heap_profile.1649597249326648.zip

vondruska commented 2 years ago

RPi4, 2GB using Home Assistant OS with a couple addons.

Upgraded to 2022.4.1 on 2022-04-08. 5 minutes of profiler.start and profiler.memory attached to this comment in a zip file. Happy to gather more stats/data/profiles if needed. homeassistant-profile.zip.

Memory usage over time: Screenshot 2022-04-10 10 13 14 AM

CPU does not appear to be impacted. The spikes coincide with the upgrade to 2022.4.1 and the crash and subsequent restart earlier today: image

kolossboss commented 2 years ago

I have the same issue with Intel NUC hardware (8 GB RAM) running HA OS. I also see memory constantly increasing since 2022.4.x. No crash so far, but a server restart also brings the memory back down. A0F3F4B9-32A5-4917-8C0C-E3505516417A

firstof9 commented 2 years ago

Here's mine, with a 5-minute length: profile.1649605374192318.cprof.zip

bdraco commented 2 years ago

@McGiverGim

Looks like the stream integration on your setup

.r: 126 5698865 homeassistant.components.stream.core.Part
.r: 2 735329 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType

A py-spy is likely needed as well. Please see https://github.com/home-assistant/core/issues/69695#issuecomment-1094222398

bdraco commented 2 years ago

@woopsicle Looks like the stream integration is using a lot of CPU time on yours

bdraco commented 2 years ago

@silviudc

.r: 1004 38213937 homeassistant.components.stream.core.Part
.r: 12 6624880 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType

Looks like stream is the top hitter for you as well

bdraco commented 2 years ago

@firstof9 Can you get a py-spy as well? https://github.com/home-assistant/core/issues/69695#issuecomment-1094222398

firstof9 commented 2 years ago

@bdraco here you go py-spy

bdraco commented 2 years ago

@firstof9 Thanks. Can you upload it as a zip? GitHub converts the SVG and much of the data is lost.

silviudc commented 2 years ago

Was anything changed with camera streams in 2022.4? I haven't made any changes on my end since 2022.3.x

bdraco commented 2 years ago

I enabled preload stream on my system in the hope that it would leak, but nothing so far.

I have UniFi Protect cameras, so if it is the stream integration causing the issue, it may be limited to specific cameras.

Can you post a bit about your camera setup?

bdraco commented 2 years ago

Was anything changed with camera streams in 2022.4? I haven't made any changes on my end since 2022.3.x

There were a number of changes to stream in 2022.4, including an upgrade of the av library, but I don't see anything obvious that would create a memory leak unless it's in the av library itself (it doesn't appear to be, based on the memory profiles, but since I don't have a system that exhibits the behavior it's hard to tell for sure).

silviudc commented 2 years ago

The cameras I use are Annke ones, via RTSP. config.yaml:

- platform: ffmpeg
  name: Front Left
  input: !secret front_camera_left
- platform: ffmpeg
  name: Front Left Low Q
  input: !secret front_camera_left_low_q
- platform: ffmpeg
  name: Front Center
  input: !secret front_camera_center
- platform: ffmpeg
  name: Front Center Low Q
  input: !secret front_camera_center_low_q
- platform: ffmpeg
  name: Front Right
  input: !secret front_camera_right
- platform: ffmpeg
  name: Front Right Low Q
  input: !secret front_camera_right_low_q
- platform: ffmpeg
  name: Kitchen Door
  input: !secret kitchen_door_camera
- platform: ffmpeg
  name: Kitchen Door Low Q
  input: !secret kitchen_door_camera_low_q
- platform: ffmpeg
  name: Kitchen Window
  input: !secret kitchen_window_camera
- platform: ffmpeg
  name: Kitchen Window Low Q
  input: !secret kitchen_window_camera_low_q

And then they get set up with:

front_camera_right: -rtsp_transport tcp -i rtsp://USER:PW@192.168.0.xx:554/Streaming/Channels/101
front_camera_right_low_q: -rtsp_transport tcp -i rtsp://USER:PW@192.168.0.xx:554/Streaming/Channels/102

And it seems I do see errors about them in the main .log file:

2022-04-10 14:40:09 ERROR (stream_worker) [root] Uncaught thread exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/src/homeassistant/homeassistant/components/stream/__init__.py", line 338, in _run_worker
    stream_worker(
  File "/usr/src/homeassistant/homeassistant/components/stream/worker.py", line 538, in stream_worker
    muxer.mux_packet(first_keyframe)
  File "/usr/src/homeassistant/homeassistant/components/stream/worker.py", line 233, in mux_packet
    self._av_output.mux(packet)
  File "av/container/output.pyx", line 204, in av.container.output.OutputContainer.mux
  File "av/container/output.pyx", line 210, in av.container.output.OutputContainer.mux_one
  File "av/container/output.pyx", line 180, in av.container.output.OutputContainer.start_encoding
  File "av/container/core.pyx", line 257, in av.container.core.Container.err_check
  File "av/error.pyx", line 336, in av.error.err_check
av.error.ValueError: [Errno 22] Invalid argument: '<none>'; last error log: [mp4] dimensions not set
2022-04-10 14:40:16 WARNING (MainThread) [custom_components.meross_lan] MerossDevice(19010849682057251a1334298f1469fc) has incorrect timestamp: 6 seconds behind HA
2022-04-10 14:40:26 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/config/custom_components/eufy_vacuum/tuya.py", line 551, in _async_handle_message
    response_data = await self.reader.readuntil(MAGIC_SUFFIX_BYTES)
  File "/usr/local/lib/python3.9/asyncio/streams.py", line 629, in readuntil
    raise exceptions.IncompleteReadError(chunk, None)
asyncio.exceptions.IncompleteReadError: 0 bytes read on a total of undefined expected bytes
bdraco commented 2 years ago

Which model Annke cameras are you using? I'll try to get ahold of the specific model for testing

silviudc commented 2 years ago

C500 model, 8 of them. https://www.aliexpress.com/item/1005002597313752.html?spm=a2g0s.12269583.0.0.61af3986lL0oak

mhageraats commented 2 years ago

Raspberry Pi 3 Model B 1GB: hangs/reboots after the 2022.4 update

image profiler.zip

bdraco commented 2 years ago

@mhageraats

.r: 194 30156521 homeassistant.components.stream.core.Part
.r: 3 3607932 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType
.r: 204 30564395 homeassistant.components.stream.core.Part
.r: 3 3494543 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType

Looks like stream as well

mhageraats commented 2 years ago

@mhageraats

.r: 194 30156521 homeassistant.components.stream.core.Part
.r: 3 3607932 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType
.r: 204 30564395 homeassistant.components.stream.core.Part
.r: 3 3494543 av.container.output.OutputContainer, dict of homeassistant.components.stream.worker.StreamMuxer, types.BuiltinMethodType

Looks like stream as well

I'm using 2 Foscams via the Foscam integration. And one via ffmpeg.

drthanwho commented 2 years ago

Here's mine as well. From the looks of it, I'm guessing it's probably stream on mine as well: heap_profile.1649597557635687.hpy.zip

Camera-wise, apart from the MJPEG and Generic platforms, all other cameras come in from the Frigate integration.

DOWIT-JoelFrojmowicz commented 2 years ago

I think the memory leak problem in 2022.4 is not related to any add-on, but to HA itself.

I was running 2022.3.8 without any problems. As soon as I upgraded to 2022.4.0, I noticed the memory leak. Then I restored a backup with version 2022.3.8 and upgraded all my add-ons. The memory leak disappeared.

As soon as I upgrade again to 2022.4.1, the memory leak starts again.

image

bdraco commented 2 years ago

Can anyone confirm disabling the cameras solves the leak?

mhageraats commented 2 years ago

Will give it a try

firstof9 commented 2 years ago

Can you upload it as a zip?

Sure thing man, here you go.

py-spy.svg.zip

DOWIT-JoelFrojmowicz commented 2 years ago

Can anyone confirm disabling the cameras solves the leak?

Sure. All cameras disabled, HA restarted. I'll post again tomorrow with my findings.

dandomin commented 2 years ago

I disabled Stream a few hours ago from my config and restarted. Memory is continuing to grow.

Next, I'll try disabling the cameras as a test.

bdraco commented 2 years ago

@firstof9 That one points to stream as well

Also, it looks like you are using https://github.com/ualex73/monitor_docker, which has an interesting design choice of starting multiple asyncio event loops in a new thread for every instance it's monitoring instead of using the existing main loop. I don't think it's the cause of the leak, though.

firstof9 commented 2 years ago

Thanks for the tip @bdraco, removing stream: from my configuration.yaml should potentially stop the leak, right?

bdraco commented 2 years ago

Thanks for the tip @bdraco, removing stream: from my configuration.yaml should potentially stop the leak, right?

stream is brought in by default_config:, so you'd have to disable that and manually add each of its components instead.
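
A partial sketch of what that looks like; the component list below is illustrative, so check the default_config documentation for the complete set in your version:

# Replace default_config with its individual components, leaving stream out.
# default_config:
automation:
config:
frontend:
history:
logbook:
mobile_app:
person:
scene:
script:
sun:
system_health:
zeroconf:
# stream:    # intentionally omitted to disable streaming
# ...plus the remaining default_config components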