Closed ivanovd closed 3 years ago
I recommend trying without zigbee2mqttassistant and using the built-in frontend. You could also try https://github.com/zigbee2mqtt/hassio-zigbee2mqtt which uses automatic MQTT discovery, and see whether that works across restarts.
I uninstalled zigbee2mqttassistant and that didn't make a difference. It still crashes with OOM at least once every 24 hours:
[58868.246215] [ 8830] 0 8830 113 57 8192 0 0 bc
[58868.246221] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=python3,pid=5770,uid=0
[58868.246412] Out of memory: Killed process 5770 (python3) total-vm:587700kB, anon-rss:396436kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:676kB oom_score_adj:0
[58868.410100] oom_reaper: reaped process 5770 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
I am wondering what could be causing this. MQTT discovery is just a workaround. I'd like to understand why a lot of ZigBee traffic crashes the Home Assistant observer.
Any ideas on where I should start troubleshooting?
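One possible starting point is to pull the killed process and its resident memory out of kernel log lines like the ones quoted above. The following is a minimal sketch, not an official tool: the function name `parse_oom_kills` is hypothetical, and the regular expression only targets the "Out of memory: Killed process ..." format shown in this log (other kernel versions may word the line differently).

```python
import re

# Matches the kernel line format seen in the log above, e.g.
# "Out of memory: Killed process 5770 (python3) total-vm:..., anon-rss:396436kB, ..."
# Assumption: other kernel versions may phrase this line differently.
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def parse_oom_kills(log_text):
    """Return one dict per OOM kill found in dmesg-style text."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_RE.search(line)
        if match:
            kills.append(
                {
                    "pid": int(match["pid"]),
                    "name": match["name"],
                    "anon_rss_kb": int(match["anon_rss_kb"]),
                }
            )
    return kills
```

Fed the log above, this would report that `python3` (pid 5770) held about 387 MiB of anonymous memory when it was killed, which is a large share of a 1 GB board.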
You would probably get better support if you posted at https://github.com/koenkk/zigbee2mqtt instead. You could also set up monitoring for your processes and store the metrics in InfluxDB or similar, to see what the graphs look like while memory is being consumed.
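As a rough illustration of the monitoring idea, here is a small sketch that computes host RAM and swap usage from `/proc/meminfo` on Linux. The function names are hypothetical, and shipping the numbers to InfluxDB (or any other store) is deliberately left out; the sketch only shows how the percentages could be derived from the `MemTotal`, `MemAvailable`, `SwapTotal` and `SwapFree` fields.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def usage_percent(info):
    """Return (ram_used_pct, swap_used_pct) from parsed meminfo values."""
    ram_used = info["MemTotal"] - info["MemAvailable"]
    ram_pct = 100.0 * ram_used / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    swap_used = swap_total - info.get("SwapFree", 0)
    # Avoid dividing by zero on hosts with no swap configured.
    swap_pct = 100.0 * swap_used / swap_total if swap_total else 0.0
    return ram_pct, swap_pct


def read_host_usage(path="/proc/meminfo"):
    """Read the host's current usage; run this on the host, not in a container."""
    with open(path) as f:
        return usage_percent(parse_meminfo(f.read()))
```

Calling `read_host_usage()` periodically (for example once a minute from cron) and logging the result would give a time series to compare against the OOM timestamps.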
I have no such problems using z2m with more than 20 devices but I'm using HassOS (the beta on RPI4 with USB boot, from SSD).
Yeah, the issue is with the RPi3b+, which has only 1 GB of RAM. However, RAM shouldn't be an issue, since most of the time the RAM is 99% free and the CPU usage is at 2-3%.
Interesting. I've been running my setup on a RPi3b with the same number of devices as on the RPI4 with no problems, but I might have switched before the frontend was added to z2m. Could you try with the frontend disabled and see how that behaves?
Okay, I did some troubleshooting and noticed that the HA frontend reports the RAM usage of the container, not the host. The RAM usage of the host (Raspbian) is about 80-85% and the SWAP was at 100%. I am now experimenting with the size of the swap (I'm on an SSD) and will see what the correlation is between the crashes of the observer and the RAM/swap usage.
I will keep you updated. Thanks for the ideas and the input.
I have some progress to report. It's been running for more than 20 hours straight with no OOM errors and no hangs. RAM usage is between 65% and 80% and the SWAP is almost always full with some 5% drops from time to time.
Here is what I did:
In the future, if I see spikes in RAM usage, I will increase the SWAP size to 1 GB instead of the default 100 MB. In my case that is safe since I use an SSD, but I do not recommend that for SD card users.
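The post does not say how the swap size would be changed. On stock Raspbian, swap is normally managed by `dphys-swapfile`, whose size is set through `CONF_SWAPSIZE` (in MB) in `/etc/dphys-swapfile`. Assuming that setup, a minimal sketch of the edit could look like the following; the helper name `set_swap_size` is hypothetical and it only transforms the config text, it does not touch the system.

```python
import re


def set_swap_size(config_text, size_mb):
    """Return config text with CONF_SWAPSIZE set to size_mb (in MB).

    Assumption: the text is the content of /etc/dphys-swapfile on Raspbian.
    An existing uncommented CONF_SWAPSIZE line is replaced; otherwise one
    is appended at the end.
    """
    new_line = f"CONF_SWAPSIZE={size_mb}"
    pattern = re.compile(r"^CONF_SWAPSIZE=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text, count=1)
    # No existing setting: append, making sure it starts on its own line.
    separator = "" if not config_text or config_text.endswith("\n") else "\n"
    return config_text + separator + new_line + "\n"
```

For 1 GB the value would be `set_swap_size(text, 1024)`. The new size typically takes effect only after the swap file is regenerated, for example by restarting the `dphys-swapfile` service.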
I think that from 0.106 onwards HA became more memory hungry, hence the OOM errors, since the default SWAP on Raspbian is only 100 MB. I have found similar reports on other forums, with a lot of other users experiencing the same scenario after ver. 0.116.
zigbee2mqtt add-on version (if edge, please report commit hash): 1.16.1
Operating environment (HassOS, Virtual Machine, Device/platform running Home Assistant):
Description of problem:
I have noticed that when I connect 20+ devices to my ZigBee network, the observer crashes. The watchdog restarts the observer; however, after that ZigBee2MQTT cannot connect to MQTT, even though MQTT is running. You have to restart the ZigBee2MQTT add-on before it starts communicating with the MQTT broker again. It started happening recently, after upgrading to HA 0.116, I believe. As soon as I reduce the number of devices in the ZigBee network, the system is rock solid, with no more OOM errors or crashes.
Your entire configuration from the frontend (with sensitive fields redacted):
Your logs from Home Assistant