Closed. Jacuo closed this 1 year ago.
I seem to be having the same problem on Home Assistant 2022.12 and RaspberryMatic add-in 3.65.11.20221005. It completely locks up the HA server, and I can only recover by powering it off and on again. The problem recurs after an hour or two.
Sorry, but the rationale regarding these "Not enough space in buffers" kernel messages is still the same: these are just the symptoms of overwhelmingly busy message communication between the Homematic RF module in use (e.g. HmIP-RFUSB) and the corresponding processes consuming these messages (e.g. HMIPServer or rfd).
That means that, for some reason, the RF module is not able to forward all incoming messages to HMIPServer or rfd, or vice versa. Something like this could be related to a carrier sense level that is too high, or to communication with the RF module being repeated too often (e.g. because some logic is trying to send out too many messages within a certain time interval). Ergo: your environment seems to be too busy dealing with Homematic communication...
Thank you. Please bear with me a little while I try to understand this problem, as our heating and all automations for lights, etc keep stopping and my family is not happy! I am using a HB-RF-ETH with firmware 1.3.0 and this problem seemed to start after I upgraded HA to 2022.12.0 this morning. Nothing has changed in my setup apart from this. I did notice that the CS was climbing to 10% at times, but I have seen that before without any problems. Where do you suggest I start? I have attached log files in case they are of any use in diagnosing this.
Thanks.
Attached: messages.txt, hmserver.log, boot.log
That means that, for some reason, the RF module is not able to forward all incoming messages to HMIPServer or rfd, or vice versa. Something like this could be related to a carrier sense level that is too high, or to communication with the RF module being repeated too often (e.g. because some logic is trying to send out too many messages within a certain time interval). Ergo: your environment seems to be too busy dealing with Homematic communication...
But, at least in my case, only RaspberryMatic is producing messages (it is a Charly device), so it looks like this problem is caused by RaspberryMatic?
No, the same would happen if you used the original CCU3 software on it. This issue ("Not enough space in buffers") is, as I said, only showing the symptoms of either an RF module that is too busy because it cannot transmit fast enough to its target RF devices, or interface processes (HMIPServer, rfd, etc.) that are not dealing with the load fast enough. Thus, reduce your automations and make sure you reduce the load on the RF interface.
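As a rough illustration of what "reducing the load" can mean on the automation side, here is a minimal sketch of a sliding-window rate limiter that refuses to send more than a fixed number of commands per time window. Everything in it (the `RateLimiter` class, the limit of 2 commands per 10 seconds) is a hypothetical example and not part of RaspberryMatic or Home Assistant; suitable limits depend on your own setup.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls in any sliding window of window_s seconds.

    Hypothetical helper for illustration only; not a RaspberryMatic or
    Home Assistant API.
    """

    def __init__(self, max_calls, window_s, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock          # injectable so the logic can be tested
        self._sent = deque()        # timestamps of recently allowed calls

    def allow(self):
        """Return True if a command may be sent now, False if it should wait."""
        now = self.clock()
        # Forget timestamps that have left the sliding window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            self._sent.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = RateLimiter(max_calls=2, window_s=10.0)
    for i in range(4):
        # The first two commands pass; the rest are held back until the
        # window has moved on.
        print(i, "send" if limiter.allow() else "skip")
```

An automation that calls `allow()` before each command and defers the command when it returns False spreads bursts of messages over time instead of sending them all at once.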
I went around and removed the batteries from devices until I saw the Carrier Sense drop, and it seems to have improved. It's been up for a couple of hours now. The CS values shown in HA are a lot different than what is displayed on RaspberryMatic. I will experiment more tomorrow. However, two issues remain:
I have 0 in RaspberryMatic
Just to update this, I decided to move my RaspberryMatic onto a spare TinkerBoard S to keep Home Assistant running. However, Home Assistant continued to run out of memory and crash intermittently. It was caused by a memory leak in a HA module and is fixed in 2022.12.3 this morning. So the memory utilisation errors that the RM add-in was showing were caused by HA, not by Carrier Sense errors. However I still see intermittent CS peaks. I found that the CS indication on RM is not very accurate and using the history in HA gives a better indication. I suspect that I have always had this problem, but was not aware of it. The peaks in the graph below are 3 to 4 minutes apart, but often I do not see them on the RaspberryMatic home page. Because they are so intermittent, they are going to be hard to track down.
However, Home Assistant continued to run out of memory and crash intermittently. It was caused by a memory leak in a HA module and is fixed in 2022.12.3 this morning. So the memory utilisation errors that the RM add-in was showing were caused by HA, not by Carrier Sense errors.
Thanks for coming back here and reporting this. This is actually good news, since I really wondered how the RaspberryMatic HA Add-on could bring down the HA host system, given that it is a sealed Docker container.
However I still see intermittent CS peaks. I found that the CS indication on RM is not very accurate and using the history in HA gives a better indication.
The CarrierSense display on the main WebUI page is only updated once per minute for performance reasons. Thus, if you need a detailed CS history, you will of course have to use external methods to monitor the change in CarrierSense, or look into the log files to be notified when the CS value gets too high at some point. Nevertheless, if your CS values regularly rise above 10%, this is a sign of RF interference in the near vicinity of the antenna of your Homematic RF module. Thus, move it farther away from any possible sources of RF interference and also shield it, e.g. by using a proper high-quality USB cable (if you are using a HmIP-RFUSB) and potentially also a USB RF filter like https://de.elv.com/elv-usb-entstoerfilter-usb-ef1-komplettbausatz-152745
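To illustrate why a once-per-minute display can miss short CS peaks, here is a small simulation. The signal is made up: a 2% baseline with a 15% burst lasting 10 seconds every 210 seconds (3.5 minutes). These numbers are assumptions chosen only to resemble the pattern described in this thread, not measurements.

```python
def carrier_sense(t):
    """Hypothetical CS level in percent at second t.

    Assumed shape: 2% baseline, 15% burst for 10 s every 210 s.
    """
    return 15.0 if t % 210 < 10 else 2.0


def sample(period_s, duration_s=3600):
    """Read the simulated CS level every period_s seconds for duration_s."""
    return [carrier_sense(t) for t in range(0, duration_s, period_s)]


def count_above(samples, threshold=10.0):
    """Count how many samples exceed the threshold (in percent)."""
    return sum(1 for value in samples if value > threshold)


if __name__ == "__main__":
    fast = sample(5)    # e.g. an external monitor polling every 5 s
    slow = sample(60)   # e.g. a display refreshed once per minute
    print("5 s sampling: ", count_above(fast), "of", len(fast), "samples above 10%")
    print("60 s sampling:", count_above(slow), "of", len(slow), "samples above 10%")
```

With these assumed numbers, the one-minute sampling only lands inside a burst when the two periods happen to line up, so many bursts never show up at all, while the faster sampling catches every one.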
Yes, having an add-in bring down a HAOS server would not be a good feature, which was why I moved it to the TinkerBoard to exclude it. So definitely good news there. Looking at the CS graph again, I'm beginning to suspect something like the refrigerator. The gap between cycles is consistently 3 to 4 minutes so that might be a clue. I'll experiment over the weekend when I have time and report back. Thanks for your help.
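One way to check whether the peaks really follow a regular cycle is to extract the start time of each peak from an exported CS history and look at the gaps between them. The sketch below assumes the history is available as a list of `(seconds, percent)` pairs; the sample data and the 10% threshold are illustrative assumptions, and `find_peaks` and `peak_intervals` are hypothetical helper names.

```python
from statistics import median


def find_peaks(samples, threshold=10.0):
    """Return the start time of each run of consecutive samples above threshold."""
    starts = []
    above = False
    for t, value in samples:
        if value > threshold and not above:
            starts.append(t)        # rising edge: a new peak begins here
        above = value > threshold
    return starts


def peak_intervals(starts):
    """Return the gaps in seconds between consecutive peak start times."""
    return [b - a for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    # Made-up history: (seconds since start, CS level in percent).
    history = [
        (0, 2.0), (30, 12.0), (60, 14.0), (90, 3.0),
        (240, 11.0), (270, 2.0), (450, 13.0), (480, 2.5),
    ]
    starts = find_peaks(history)
    gaps = peak_intervals(starts)
    print("peak starts:", starts)
    print("gaps (s):", gaps)
    if gaps:
        print("median gap (s):", median(gaps))
```

A tight spread of gaps around one value would support the idea of a cycling appliance; widely scattered gaps would point to a less regular source.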
monitor the change in CarrierSense
What tools are good for doing this?
I haven't tried this, but it might be worth investigating.
Thank you, but I was rather thinking about software to install on the Charly to get this info from the RPI-RF-MOD and send it somewhere.
Describe the issue you are experiencing
HMIPServer can't be restarting.
The UI is not responding. Restarting the device (Charly) helps for some time.
Describe the behavior you expected
No errors
Steps to reproduce the issue
...
What is the version this bug report is based on?
3.65.8.20220831
Which base platform are you running?
rpi3 (RaspberryPi3)
Which HomeMatic/homematicIP radio module are you using?
RPI-RF-MOD
Anything in the logs that might be useful for us?
Additional information
No response