HestiaPi / hestia-touch-openhab

OpenHAB2 files for HestiaPi Touch model
GNU General Public License v3.0
Setpoint bouncing #40

Closed gulliverrr closed 4 years ago

gulliverrr commented 4 years ago

Setting the Setpoint through the LCD (at a normal speed) triggers bouncing between the old and new Setpoint, with CPU spiking (~4-5). The MQTT topic gets published for every bounce. Stopping openhab and mosquitto stops the effect, but restarting them starts it again. Only rebooting the Pi resets everything back to normal. With Basic UI or the phone app, the problem is not present. This started after #38 @rkoshak

rkoshak commented 4 years ago

Hmmmm, I've not seen that but I rarely use the LCD touch screen. I'll see if I can reproduce it. From the description it appears there might be an MQTT loop. I managed to break the touch screen on the spare unit you all sent me by not getting the pins lined up when I first turned it on, but I can test on my production one.

What would you define as "normal speed"?

gulliverrr commented 4 years ago

> What would you define as "normal speed"?

Maybe 0.5 sec? I was not even trying to be quick :)

rkoshak commented 4 years ago

OK, I just wanted to get a reference point. Was that on an RPi 3? I'm not sure it's possible to interact with the LCD that quickly on my RPi 0. I do notice a very significant lag from when I press to when the change registers. This has always been the case for me but maybe something else is wrong with my LCD which is perhaps masking the problem you are seeing...

gulliverrr commented 4 years ago

I actually managed to trigger the bounce from BasicUI too. The LCD was on the Fan screen (not showing the setpoint), although this should not really affect anything. I clicked the down button for Temperature Setpoint 3 times, with less than a second in between, and then the bouncing started. I'm on a Pi Zero W, as a Pi 3 would not be good for testing these things.

rkoshak commented 4 years ago

I've managed to cause this to happen. I think the problem is in the "Synchronize Temp Proxies" Rule. I don't know exactly why yet, but I can see that Rule triggering over and over rapidly once we get into the loop, and when I disable the Rule the flapping stops. My theory is that the loop happens when events come in faster than the Rule can process them.

The standard approach would be to change the triggers from "changed" to "received command" and then when updating an Item use postUpdate which will not retrigger the Rule. But I need to verify that all the Items that currently trigger that Rule receive commands.
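As a rough illustration of the command/update distinction described above, here is a toy Python model. It is not the openHAB API and not the actual "Synchronize Temp Proxies" Rule: the `Item` class, the item names `SetpointC` and `SetpointF`, and the assumption that the proxies mirror Celsius and Fahrenheit values are all hypothetical. The point it sketches is that a rule triggered only by commands, which writes to the other Item with an update, runs once per user action and cannot retrigger itself.

```python
class Item:
    """Toy stand-in for an openHAB Item (hypothetical, not the real API)."""

    def __init__(self, name, state=None):
        self.name = name
        self.state = state
        self.command_handlers = []  # rules triggered by "received command"

    def send_command(self, value):
        # A command sets the state and triggers command-based rules.
        self.state = value
        for handler in list(self.command_handlers):
            handler(self, value)

    def post_update(self, value):
        # An update only changes the state; command-triggered rules do not run.
        self.state = value


def c_to_f(c):
    return c * 9 / 5 + 32


def f_to_c(f):
    return (f - 32) * 5 / 9


# Hypothetical proxy pair; the names are assumptions for this sketch.
setpoint_c = Item("SetpointC", 20.0)
setpoint_f = Item("SetpointF", 68.0)
rule_runs = []  # records which Item triggered the sync rule each time


def sync_from_c(item, value):
    rule_runs.append(item.name)
    setpoint_f.post_update(c_to_f(value))  # update, so no retrigger


def sync_from_f(item, value):
    rule_runs.append(item.name)
    setpoint_c.post_update(f_to_c(value))  # update, so no retrigger


setpoint_c.command_handlers.append(sync_from_c)
setpoint_f.command_handlers.append(sync_from_f)

# Three quick presses of the "down" button, each arriving as a command.
for target in (19.5, 19.0, 18.5):
    setpoint_c.send_command(target)

print(setpoint_c.state, setpoint_f.state, len(rule_runs))
```

In this model the sync rule runs exactly three times for three commands. If the rule were triggered by state changes instead, each write to the other Item would be a candidate for triggering the rule again, which is the kind of feedback path the suggested change is meant to remove.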

I've thought about separating the Rule but that won't help. It will just mean that the looping will be caused by more than one Rule instead of just the one Rule.

tl;dr - I think I've pinpointed the location but not the cause of the problem and am still looking into it.

gulliverrr commented 4 years ago

I'm making a new image for development purposes only (named 1.2.0.M1) for anyone else interested to jump in and experience this 10-minute boot time :)

rkoshak commented 4 years ago

OK, I've figured out a workaround. I'm not super happy about it, but it does stop the flapping and it will do until I can convert over to using Units of Measurement, which will let us get rid of a bunch of the proxy Items and this Rule. With UoM we can standardize on just one unit (C) for the Python script and in the Rules. Then all we need is to deal with differences on the sitemap.

The solution is to add a Timer to the "Synchronize Temp Proxies" rule to wait one second before actually synchronizing the proxy Items and then only update the proxy Item if it's different from the current state of the Item that changed. That appears to effectively break the infinite loop.
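The workaround described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the actual Rule: the `DebouncedSync` class and its `source`/`proxy` state names are hypothetical, `threading.Timer` stands in for an openHAB Timer, and rescheduling the timer on every new event is an assumption of this sketch (the comment above only says to wait one second before synchronizing). The delay is shortened in the usage lines so the example finishes quickly.

```python
import threading


class DebouncedSync:
    """Hypothetical sketch: delay the sync, then write only if values differ."""

    def __init__(self, delay_s=1.0):
        self.delay_s = delay_s
        self.states = {"source": 20.0, "proxy": 20.0}
        self.proxy_updates = 0  # counts actual writes to the proxy
        self._timer = None
        self._lock = threading.Lock()

    def on_source_changed(self, new_value):
        with self._lock:
            self.states["source"] = new_value
            # Assumption: a newer event replaces any pending sync.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay_s, self._sync)
            self._timer.start()

    def _sync(self):
        with self._lock:
            # Only touch the proxy when it differs from the changed Item,
            # so an unchanged value produces no further event.
            if self.states["proxy"] != self.states["source"]:
                self.states["proxy"] = self.states["source"]
                self.proxy_updates += 1

    def wait(self):
        # Block until the most recently scheduled timer has finished.
        with self._lock:
            timer = self._timer
        if timer is not None:
            timer.join()


sync = DebouncedSync(delay_s=0.2)  # 1.0 in the described workaround
for value in (19.5, 19.0, 18.5):   # three quick button presses
    sync.on_source_changed(value)
sync.wait()
updates_after_burst = sync.proxy_updates

sync.on_source_changed(18.5)       # same value again: nothing to write
sync.wait()
print(sync.states["proxy"], updates_after_burst, sync.proxy_updates)
```

In this sketch a burst of three changes results in a single write to the proxy, and a repeated identical value results in none, which is the property that keeps the two sides from chasing each other.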