Polling issue and instabilities

Tazintosh commented 1 year ago

Hi folks,

I'm using v0.5.12 to handle two Xiaomi devices: a "Mi Air Purifier 3H" and a "Mi Air Purifier Pro". In Node-RED, I'm also using NRCHKB to create virtual HomeKit Air Purifiers as well as a Thermostat.

We'll focus on the thermostat here. Here's how I'm using it:
• The Mi Air Purifier Pro acts as a temperature sensor in the living room, and node-red-contrib-miio-localdevices should poll its status so that I get the current temperature.
• The temperature is passed to the thermostat, and I'm applying some functions to turn a switch (Shelly Plus 1), physically connected to my gas heater, on or off when the virtual HomeKit Thermostat reaches the given temperature.
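The on/off decision described above could be sketched as a small function with a dead band around the setpoint, so the relay does not toggle on every small temperature change. This is only an illustration: the function name, the 0.5 °C hysteresis and the sample readings are assumptions, not taken from the actual flow.

```javascript
// Hypothetical helper: decide whether the heater relay should be on.
// `hysteresis` creates a dead band around the setpoint to avoid rapid toggling.
function heaterShouldRun(currentlyOn, temperature, setpoint, hysteresis = 0.5) {
    if (typeof temperature !== "number" || Number.isNaN(temperature)) {
        return currentlyOn; // no usable reading: keep the current state
    }
    if (temperature <= setpoint - hysteresis) return true;  // too cold: heat
    if (temperature >= setpoint + hysteresis) return false; // warm enough: stop
    return currentlyOn; // inside the dead band: no change
}

// Example with a 19 °C setpoint and made-up readings.
let heaterOn = false;
for (const reading of [16.5, 18.7, 19.2, 19.6, 19.0]) {
    heaterOn = heaterShouldRun(heaterOn, reading, 19);
    console.log(`${reading} °C -> heater ${heaterOn ? "on" : "off"}`);
}
```

Note that a function like this only acts when a reading arrives, so it keeps the heater in its last state for as long as no new temperature is reported.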

Sadly, the polling does not work, or when it does, it's not reliable and eventually fails. I suspect the device itself hangs when it receives too many messages, but I took care of this by using RBE in all the "Air Purifier In/Out" subflows. I'm pretty sure that's not the whole story, though. Those subflows are simple functions that switch outputs. For instance, unplugging an air purifier for more than a minute should be more than enough to clear its memory, but it doesn't poll after being plugged back in either. I also need to restart Node-RED, but then only a single poll comes through (I can see it in the debug message window). And most of the time, no poll comes through at all, which obviously leaves my heater, well, not heating :)

Any idea? Thanks in advance!

Screenshot 2022-11-26 at 19 24 16

EDIT: before you ask, I fixed my mistake: on the far right of the screenshot, I'm now using two "link out" nodes (one for each) rather than one. However, the issues remain. I restarted both purifiers and Node-RED, and got just one poll from each. BTW, my auto-polling setting is 120s.

Tazintosh commented 1 year ago

Wait, I got a second poll from the same device, but exactly 6 minutes later (still only one from the other)! Why does the 120s polling have no effect?

Screenshot 2022-11-26 at 20 55 27

Please note that I'm getting the debug message right after the "Get Data" node, so there is no RBE that could filter out some messages.

stason325 commented 1 year ago

Hi there. Concerning this: as written in the README, starting from 0.3.0, if auto-polling is turned on, the GET node sends JSON with the current characteristics only if those characteristics have changed.

I think your device first reported a change in some property only 6 minutes after the previous message.
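The report-on-change behaviour described here can be illustrated with a small sketch. It is not the package's actual code: the helper name and the property names (`temperature`, `power`) are made up, and comparing snapshots through `JSON.stringify` is just one simple way to detect a change (it is sensitive to key order).

```javascript
// Returns a function that passes a snapshot through only when it differs
// from the last snapshot it passed through (compared via JSON serialization).
function createChangeFilter() {
    let lastSerialized = null;
    return function reportIfChanged(snapshot) {
        const serialized = JSON.stringify(snapshot);
        if (serialized === lastSerialized) return null; // nothing changed: stay silent
        lastSerialized = serialized;
        return snapshot;
    };
}

// Four polls at a fixed interval; only the first and the last produce a message.
const changeFilter = createChangeFilter();
const polls = [
    { temperature: 16.5, power: true },
    { temperature: 16.5, power: true },
    { temperature: 16.5, power: true },
    { temperature: 16.6, power: true },
];
polls.forEach((snapshot, i) => {
    const out = changeFilter(snapshot);
    console.log(`poll ${i + 1}: ${out ? "message sent" : "no message"}`);
});
```

With a 120s interval, three silent polls followed by one changed value would look like a single message after several minutes, which matches what was observed.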

Tazintosh commented 1 year ago

OK, let's consider this :) Thank you for the reminder!

Tazintosh commented 1 year ago

Hi @stason325 Same thing this morning… the house is only at 16°C, no polling for the whole night. Restarted and unplugged everything, got one poll at 9:13 with 16.5°C. It's 11:51 here, no new poll, even though the house did warm up. So by now my trigger should have gone off, but the heater is still running, because the last temperature it knows is 16.5°C :(

BTW, I'm running:
Node-RED version: v3.0.0-beta.2
Node.js version: v18.11.0
Darwin 20.6.0 arm64 LE (Mac mini M1)

stason325 commented 1 year ago

Hi, I suspect the problem is that your device hangs for some reason and stops responding at some point. To resolve that, I think you need to modify your flow somehow. First of all, you can get rid of these two link-ins (https://github.com/stason325/node-red-contrib-miio-localdevices#get-node - please see item 4). Then, show what is inside the Air Purifier In subflow node (blue circle).

(image attachment)

Regards

Tazintosh commented 1 year ago

Hi @stason325

If I remove the two "link in" nodes, then I also remove the two "link out" nodes, no? I did this because such a feedback was provided in your example: So it's not needed anymore?

Anyway, here's the content of "Air Purifier In": Screenshot 2022-11-27 at 14 51 17

stason325 commented 1 year ago

Yes, I uploaded this example a long time ago and did not update it. As for the link-out, it depends on whether you use it somewhere else.

stason325 commented 1 year ago

Anyway, here's the content of "Air Purifier In":

Nothing strange here... the rbe nodes should do their job.

Could you show the Air Purifier Out subflow the same way?

Tazintosh commented 1 year ago

Here it is: Screenshot 2022-11-27 at 16 08 50

Tazintosh commented 1 year ago

Recently (since the changes I made to the nodes), I'm getting a payload {undefined: 0} when I restart Node-RED. Here's the new flow: Screenshot 2022-11-27 at 20 38 28

Oh BTW, since our discussion, I was not using the Air Purifier with HomeKit (although it is set up in my flow). I just tried it out and got various errors:
"Mihome Exception. IP: 10.80.10.65 -> send EHOSTDOWN 10.80.10.65:54321"
or
"Mihome Exception. IP: 10.80.10.65 -> device.setFavSpeed is not a function"
or
"Mihome Exception. IP: 10.80.10.65 -> auto is not defined"
or
"Mihome Exception. IP: 10.80.10.65 -> favorite is not defined"

stason325 commented 1 year ago

All of that looks very strange. All those errors indicate that there is no connection to the device, and as a result those properties are not defined. Some of them (EHOSTDOWN, for example) I am seeing for the first time.

I have nothing to advise at the moment other than to reconfigure your device and your flow from the very beginning.

Tazintosh commented 1 year ago

From what I understand, since v0.3.0, if I inject false (purifier running), then at least one characteristic should have changed, so it should poll. Same if I then click true? Well, I'm getting nothing. Screenshot 2022-11-27 at 22 44 12

stason325 commented 1 year ago

Yes, since 0.3.0, right after each command is sent, the GET node gives you immediate feedback: a new JSON with the latest properties.

Tazintosh commented 1 year ago

My mistake, I did get a poll. I simply always forget to wait for the auto-poll delay, and I can assure you it's not "immediate feedback". Boy, being used to MQTT "instantness", it's really hard to go back to something that is absolutely not live.

I let the night pass and polling stopped at 3:01am. I had to unplug and replug the purifier at 9am because the temperature was down to 17°.

stason325 commented 1 year ago

Hello there. Any progress on that issue? Have you overcome your purifiers' freezes?

Tazintosh commented 1 year ago

Looks like the polling is working now. But the nodes on my flow to control the purifier are actually not in use. So for now it's fine, but I'm afraid the device would freeze the day I'll start sending more commands. I'm in fact just reading the temperature for now.

stason325 commented 1 year ago

Got it. And what nodes do you use to control it instead of mine?

Tazintosh commented 1 year ago

None, I'm not controlling it at the moment. My flow will most probably stay as in my previous screenshots: ready to work with your nodes.

stason325 commented 1 year ago

OK, but I don't fully understand whether I need to make some changes to my nodes...

Tazintosh commented 1 year ago

Well, me neither ;D I guess the fact that you told me to remove the loopback might have helped prevent the purifier from freezing. About the polling, the "issue" might in fact just be fixed with a semantic/UI update. We are used to polling every n seconds actually providing a result no matter what. But that's not what's happening: the node only provides a result if a value has changed during this interval, right? That makes it less obvious than what we are used to.

stason325 commented 1 year ago

Yes, that was the idea: to get the list of props only if something has changed, instead of receiving a queue of identical messages ☺️

Tazintosh commented 1 year ago

I fully understand it, but testing a given piece of code in a real situation (I mean, not by using inject etc.) is kind of a nightmare: take my temperature example, if the temperature is stable, I could wait hours to receive a trigger. Not to mention that this also prevents knowing whether the device is actually frozen or not.
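The second point can be made concrete with a staleness check. The sketch below is hypothetical (the names and the "three missed 120s polls" threshold are assumptions), and it only tells a frozen device apart from a stable one if a message is expected at every polling interval; with change-only reporting, silence is ambiguous.

```javascript
// Hypothetical watchdog: remembers when the last message arrived and reports
// the source as stale once more than `maxSilenceMs` has passed without one.
// `now` is injectable so the logic can be tried with a fake clock.
function createWatchdog(maxSilenceMs, now = Date.now) {
    let lastSeen = null;
    return {
        touch() { lastSeen = now(); }, // call this for every incoming message
        isStale() {
            return lastSeen === null || now() - lastSeen > maxSilenceMs;
        },
    };
}

// Example with a fake clock: tolerate up to three missed 120s polls.
let fakeTime = 0;
const watchdog = createWatchdog(3 * 120000, () => fakeTime);
watchdog.touch();                 // message received at t = 0
fakeTime = 5 * 60000;
console.log(watchdog.isStale());  // false: 5 minutes of silence is tolerated
fakeTime = 7 * 60000;
console.log(watchdog.isStale());  // true: more than 6 minutes without a message
```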

stason325 commented 1 year ago

I understand your concern 🤷‍♂️ What do you propose here? ...please give me direction at least

Tazintosh commented 1 year ago

You could provide a checkbox like "Send message only if payload is different" next to the polling options (or the other way around). Or the Autopolling checkbox could be fully replaced by a dropdown with the following options:
• Disabled
• Enabled, only on change
• Enabled, at each interval (always)
When enabled, the polling interval input would become visible or editable.
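As a rough sketch of how the three proposed options could translate into an emit decision: the mode names and the `shouldEmit` function are hypothetical and do not reflect how the package implements (or will implement) this.

```javascript
// Hypothetical mode names mirroring the three proposed dropdown options.
const PollingMode = Object.freeze({
    DISABLED: "disabled",
    ON_CHANGE: "on-change",
    ALWAYS: "always",
});

// Decide whether a poll result should be sent as a message.
function shouldEmit(mode, previous, current) {
    switch (mode) {
        case PollingMode.ALWAYS:
            return true; // emit at each interval, even if nothing changed
        case PollingMode.ON_CHANGE:
            return JSON.stringify(previous) !== JSON.stringify(current);
        default:
            return false; // disabled or unknown mode
    }
}

const before = { temperature: 16.5 };
const after = { temperature: 16.5 };
console.log(shouldEmit(PollingMode.ALWAYS, before, after));    // true
console.log(shouldEmit(PollingMode.ON_CHANGE, before, after)); // false
console.log(shouldEmit(PollingMode.DISABLED, before, after));  // false
```

The "always" mode would also act as a heartbeat: a missing message then means the device did not answer, not that nothing changed.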

stason325 commented 1 year ago

It makes sense 👍🏻 I think I will implement this in the next version, 0.5.13

Tazintosh commented 1 year ago

Great! Thank you for your consideration.

Tazintosh commented 1 year ago

Hi,

I'm following up on this issue because I'm convinced something's still going wrong and I think I found at least one reason.

• My Office purifier (mb3) keeps crashing over and over… (busy)
• When it hasn't crashed, it reports values, but the number of log entries makes no sense based on your documentation (same for v7): rather than getting one log entry each time a value changes, I'm getting many of them in a row (see the screenshot below), all at the exact same time. If I compare the values, they are absolutely identical.
• While mb3 is reporting, I get no reports from my other purifier (v7).
• When mb3 crashes, I start getting reports from v7.

--> Looks like the two are not happy to live together.
--> The multiple log entries have to do with the way I'm deploying: I'm used to deploying "modified flows", and it looks like this creates a huge mess with your palette, as if it were creating another instance each time I click "deploy". However, when I do a "full" deploy, reports are back to just one.

Hope that helps.

Screenshot 2023-01-23 at 12 22 58

stason325 commented 1 year ago

Well... there are several points here.
1) I personally use my nodes with several devices and it all works with no problem.
2) (This is critical.) Why are you deploying nodes that haven't changed? It sort of contradicts Node-RED's logic. You need to know that after a full deploy the rbe nodes' data could be cleared and all your flows start from the very beginning => you flood your devices with numerous requests at the same moment => the device could freeze and stop responding.

What I can propose here:
a) use context variables (flow.set & flow.get) instead of rbe and check for changes in these variables - this can prevent your flow from looping
b) do not deploy nodes unless you haven't make some changes in them
c) try the custom-json send-node instead of separate nodes for each command (commands are executed asynchronously inside it)
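Suggestion a) might look roughly like the sketch below. It is written as a plain function taking a `flow`-like object so it can run outside Node-RED; the function name and the context key `lastPurifierPayload` are made up for illustration. Whether the stored value survives a deploy or restart depends on how the context store is configured, which is not covered here.

```javascript
// Forward a message only if its payload differs from the one stored in
// flow context. `flow` is expected to offer get(key) and set(key, value),
// as the `flow` object does inside a Node-RED Function node.
function forwardIfChanged(flow, msg, key = "lastPurifierPayload") {
    const serialized = JSON.stringify(msg.payload);
    if (flow.get(key) === serialized) {
        return null; // same as last time: drop the message, which breaks a feedback loop
    }
    flow.set(key, serialized);
    return msg;
}

// Inside a Function node the body would simply be:
//     return forwardIfChanged(flow, msg);

// Minimal stand-in for the flow context, only for trying this outside Node-RED.
const store = new Map();
const fakeFlow = { get: (k) => store.get(k), set: (k, v) => store.set(k, v) };
console.log(forwardIfChanged(fakeFlow, { payload: { power: true } }) !== null); // true
console.log(forwardIfChanged(fakeFlow, { payload: { power: true } }) !== null); // false
```

In a Function node, returning `null` means no message is passed on, so an unchanged value never travels back around the loop.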

Anyway you need to reconsider your flow a bit to make it safer in terms of reboots/full deploys etc.

Regards

Tazintosh commented 1 year ago

Didn't I say the opposite of your point 2? I used to never do a "full" deploy, and had issues. But when I do a full deploy, everything works better.

b) do not deploy nodes unless you haven't make some changes in them

You mean "unless you have", no?

stason325 commented 1 year ago

Got it. Ok👍🏻

Try using context variables instead of rbe nodes... as one possibility, your flow could be looping somewhere.

Tazintosh commented 1 year ago

I finally thought about fully resetting my Air Purifier 3H. After setting it back up (token…), it turns out everything is working much better (no crash yet). I have no idea what caused it to go so wrong, but I hope it won't happen again. Requests and feedback are super slow though (can't wait for your code update to stop the devices from relying on the Mi server).

Tazintosh commented 1 year ago

Well… spoke too soon, the device just hung… I'm sure it has issues handling too many(?) requests
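If the device really does struggle with bursts of requests (this is a suspicion in the thread, not a confirmed cause), one way to test the theory would be to space commands out. The sketch below is a hypothetical helper that computes how long each command should wait so that sends are at least `minGapMs` apart; the 1000 ms gap is an arbitrary assumption.

```javascript
// Hypothetical rate limiter: returns, for each command, the number of
// milliseconds to wait so that consecutive sends are at least `minGapMs` apart.
// `now` is injectable so the logic can be tried with a fake clock.
function createRateLimiter(minGapMs, now = Date.now) {
    let nextAllowed = 0;
    return function delayFor() {
        const t = now();
        const wait = Math.max(0, nextAllowed - t);
        nextAllowed = t + wait + minGapMs;
        return wait;
    };
}

// Example with a fake clock: three commands arrive at once, one arrives later.
let clock = 0;
const delayFor = createRateLimiter(1000, () => clock);
console.log(delayFor()); // 0    -> send immediately
console.log(delayFor()); // 1000 -> send one second later
console.log(delayFor()); // 2000 -> send two seconds later
clock = 5000;
console.log(delayFor()); // 0    -> the queue has drained, send immediately
```

The returned delay could be passed to `setTimeout` before sending a command. Node-RED's core delay node also offers a rate-limit mode that serves the same purpose without custom code.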

stason325 / node-red-contrib-miio-localdevices

Polling issue and instabilities #15