yadomi / node-red-contrib-philipshue-events

Implements the Philips Hue API v2 EventSource as a node-red node
MIT License

randomly stops receiving events #7

Closed marc-gist closed 2 years ago

marc-gist commented 2 years ago

I am having issues where randomly your node stops receiving events from the hub and NodeRed must be restarted. Is there a way for you to implement an input that would restart/reset the connection to the hub?

Thanks!

andesse commented 2 years ago

@marc-gist @yadomi I haven't had this issue yet.

Marc, there is another way to do this: an sse-client node. It lets you manually restart or stop the event flow. I think it's also possible to catch disconnects and errors from the node and have them trigger a restart automatically. Do you want to try this out? I haven't tried it myself yet, because yadomi's node works for me, but I do have the flow.

Flow in the next comment

andesse commented 2 years ago

@marc-gist

[{"id":"af40bef6.327ce","type":"tab","label":"HUE STREAM","disabled":false,"info":""},{"id":"8ed3a90d.0b3368","type":"sse-client","z":"af40bef6.327ce","d":true,"name":"Hue API v2","url":"https://192.168.0.40/eventstream/clip/v2","events":[],"headers":{},"proxy":"","restart":true,"rejectUnauthorized":false,"withCredentials":true,"timeout":"10","x":1510,"y":760,"wires":[["6c442db9.6bc654","bfc4369a.138aa8"]]},{"id":"a9d00346.c2cc7","type":"debug","z":"af40bef6.327ce","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","statusVal":"","statusType":"auto","x":1950,"y":760,"wires":[]},{"id":"118b4c6e.005fc4","type":"inject","z":"af40bef6.327ce","name":"(Re-)start stream","props":[{"p":"ip","v":"192.168.0.40","vt":"str"},{"p":"headers","v":"{\"hue-application-key\":\"eNCdp8lxYfUohrLKhW35sBpYQ123C2GMeP6Ludf4\"}","vt":"json"}],"repeat":"","crontab":"","once":true,"onceDelay":"0.7","topic":"","x":1250,"y":760,"wires":[["8ed3a90d.0b3368","8ed3a90d.0b3368"]]},{"id":"7651419c.9e53f","type":"inject","z":"af40bef6.327ce","name":"Stop stream","props":[{"p":"stop","v":"true","vt":"bool"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payloadType":"str","x":1230,"y":900,"wires":[["8ed3a90d.0b3368","8ed3a90d.0b3368"]]},{"id":"c6793143.7fa55","type":"inject","z":"af40bef6.327ce","name":"Pause stream","props":[{"p":"pause","v":"true","vt":"bool"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payloadType":"str","x":1230,"y":940,"wires":[["8ed3a90d.0b3368","8ed3a90d.0b3368"]]},{"id":"6c442db9.6bc654","type":"json","z":"af40bef6.327ce","d":true,"name":"","property":"payload","action":"","pretty":false,"x":1635,"y":760,"wires":[["4bbd79c3.b4a2f8","4bbd79c3.b4a2f8"]],"l":false},{"id":"4bbd79c3.b4a2f8","type":"split","z":"af40bef6.327ce","name":"","splt":"\n","spltType":"str","arraySplt":1,"arraySpltType":"len","stream":false,"addname":"id","x":1695,"y":760,"wires":[["7c0978b8.6474c8","7c0978b8.6474c8"]],"l":false},{"id":"7c0978b8.6474c8","type":"change","z":"af40bef6.327ce","name":"","rules":[{"t":"move","p":"payload.data","pt":"msg","to":"payload","tot":"msg"}],"action":"","property":"","from":"","to":"","reg":false,"x":1755,"y":760,"wires":[["567116c4.33a7c8","567116c4.33a7c8"]],"l":false},{"id":"567116c4.33a7c8","type":"split","z":"af40bef6.327ce","name":"","splt":"\n","spltType":"str","arraySplt":1,"arraySpltType":"len","stream":false,"addname":"id","x":1815,"y":760,"wires":[["a9d00346.c2cc7","a9d00346.c2cc7"]],"l":false},{"id":"bfc4369a.138aa8","type":"debug","z":"af40bef6.327ce","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","statusVal":"","statusType":"auto","x":1670,"y":720,"wires":[]},{"id":"25431724.1dab38","type":"comment","z":"af40bef6.327ce","name":"HUE Key & IP here","info":"","x":1250,"y":720,"wires":[]},{"id":"df9ad39f.e5314","type":"comment","z":"af40bef6.327ce","name":"IP here","info":"","x":1510,"y":720,"wires":[]},{"id":"9cbe5201.464d6","type":"comment","z":"af40bef6.327ce","name":"DO NOT DEPLOY WITHOUT THE SSE NODE!","info":"","x":1610,"y":820,"wires":[]}]

Install the SSE client node first. It probably works with the Node-RED HTTP node as well, but I couldn't figure out how. If you get that running, please share how.

marc-gist commented 2 years ago

@andesse thanks, I'll try that node next. Your code won't import because of some strange "token" issue, but I have the SSE node running. I just had to convert the output to JSON and split up packets that had more than one message in a payload. Fingers crossed this keeps working when whatever causes the messages to stop happens again.

andesse commented 2 years ago

@marc-gist

Strange. I'll check again why this happened. Cool that it works for you. Do you think it's possible to use the built-in http node to receive an event stream? I haven't figured it out yet, although I tried a lot. Do you have an idea?

If that were possible, it would be completely contrib-free.

marc-gist commented 2 years ago

The HTTP node does not hold connections open in the way that SSE (Server-Sent Events) requires. Unfortunately, Hue went with SSE rather than WebSockets, which have an easier "keep alive" and reconnection-checking mechanism.

Achronite commented 2 years ago

I'm also having issues with the event stream. The philipshue-events node just stops after a while with a node status of 'Error' with no further info and no errors in node-red-log.

+1 for auto-reconnect on error

andesse commented 2 years ago

@Achronite The Node needs a fresh Bridge user ID. Should work then.

Achronite commented 2 years ago

I went through a pairing operation, so I think it is already using one(?)

andesse commented 2 years ago

@Achronite OK: copy the ID (save it in a label node), delete the configs, delete the node, drag a new node into the flow, set up a new config, deploy, and try again.

Achronite commented 2 years ago

@andesse OK, I've done that (actually I did a delete config, redeploy, add node, then a re-pairing and deploy). I'll let you know if the link dies again.

andesse commented 2 years ago

@Achronite if it’s still not working download the SSE flow from my repository. It does the same.

https://github.com/andesse/hue-clip-api.node-red-flows

Achronite commented 2 years ago

So after a couple of days the event emitter has stopped emitting events. This time, the node status in the node-red GUI still shows 'Listening', but no events are emitted no matter what I do in node-red or the Hue app.

I can't find any errors in node-red-log. Are there any other things to try to diagnose this before I do a node-red-restart?

andesse commented 2 years ago

@Achronite I faced this issue just once. After that I added a new secret key in the node, and it has been running fine since then, for 2+ weeks.
It might be related to a bridge update. Did you get one?

Achronite commented 2 years ago

The SSE protocol cannot be relied upon for keeping a connection open. There needs to be a capability to reconnect to the server if the connection is dropped on the (philips) server side or any other error occurs.

I've seen examples that even have built-in timeouts to automatically re-establish a connection if it all goes quiet. The issue we have with the Philips events is that, unless you have a motion sensor, events only occur on interaction with a device, so a quiet link is not necessarily a dead one.

andesse commented 2 years ago

@Achronite

The SSE node has a function to reconnect when the connection is dropped. Is this not working?

The motion sensors could be the reason I never face problems: I have 6 sensors in a 65 square meter apartment, so something is always busy, and they also report brightness and temperature data.

I could imagine a solution with 2 (or 3) SSE nodes running with different IDs, where one automatically reconnects every 10000 ms and the other every 102594 ms (the intervals should be odd enough that the nodes are unlikely to reconnect at the same time). Both streams would then run into an RBE node so that you don't get duplicate events.

Could this be a thing?

andesse commented 2 years ago

@Achronite no, that's not working. All requests get a JSON parsing error if I try to read the stream twice. There can only be one in Node-RED.

What about restarting the stream every 5 minutes? You could add an interval in the inject node.

Achronite commented 2 years ago

I've added a console.log to catch any es.onopen() events in the config node; this should report any auto-reconnect events. None so far, and no further errors or stops since I added the debug output. As is always the case in software development! 🤣

andesse commented 2 years ago

@Achronite can you write the error to context, so that it doesn't disappear after it has appeared? It could be used to modify the flow for the SSE node. Thanks.

Achronite commented 2 years ago

It looks like the stream just stopped emitting without an es.onerror() or es.onopen() being triggered. I think we need a way of asking the node to reopen the connection, maybe on request via an input(?)

The other option would be some sort of monitoring of es.readyState in the config node; it might change when the link dies?
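
The readyState idea could be sketched roughly as below. This is only an illustration, not code from the node: `watchReadyState` and `describeReadyState` are hypothetical helper names, and `es` is assumed to be any EventSource-like object exposing the standard numeric `readyState` (0 = CONNECTING, 1 = OPEN, 2 = CLOSED). A silently stalled connection may well stay in OPEN, so this may not catch the failure described here.

```javascript
// readyState values defined by the EventSource standard.
const READY_STATES = { 0: "CONNECTING", 1: "OPEN", 2: "CLOSED" };

// Turn the numeric readyState into a readable label.
function describeReadyState(es) {
  return READY_STATES[es.readyState] || `UNKNOWN(${es.readyState})`;
}

// Hypothetical helper: polls es.readyState and calls
// onChange(newState, oldState) whenever a poll sees a different state.
function watchReadyState(es, onChange, intervalMs = 10000) {
  let last = describeReadyState(es);
  const check = () => {
    const now = describeReadyState(es);
    if (now !== last) {
      onChange(now, last);
      last = now;
    }
  };
  const handle = setInterval(check, intervalMs);
  // check is exposed so a poll can also be triggered manually.
  return { check, stop: () => clearInterval(handle) };
}
```

In the config node this could be wired up as `watchReadyState(es, (now, before) => console.log(before, "->", now))`, with `stop()` called when the node is closed.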

Achronite commented 2 years ago

I've now added code to your node/config that reconnects on an error or on a quiet link (5 mins). I'll run it for a few days to see how it gets on.

Achronite commented 2 years ago

So... after various debugging sessions and link restarts, it seems that the event stream either

  1. Stops with an .onerror event, the error being an ECONNRESET, or
  2. Goes quiet (5 mins); on a reconnect it sometimes works the first time, other times it reports a 503 for up to 1 hr.

I'm going to take a look at the 503 now; maybe the link is actually still active?

On the positive side, I no longer need to restart node-red to get the link working again.

yadomi commented 2 years ago

Hello, I've just updated the package to v1.4.0 that adds a reconnection mechanism on errors.

Hopefully this fixes the various dead-connection issues. The bridge is not easy to debug because the error isn't easy to reproduce.

Let me know if you still have the issue :) (and sorry for the late reply)

andesse commented 2 years ago

@yadomi thanks! Much appreciated. Awesome contrib.

Achronite commented 2 years ago

(Note: I'm still on my modified codebase whilst I nail this!) I had a different onerror last night - this time a socket hang up.

I'm also still getting a lot of inactivity timeouts, I had 45 last night! It seems to happen especially during the night. I only have 1 PIR that is regularly reporting, maybe the light_level / temperature / battery_level is reported less overnight? Or maybe they are only reported if the value changes?

Anyway... I managed to fix the 503 on reconnect by always calling es.close() first. I also let my inactivity timeout do any reconnecting to avoid spamming the bridge.

I've just upped my inactivity timer from 5 to 10 mins. Let's see if this makes any difference.
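
For anyone following along, the approach described in this comment (close first, then reconnect, and let only the inactivity timer trigger reconnects) might look roughly like the sketch below. It is not the actual modified code: `createStreamWatchdog`, `connect`, `onEvent` and the injectable `timers` object are hypothetical names, and `connect` is assumed to return an EventSource-like object with `onmessage`, `onerror` and `close()`.

```javascript
// Sketch of an inactivity watchdog for an EventSource-like stream.
// `connect` is a hypothetical factory returning an object with
// onmessage / onerror properties and a close() method.
function createStreamWatchdog({
  connect,
  onEvent,
  timeoutMs,
  // Injectable timers (an assumption, only there to make testing easy).
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (id) => clearTimeout(id),
  },
}) {
  let es = null;
  let timer = null;
  let reconnects = 0;

  // (Re)start the countdown until the link is considered dead.
  function armTimer() {
    if (timer !== null) timers.clear(timer);
    timer = timers.set(reopen, timeoutMs);
  }

  function open() {
    es = connect();
    es.onmessage = (event) => {
      armTimer(); // any traffic proves the link is alive
      onEvent(event);
    };
    // Errors are only logged; the inactivity timer does the reconnecting,
    // so the bridge is not spammed with immediate retries.
    es.onerror = (err) => console.warn("event stream error:", err && err.message);
    armTimer();
  }

  function reopen() {
    reconnects += 1;
    if (es) es.close(); // always close the old stream before reconnecting
    open();
  }

  function stop() {
    if (timer !== null) timers.clear(timer);
    timer = null;
    if (es) es.close();
    es = null;
  }

  open();
  return { stop, reconnectCount: () => reconnects };
}
```

With a 10 minute timer this would be created with `timeoutMs: 10 * 60 * 1000`; the right value depends on how often the bridge actually has something to report.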

Thanks @andesse by the way - you are the one that brought me here in the first place via your flows :-)

andesse commented 2 years ago

@Achronite buy 5 more motion sensors, I never had an issue. :p Maybe share your code with @yadomi; it could probably be added? I am not enough of a programmer to tell.

Felix, did you notice my repository? It has helped a lot of people having issues with HueMagic. It's a flow example using HTTP requests and your contrib (or an SSE node) to run a Hue setup.

Achronite commented 2 years ago

Thought it was time for an update on my testing.

I upped my connection timeout to 30 minutes 10 seconds; since then I have had no reconnection timeouts. The variation in events that my hub receives from a single PIR seems very random. Generally at least 1 is received every 30 minutes, hence my chosen timeout value.

Now I'm not sure whether some events aren't being emitted or aren't being received, but it does seem overly random.

Let me know if you are interested in a pull request that adds in the timeout mechanism.

marc-gist commented 2 years ago

@Achronite thanks for the work and the note. I have had no noticeable issues with the SSE node, with the timeout set to 120 seconds. This node (philipshue-events) appeared to output only one event at a time. Not sure if that was coincidence, but with the SSE node I started to get multiple items in the data array in the payload. Therefore, you can't just use msg.payload.data.0.id to look for events; if msg.payload.data.length > 1, you have to search all of them. I ended up writing a custom function node that breaks up the data array so that my flows using msg.payload.data.0.id = X still work.
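
A function node along those lines could look something like this. It is a sketch rather than the exact node used here: `splitHueEvents` is a hypothetical helper name, and it assumes `msg.payload.data` is an array of items as described above.

```javascript
// Hypothetical helper: turn one message whose payload.data holds several
// items into one message per item, so that msg.payload.data[0].id
// keeps working in downstream flows.
function splitHueEvents(msg) {
  const items =
    msg.payload && Array.isArray(msg.payload.data) ? msg.payload.data : [];
  // Copy the message for each item and keep exactly one entry in data.
  return items.map((item) => ({
    ...msg,
    payload: { ...msg.payload, data: [item] },
  }));
}

// Inside a Node-RED function node the body would end with:
//   return [splitHueEvents(msg)];
// Returning an array inside the output array sends each message in turn.
```

Messages without a data array produce no output, which drops anything that is not a Hue event.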

andesse commented 2 years ago

@yadomi thanks for all the effort, this is my most important contrib! I am wondering how big your own Hue setup is.

yadomi commented 2 years ago

Everything seems to be stable since the last release, so I'm closing the issue. Feel free to open a new issue if you run into anything else 🚀