Closed canedje closed 3 months ago
Looking into it
node-red/node-red/issues/4780
v1.0.1 released.
This addresses a missing config parameter that was introduced in v1. Opening the Protect node, saving, and deploying will fix it,
but v1.0.1 stops it for future users updating to the v1 release.
Still crashing. Not as much as before. Now once a day:
23 Jun 21:00:26 - [error] [change:5bb589225a0ed484] Invalid JSONata expression: Unable to cast value to a number: "-3;;f;l"
23 Jun 21:21:08 - [error] [api-call-service:Notify all] InputError: Invalid JSON: {"title":"LICHT:","message":"Licht woonkamer aangezet door lichtsensor.
waarde = 12","data":{"ttl":0,"priority":"high"}}
23 Jun 21:25:26 - [error] [change:5bb589225a0ed484] Invalid JSONata expression: Argument 1 of function "number" does not match function signature
23 Jun 22:55:26 - [error] [change:5bb589225a0ed484] Invalid JSONata expression: Argument 1 of function "number" does not match function signature
23 Jun 23:53:23 - [red] Uncaught Exception:
23 Jun 23:53:23 - [error] Error: WebSocket is not open: readyState 0 (CONNECTING)
at WebSocket.ping (/data/node_modules/ws/lib/websocket.js:361:13)
at SharedProtectWebSocket.<anonymous> (/data/node_modules/node-red-contrib-unifi-os/build/SharedProtectWebSocket.js:74:72)
at Generator.next (<anonymous>)
at fulfilled (/data/node_modules/node-red-contrib-unifi-os/build/SharedProtectWebSocket.js:5:58)
at runNextTicks (node:internal/process/task_queues:60:5)
at listOnTimeout (node:internal/timers:540:9)
at process.processTimers (node:internal/timers:514:7)
24 Jun 05:56:52 - [info]
Strange. The watchDog (where that ping occurs) restarts itself after a pong, and the watchDog only starts after a connection.
Some race condition I have missed, I guess.
private HEARTBEAT_INTERVAL = 10000
private RECONNECT_TIMEOUT = 15000

private async watchDog(): Promise<void> {
    setTimeout(async () => {
        await this.updateStatusForNodes(SocketStatus.HEARTBEAT)
        this.ws?.ping()
        const reconnectTimer = setTimeout(async () => {
            await this.updateStatusForNodes(
                SocketStatus.RECOVERING_CONNECTION
            )
            this.disconnect()
            this.connect()
        }, this.RECONNECT_TIMEOUT)
        this.ws?.once('pong', async () => {
            clearTimeout(reconnectTimer)
            await this.updateStatusForNodes(SocketStatus.CONNECTED)
            this.watchDog()
        })
    }, this.HEARTBEAT_INTERVAL)
}
RECONNECT_TIMEOUT might have fired before the pong was received, even though ping/pong should be near instant (or 1-2s at most whilst there is an active connection).
I.e. the pong took >15s (that should not be the case), which results in a new watchDog whilst the other was a ticking time bomb that fired at RECONNECT_TIMEOUT.
Will try and review it as soon as I can
OK,
I'm going to unsubscribe from the pong during a re-connect.
To me, it seems that for some reason the pong was received a lot later than it should have been.
So the re-connect started (Connecting) and, during this, the late pong was received (queuing another ping) whilst still connecting, because the original pong took its time to reach us (when it should take only 2-5 seconds).
It's all I have for now - it's a bit weird.
How the watchDog works
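To make the suspected race concrete, here is a minimal sketch of a watchdog that ignores a late pong and clears its own timers before a reconnect. It is not the code from node-red-contrib-unifi-os: GuardedWatchDog, PingableSocket, Scheduler and the generation counter are all hypothetical names made up for illustration, and the timer functions are injected so the behaviour can be checked without waiting 15 seconds.

```typescript
// Hypothetical sketch: none of these names exist in node-red-contrib-unifi-os.
type TimerId = number

// Injected timer functions (assumption: in real code these would wrap setTimeout/clearTimeout).
interface Scheduler {
    set(fn: () => void, ms: number): TimerId
    clear(id: TimerId): void
}

// Minimal stand-in for the parts of a WebSocket the watchdog needs.
interface PingableSocket {
    ping(): void
    oncePong(listener: () => void): void
    offPong(listener: () => void): void
}

class GuardedWatchDog {
    private generation = 0
    private heartbeatTimer: TimerId | undefined = undefined
    private reconnectTimer: TimerId | undefined = undefined
    private pongListener: (() => void) | undefined = undefined
    private readonly socket: PingableSocket
    private readonly scheduler: Scheduler
    private readonly reconnect: () => void
    private readonly heartbeatMs: number
    private readonly reconnectMs: number

    constructor(
        socket: PingableSocket,
        scheduler: Scheduler,
        reconnect: () => void,
        heartbeatMs = 10000,
        reconnectMs = 15000
    ) {
        this.socket = socket
        this.scheduler = scheduler
        this.reconnect = reconnect
        this.heartbeatMs = heartbeatMs
        this.reconnectMs = reconnectMs
    }

    start(): void {
        this.stop() // cancel anything left over from a previous cycle
        const gen = this.generation
        this.heartbeatTimer = this.scheduler.set(() => {
            if (gen !== this.generation) return // stale heartbeat
            this.socket.ping()
            const onPong = () => {
                if (gen !== this.generation) return // late pong from an old cycle
                this.start()
            }
            this.pongListener = onPong
            this.socket.oncePong(onPong)
            this.reconnectTimer = this.scheduler.set(() => {
                if (gen !== this.generation) return // stale reconnect timer
                this.stop() // unsubscribe from pong before reconnecting
                this.reconnect()
            }, this.reconnectMs)
        }, this.heartbeatMs)
    }

    stop(): void {
        this.generation++ // invalidates every callback created before this point
        if (this.heartbeatTimer !== undefined) this.scheduler.clear(this.heartbeatTimer)
        if (this.reconnectTimer !== undefined) this.scheduler.clear(this.reconnectTimer)
        if (this.pongListener !== undefined) this.socket.offPong(this.pongListener)
        this.heartbeatTimer = undefined
        this.reconnectTimer = undefined
        this.pongListener = undefined
    }
}
```

The idea is that every cycle captures the generation it was started in, so a pong that arrives after the reconnect timer has fired cannot queue another ping against a socket that is still connecting.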
Thanks for the effort
I had the same issue. Will try to upgrade this library. Thanks for the efforts! Edit: noticed this hasn't been released yet.
It's been released as beta. Navigate to the nodered install location (usually ~/.node-red on a pi)
Then run:
npm i node-red-contrib-unifi-os@1.1.0-beta.1
Then restart nodered. I started testing last night, it's working pretty well on nodered 4.0.2 running on my pi.
Nice, trying it out now.
FWIW, the plugin is still regularly crashing my node-red.
17 Jul 17:05:17 - [red] Uncaught Exception:
17 Jul 17:05:17 - [error] TypeError: Cannot read properties of undefined (reading 'toString')
at Unzip.cb (/usr/src/node-red/node_modules/node-red-contrib-unifi-os/build/nodes/Request.js:93:49)
at Unzip.zlibBufferOnError (node:zlib:146:8)
at Unzip.emit (node:events:517:28)
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3)
at processTicksAndRejections (node:internal/process/task_queues:82:21)
That's related to an attempt to parse Gzip responses. Clearly, not working...
We have already decided to not attempt it, will remove it in Beta 2
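For anyone curious about the trace above: Node's zlib buffer callbacks are called as (error, result), and on failure the result is undefined, so calling toString() on it throws. A minimal sketch of a guarded callback, assuming that is what happened at Request.js:93; makeUnzipHandler and its two parameters are hypothetical names, not part of the contrib.

```typescript
// Hypothetical sketch: makeUnzipHandler is not a function in node-red-contrib-unifi-os.
// The callback shape mirrors Node's zlib buffer callbacks: (error, result).
type UnzipCallback = (err: Error | null, result?: { toString(): string }) => void

function makeUnzipHandler(
    onText: (text: string) => void,
    onError: (err: Error) => void
): UnzipCallback {
    return (err, result) => {
        // On failure zlib passes an error and no result, so check before touching result.
        if (err || result === undefined) {
            onError(err ?? new Error('empty unzip result'))
            return
        }
        onText(result.toString())
    }
}
```

With a handler like this, a response that is not valid gzip ends up in onError instead of becoming an uncaught exception.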
BETA 2 Available
npm i node-red-contrib-unifi-os@1.1.0-beta.2
Hi Marcus, I'm on node-red-contrib-unifi-os@1.1.0-beta.2. I'm very sorry, NR crashed again about 24 hours after start.
Oh For f... S...
That seems to be outside the contrib code, during the web socket handshake itself (the ws module). Is this reproducible?
I might have to disable the watchDog - clearly Unifi doesn't like it.
https://github.com/NRCHKB/node-red-contrib-unifi-os/blob/cdbca0b8282d4e3c37eef801066b42cf9eebd656/src/SharedProtectWebSocket.ts#L189
I can't look at this for the next few days, so if anyone wants to chip in?
We are catching ws errors - so I'm clueless currently. https://github.com/NRCHKB/node-red-contrib-unifi-os/blob/cdbca0b8282d4e3c37eef801066b42cf9eebd656/src/SharedProtectWebSocket.ts#L177
This doesn't make sense. The code seems right. I'm updating the ws to 8.18.0. I'll let you know.
Hi,
made some progress. Removing ws.close() (and leaving only ws.terminate()) has improved the uptime to 24 hours. Then the MaxListenersExceededWarning starts to appear and, soon after, the system crashes.
ws.close() starts a closing handshake, so one would have to wait for the server's response or a timeout.
ws.terminate() instantly terminates the connection instead.
Now I've disabled this.watchDog() and it has been up and running for more than 24 hours.
Let's see, but I'm thinking it has to do with the timing set in the timer functions, maybe it's too short...
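A MaxListenersExceededWarning usually means listeners keep piling up on one emitter (the Node default limit is 10 per event). A minimal sketch of a teardown that detaches everything before terminating, under assumptions: SocketLike is a hypothetical interface covering only the three calls used here (ws sockets do offer removeAllListeners, on and terminate), and teardown is not a function from the contrib. The no-op error listener is there because, as far as I know, ws can still emit an 'error' when a socket is terminated while it is connecting, and an 'error' event with no listener is thrown as an uncaught exception.

```typescript
// Hypothetical sketch: SocketLike and teardown are illustrative names only.
interface SocketLike {
    removeAllListeners(): void
    on(event: 'error', listener: (err: Error) => void): void
    terminate(): void
}

function teardown(socket: SocketLike | undefined, log: (msg: string) => void): void {
    if (!socket) return
    // Drop pong/message/close listeners so they cannot fire, or accumulate, after the reconnect.
    socket.removeAllListeners()
    // Keep one error listener: an 'error' event without any listener would crash the process.
    socket.on('error', (err) => log(`ignored error on discarded socket: ${err.message}`))
    // terminate() drops the connection immediately instead of waiting for a closing handshake.
    socket.terminate()
}
```

A reconnect would then call teardown on the old socket first and only afterwards create the new one and attach fresh listeners to it.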
Thanks @Supergiovane, the Watch Dog seems to be the problem here. Without it, the nodes will need to be restarted if/whenever Protect decides to crash or is rebooted (by the user).
@crxporter Would we truly miss recovery? I need to put more thought into the watch dog if so
Hi guys, disabling this.watchDog solved the issue. No more NR crashes. As soon as I return home from holiday, I'll take a look at the watchdog.
Would we truly miss recovery? I need to put more thought into the watch dog if so
Generally my pi restarts less often than a unifi update - personally I would appreciate a smooth recovery after protect (or network) software update / reboot / other...
Is there a new beta I can use where it doesn't crash?
Not yet...
I'm just too busy with other things at the moment.
As noted by @Supergiovane - commenting out the watchDog stops any crashing.
Just make sure to comment out the .js file, not the .ts one.
Understood, thanks. I made the change in the built package inside my docker container for now.
Hi pakerfeldt, Can you try this file? Just decompress the zip and put the .js into the build folder of node-red-contrib-unifi-os. SharedProtectWebSocket.js.zip
This should fix the issue. Please let us know. Thanks.
Sure, I'm running it as of now. Will monitor and see how it works. 👍
Hi @pakerfeldt, please use this file instead. I've added an error handler so errors shouldn't crash NR. SharedProtectWebSocket.js.zip
Sure! 👍
Crashed earlier ...
8 Aug 16:49:31 - [red] Uncaught Exception:
8 Aug 16:49:31 - [error] Error: Unexpected server response: 500
at ClientRequest.<anonymous> (/usr/src/node-red/node_modules/node-red-contrib-unifi-os/node_modules/ws/lib/websocket.js:913:7)
at ClientRequest.emit (node:events:517:28)
at HTTPParser.parserOnIncomingClient (node:_http_client:700:27)
at HTTPParser.parserOnHeadersComplete (node:_http_common:119:17)
at TLSSocket.socketOnData (node:_http_client:541:22)
at TLSSocket.emit (node:events:517:28)
at addChunk (node:internal/streams/readable:368:12)
at readableAddChunk (node:internal/streams/readable:341:9)
at TLSSocket.Readable.push (node:internal/streams/readable:278:10)
at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)
8 Aug 16:49:33 - [info]
Welcome to Node-RED
===================
8 Aug 16:49:33 - [info] Node-RED version: v3.1.11
8 Aug 16:49:33 - [info] Node.js version: v18.20.3
8 Aug 16:49:33 - [info] Linux 5.15.0-117-generic x64 LE
...
Hi, no, it shouldn't crash. Mine is running smoothly. Tomorrow I'll check whether I've zipped the right file!
Hi @pakerfeldt, please try this: SharedProtectWebSocket.js.zip Remember to put it into the "build" folder of your unifi-os node. I've added some logs, to better catch the error, in case it shows up again. Thanks!
Yeye, it's in the build folder, more specifically /usr/src/node-red/node_modules/node-red-contrib-unifi-os/build inside my docker container.
050e37f2be05:~/node_modules/node-red-contrib-unifi-os/build$ ls -alFh
total 88K
drwxr-xr-x 1 node-red node-red 4.0K Aug 9 09:09 ./
drwxr-xr-x 1 node-red node-red 4.0K Jul 31 08:54 ../
-rw-r--r-- 1 node-red node-red 618 Jul 31 08:54 Endpoints.js
-rw-r--r-- 1 node-red node-red 6.7K Jul 31 08:54 EventModels.js
-rw-r--r-- 1 node-red node-red 11.9K Aug 9 09:09 SharedProtectWebSocket.js
-rw-r--r-- 1 node-red node-red 10.3K Aug 5 20:24 SharedProtectWebSocket.js.old
-rw-r--r-- 1 node-red node-red 10.3K Aug 7 14:06 SharedProtectWebSocket.js.old2
-rw-r--r-- 1 node-red node-red 10.7K Aug 8 12:12 SharedProtectWebSocket.js.old3
drwxr-xr-x 2 node-red node-red 4.0K Jul 31 08:54 lib/
drwxr-xr-x 3 node-red node-red 4.0K Jul 31 08:54 nodes/
drwxr-xr-x 2 node-red node-red 4.0K Jul 31 08:54 types/
Trying your latest revision as of now.
Seems to be logging stuff.
unifi-os: entered reconnectTimer
unifi-os: private disconnect()
unifi-os: private connect()
unifi-os: private disconnect()
unifi-os: entered heartBeatTimer
unifi-os: entered reconnectTimer
unifi-os: private disconnect()
unifi-os: private connect()
unifi-os: private disconnect()
unifi-os: entered heartBeatTimer
unifi-os: entered reconnectTimer
unifi-os: private disconnect()
unifi-os: private connect()
unifi-os: private disconnect()
unifi-os: entered heartBeatTimer
unifi-os: entered reconnectTimer
unifi-os: private disconnect()
unifi-os: private connect()
unifi-os: private disconnect()
Sorry, please don't shoot me. Use this updated file: SharedProtectWebSocket.js.zip
No worries. Swapping to a new version is swift. Done!
@Supergiovane
Thank you for your contribution here 🙏
if you can stop the crashing whilst keeping the watchdog intact - please put up a PR against Dev.
I'm away on my hols currently - so will become more interactive when back.
Hi @marcus-j-davies Sure, will do. Thank you to you for your contrib!
I also want to extend my gratitude to this contrib. Great work! ⭐
Hi PakerFeldt, it's running well now, isn't it? Mine has been rock solid for 5 days.
I was just about to write. Mine has also been stable since the last version I got. Not a single crash 👍
Good. Time to make a PR. Thank you for your help.
Closing as now Beta 3 is Released
NR is crashing immediately after start, caused by the unifi OS palette. I restarted several times, each time getting the same error and crashing the NR docker. I removed the unifi OS palette and now NR keeps running. NR docker version: 4.0.0, node-red-contrib-unifi-os version: 1.0.0. Error log: