victronenergy / node-red-contrib-victron

MIT License
87 stars 18 forks source link

Attempt to implement auto reconnect to DBus when connection lost #192

Closed mman closed 1 month ago

mman commented 2 months ago

This PR attempts to address #189 by implementing the following:

  1. I rely on promise-retry library, to actually never resolve() the retry promise, the only way out is actually reject(). That happens on connection timeout, or any connection error (for example when disconnected). The promise-retry will then automatically retry the connection.
  2. I remove the custom 10 second timeout, as connection via dbus-native will eventually timeout via TCP connect with error on its own. In my case after ~ 30 seconds. This is reported upwards and reconnect starts again. Moreover with setTimeout the promise could have actually been reject()ed multiple times, first on 10 sec timeout, then or tcp connect timeout. Probably not a problem in Node JS world.
  3. I changed the minTimeout, maxTimeout, factor so that reconnects happen automatically with exponential backoff between 5 and 60 seconds.

Tested on my system and looks promising, this is what the Cerbo Reboot sequence looks like at Node RED side.

Apr 23 12:26:26 rpi Node-RED[9884]: Lost connection to D-Bus.
Apr 23 12:26:31 rpi Node-RED[9884]: 2024-04-23T10:26:31.779Z node-red-contrib-victron:victron-client Retrying dbus connection.
Apr 23 12:26:31 rpi Node-RED[9884]: 2024-04-23T10:26:31.779Z node-red-contrib-victron:dbus Connecting to TCP address tcp:host=192.168.42.17,port=78.
Apr 23 12:26:54 rpi Node-RED[9884]: Error connecting to dbus: Error: connect EHOSTUNREACH 192.168.42.17:78
Apr 23 12:27:04 rpi Node-RED[9884]: 2024-04-23T10:27:04.297Z node-red-contrib-victron:victron-client Retrying dbus connection.
Apr 23 12:27:04 rpi Node-RED[9884]: 2024-04-23T10:27:04.298Z node-red-contrib-victron:dbus Connecting to TCP address tcp:host=192.168.42.17,port=78.
Apr 23 12:27:07 rpi Node-RED[9884]: Error connecting to dbus: Error: connect EHOSTUNREACH 192.168.42.17:78
Apr 23 12:27:27 rpi Node-RED[9884]: 2024-04-23T10:27:27.379Z node-red-contrib-victron:victron-client Retrying dbus connection.
Apr 23 12:27:27 rpi Node-RED[9884]: 2024-04-23T10:27:27.380Z node-red-contrib-victron:dbus Connecting to TCP address tcp:host=192.168.42.17,port=78.
Apr 23 12:27:27 rpi Node-RED[9884]: 2024-04-23T10:27:27.396Z node-red-contrib-victron:dbus Connected to D-Bus.

The current drawbacks / unresolved points:

1. Retry timeout will progressively increase and never reset back

Retry will backoff exponentially with each attempt, so timeout will progressively increase from 5 seconds to 60 seconds and will never reset back on successful connection. Potential solution would be to not use exponential backoff but rather random reconnect timeout within certain range (already supported by the promise-retry lib).

Or wrap inner connect retry with outer retry (infinite while loop), but that relies on propagating disconnect event back from the inner retry. Perhaps we can use the Promise.resolve() for that, which would mean that when the inner connect promise is rejected the connect failed, and when it is resolved the connection successfully ended and needs to be tried again.

2. Promise without resolve?

Never before have I done never calling Promise.resolve() so I am not sure whether it is safe for production, I think yes because the code will essentially always Promise.reject()