victronenergy / node-red-contrib-victron

MIT License

[BUG] Victron Node RED Nodes do not survive Cerbo GX reboot when connected to DBus via TCP #189

Closed mman closed 1 month ago

mman commented 5 months ago

Describe the bug

I am running Node-RED on a Raspberry Pi and connecting to DBus on a Cerbo GX by specifying the environment variable NODE_RED_DBUS_ADDRESS=<IP>:78. This works very well until the Cerbo GX reboots, for example due to a firmware update.

To Reproduce

Steps to reproduce the behavior:

  1. Configure Node-RED on Linux with NODE_RED_DBUS_ADDRESS=<IP>:78, where <IP> is the address of the remote Cerbo GX.
  2. Add, for example, a Battery Monitor Victron node to the dashboard, and let it report values.
  3. Open the Cerbo GX Remote Console and go to Settings->General->Reboot.
  4. All Victron nodes go to the disconnected state.
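
For reference, the address format used in step 1 is a plain `<host>:<port>` pair. The helper below is a hypothetical illustration of how such a value could be split and validated; `parseDbusAddress` is not a function from node-red-contrib-victron, and the package may parse the variable differently.

```javascript
// Hypothetical helper, not part of node-red-contrib-victron: splits a
// "<host>:<port>" string such as the NODE_RED_DBUS_ADDRESS value above.
function parseDbusAddress(value) {
  if (!value) return null; // variable unset: no remote TCP address configured
  const idx = value.lastIndexOf(':');
  if (idx <= 0) throw new Error(`expected <host>:<port>, got "${value}"`);
  const host = value.slice(0, idx);
  const port = Number(value.slice(idx + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port in "${value}"`);
  }
  return { host, port };
}

// Prints null when the variable is not set.
console.log(parseDbusAddress(process.env.NODE_RED_DBUS_ADDRESS));
```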

Expected behavior

Victron nodes should become disconnected when the Cerbo GX disappears, but they should probably attempt to reconnect periodically, with exponential backoff, once they are in that state.
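
To make the "exponential backoff" suggestion concrete, here is a minimal sketch. It is not the package's implementation: `connect` is a hypothetical stand-in for whatever opens the D-Bus TCP connection, and the 1 s base delay and 60 s cap are assumed values chosen only for illustration.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `connect` (hypothetical: any function returning a promise that
// rejects while the Cerbo GX is unreachable) until it resolves.
// `sleep` can be injected to replace the real timer, e.g. in tests.
async function connectWithBackoff(connect, { baseMs = 1000, maxMs = 60000, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await connect();
    } catch (err) {
      const delay = backoffDelay(attempt, baseMs, maxMs);
      console.warn(`connect failed (${err.message}), retrying in ${delay} ms`);
      await wait(delay);
    }
  }
}
```

With these defaults the waits grow as 1 s, 2 s, 4 s, 8 s and so on until they reach the 60 s cap, so a Cerbo GX that is back within a minute is picked up quickly while a long outage does not cause a tight retry loop.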

Screenshots

Normal functionality:

Screenshot 2024-01-30 at 15 32 18

After Cerbo GX reboot:

Screenshot 2024-01-30 at 15 33 28

Node RED system log confirms DBus is gone:

$ systemctl status nodered
● nodered.service - Node-RED graphical event wiring tool
     Loaded: loaded (/lib/systemd/system/nodered.service; enabled; preset: enabled)
     Active: active (running) since Mon 2024-01-29 20:48:03 CET; 18h ago
       Docs: http://nodered.org/docs/hardware/raspberrypi.html
   Main PID: 606 (node-red)
      Tasks: 15 (limit: 8733)
        CPU: 50min 9.014s
     CGroup: /system.slice/nodered.service
             └─606 node-red

Jan 30 09:20:46 raspberrypi-ismove Node-RED[606]: 30 Jan 09:20:46 - [info] [ui-base:Dashboard] Disconnected x3uQYPOfjCJQr9CDAAAt due to transport close
Jan 30 09:52:21 raspberrypi-ismove Node-RED[606]: 30 Jan 09:52:21 - [info] [ui-base:Dashboard] Disconnected 6q9Kj30hUMwCJ1H_AAAv due to transport close
Jan 30 10:46:05 raspberrypi-ismove Node-RED[606]: 30 Jan 10:46:05 - [info] [ui-base:Dashboard] Disconnected PRqk4Z3Ib1mqUkseAAAx due to transport close
Jan 30 10:49:57 raspberrypi-ismove Node-RED[606]: 30 Jan 10:49:57 - [info] [ui-base:Dashboard] Disconnected b77caO1Qso5tdcCtAAAz due to transport close
Jan 30 11:50:01 raspberrypi-ismove Node-RED[606]: 30 Jan 11:50:01 - [info] [ui-base:Dashboard] Disconnected r3GtAqQdmQq6KTSIAAA1 due to transport close
Jan 30 12:17:54 raspberrypi-ismove Node-RED[606]: 30 Jan 12:17:54 - [info] [ui-base:Dashboard] Disconnected grPQsv4G8m61PFsjAAA3 due to ping timeout
Jan 30 13:02:01 raspberrypi-ismove Node-RED[606]: 30 Jan 13:02:01 - [info] [ui-base:Dashboard] Disconnected Mql-c-WLe-JIzUNQAAA6 due to transport close
Jan 30 13:06:20 raspberrypi-ismove Node-RED[606]: 30 Jan 13:06:20 - [info] [ui-base:Dashboard] Disconnected ASE0Hf9geOcsLB73AAA8 due to transport close
Jan 30 15:22:52 raspberrypi-ismove Node-RED[606]: 30 Jan 15:22:52 - [info] [ui-base:Dashboard] Disconnected eb3MBrrMYLGk1dylAAA- due to transport close
Jan 30 15:32:59 raspberrypi-ismove Node-RED[606]: Lost connection to D-Bus.

Additional context

The code responsible for handling errors and disconnects from DBus seems to live here, but I am not sure what happens after it propagates the error up with reject.

https://github.com/victronenergy/node-red-contrib-victron/blob/24953fc521ce648f60a4ce1c0a7190f5442954f1/src/services/dbus-listener.js#L114

dirkjanfaber commented 5 months ago

I am able to reproduce it. Not sure yet how to fix it though.

mman commented 5 months ago

@dirkjanfaber I have faced a similar issue in Venus Influx Loader, where the TCP connection to InfluxDB may get interrupted and needs to be re-established when that happens. In Venus Influx Loader, data collected in the meantime is cached and then flushed once the connection is back.

In this repo it is easier since we only need to reconnect periodically if the dbus connection fails.

In Venus Influx Loader this is done by invoking and re-scheduling the start method linked here:

https://github.com/victronenergy/venus-influx-loader/blob/c914a0eef719bee8c3f7f6de17e25ef59ee09fe6/src/server/influxdb.js#L44-L52
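
A minimal sketch of that "invoke and re-schedule start" pattern, written from scratch rather than copied from Venus Influx Loader. All names are hypothetical: `createConnection` stands in for whatever opens the D-Bus TCP connection and is assumed to return a promise for an object that emits `'end'` when the connection is lost; `onStatus` is an assumed callback for updating node status.

```javascript
// Hypothetical wrapper; none of these names come from either repository.
class ReconnectingClient {
  constructor(createConnection, retryMs = 5000, onStatus = () => {}) {
    this.createConnection = createConnection;
    this.retryMs = retryMs;
    this.onStatus = onStatus;
    this.timer = null;
    this.stopped = false;
  }

  async start() {
    this.timer = null;
    if (this.stopped) return;
    try {
      const conn = await this.createConnection();
      this.onStatus('connected');
      // When the peer goes away (e.g. Cerbo GX reboot), schedule a new start().
      conn.once('end', () => {
        this.onStatus('disconnected');
        this.scheduleRestart();
      });
    } catch (err) {
      this.onStatus('disconnected', err);
      this.scheduleRestart();
    }
  }

  scheduleRestart() {
    if (this.stopped || this.timer) return; // never queue two restarts
    this.timer = setTimeout(() => this.start(), this.retryMs);
  }

  stop() {
    this.stopped = true;
    clearTimeout(this.timer);
    this.timer = null;
  }
}
```

This version retries at a fixed interval; the delay passed to `setTimeout` could instead grow on each failed attempt if backoff is wanted.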

I can take a look next week...

etofi commented 1 month ago

Is there a solution to this in the meantime? After rebooting the Cerbo, all my nodes are offline.

dirkjanfaber commented 1 month ago

Yes, it was handled in this PR and is already part of the pre-released version 1.5.7, which will be part of Venus in the next beta release. After that I'll release it for the rest of the users too.

Meanwhile you can download the .tar.gz file from here too: https://github.com/victronenergy/node-red-contrib-victron/releases/tag/v1.5.17

etofi commented 1 month ago

Great - thank you very much. Why do I only get "@victronenergy/node-red-contrib-victron" version 1.5.15 via Node-Red? Shouldn't that already be 1.5.16?

dirkjanfaber commented 1 month ago

Yes, it should. And now I have accidentally updated it to 1.5.17 already, which is no problem. In a few hours all of the Node-RED caches will have been updated and you should be able to upgrade to 1.5.17 from Node-RED.