victronenergy / venus-influx-loader

NodeJS server that takes from MQTT into Influx, and config UI and still more
MIT License
14 stars 6 forks source link

Is it correct to send the KeepAlive at 62 seconds? #71

Closed frbuceta closed 4 months ago

frbuceta commented 6 months ago

If the FashMQ KeepAlive is 60 seconds, why does this project send the KeepAlive message at 62 seconds?

https://github.com/victronenergy/venus-influx-loader/blob/309b7209af59d461778fb9d0314f9aa07574a49d/src/server/loader.js#L7

If you watch MQTT with MQTT Explorer you will see that it stops broadcasting before sending the KeepAlive message.

mman commented 6 months ago

This code has been there before the migration to FlashMQ, let me please investigate the history behind this and get it addressed appropriately...

mman commented 5 months ago

@frbuceta Thanks for pointing out this issue. Looking at the code it looks like we are actually using the legacy method as described here: https://github.com/victronenergy/dbus-mqtt/blob/master/README.md#using-the-system0serial-method.

Now if I understand it correctly, that method would cause the Venus system to dump all monitored values to MQTT every 62 seconds, and then continuously publish only the changed values. With new implementation that may cause repeated load on the Venus device and possibly storing unchanged data into InfluxDB repeatedly.

https://github.com/victronenergy/venus-influx-loader/blob/309b7209af59d461778fb9d0314f9aa07574a49d/src/server/loader.js#L72-L75

We should probably switch to the new method: https://github.com/victronenergy/dbus-mqtt/blob/master/README.md#summary-the-important-bits where we subscribe to everything and re-send the keep alive every 30 seconds as outlined in the document.

That way we should only be receiving changed values, and not everything every minute, thus reducing the InfluxDB storage needs a bit.

I have to perform some tests on real devices to confirm the behaviour and see how to properly merge this.

wiebeytec commented 5 months ago

What @mman suggests is outdated already :)

The dbus-flashmq plugin contains the correct documentation on how to do keepalives.

If the FashMQ KeepAlive is 60 seconds, why does this project send the KeepAlive message at 62 seconds?

I suspect this is a faulty implementation. I don't know which project it is or was anymore, but I have heard of some project that refreshes the keep-alive at a period longer than the timeout, to force republish of all messages. Must be this project then.

This is how the old dbus-mqtt system behaved: it would send all topics on wake-up. By making it wake up all the time, it caused republishes. But, this system would fail if there were too clients keeping it alive: it was not a deterministic implementation.

The dbus-flashmq docs show how to use keep-alives to get a full listing, or how to just keep it from going to sleep.

mman commented 5 months ago

Thanks @wiebeytec for the explanation. My reading of the situation is the same. Intentionally sending keep alive at 62 seconds when we know 60 is the limit would cause the Venus to dump everything, and then save it to InfluxDB, but this is obviously neither necessary nor good as many dumped values do not change over time just increasing the storage requirements. I will migrate the Venus Influx Loader code to request full dump at first but then only retrieve incremental changes.

Thanks for your help, Martin

wiebeytec commented 5 months ago

There is some case to be made to request all values every minute, to make graph drawing easier (avoid gaps). Vrm-logger on Venus does this too: values are sent on change and interval.

If you do go that route, the new keep-alive mechanism provides more facilities for that. However, it will have the same non-determinism issue: if two people send keep-alives, the amount of points to store will double.

mman commented 5 months ago

There is some case to be made to request all values every minute, to make graph drawing easier (avoid gaps). Vrm-logger on Venus does this too: values are sent on change and interval.

That's a good point actually, but since we are visualising in Grafana - where one nice feature is to have super granular visibility into the data, it may actually be better to really send just changes.

Grafana can take care of smoothing the data nicely via linear or bicubic interpolation, while drawing circles at exact points recorded thus revealing even more information about when exact data point was actually pushed out.

For example the screenshot below shows how the SoC was recorded in super granular detail, where I can see that SoC gets pushed out every ~ 30 seconds, but each measurement is doubled within a second, which I assume may be a side effect of how the keep alive is implemented in the legacy code... will investigate more.

Screenshot 2024-04-22 at 11 37 23
wiebeytec commented 5 months ago

Getting a bit off-topic, but, technically, interpolation is incorrect, because it's not a continous time signal. A lollipop graph would be a better representation of sampled data. SoC is just one where it does work, but something line line voltage it wouldn't. Just because point A is 230 V and point B is 240 V, doesn't mean between that is 235 V.

What I have observed/concluded users really need, is not just a value, but what that value did over the course of the post interval. So, with an average/median + min/max or something.

mman commented 5 months ago

Getting a bit off-topic, but, technically, interpolation is incorrect, because it's not a continous time signal. A lollipop graph would be a better representation of sampled data. SoC is just one where it does work, but something line line voltage it wouldn't. Just because point A is 230 V and point B is 240 V, doesn't mean between that is 235 V.

What I have observed/concluded users really need, is not just a value, but what that value did over the course of the post interval. So, with an average/median + min/max or something.

Absolutely agree, what I wanted to say was that in the end it may be better to only visualize changes that really happened on Venus, instead of generating duplicate data points by requesting all dubs data repeatedly.

mman commented 4 months ago

Addressed by https://github.com/victronenergy/venus-influx-loader/commit/eeb4202f2908187312d2e21940931595f056475d