freakent / dbus-mqtt-devices

A driver for Victron VenusOS GX devices to support use of dbus-mqtt/flashmq-mqtt
MIT License
111 stars 24 forks source link

PV Inverter missing after reboot #17

Closed Hofyyy closed 1 year ago

Hofyyy commented 2 years ago

I hope I tested enough for this issue.

Problem: After power loss or reboot from the CCGX system, the devices are not registered anymore.

Normal behaviour:

Assumtion Bug:

From what I see, on the MQTT side all looks perfekt.

I had the issue after a complete power shutdown, and then I did not see my PV inverter anymore and checked what could be wrong.

stundenblume commented 2 years ago

The dbus-mqtt-devices driver works that way that you have to send to the device/<client id>/Status topic to register a device. I would recommend to send the appropriate massage every minute from the Tasmota device if you expect your CCGX device to reboot often.

This should do the trick: ON Time#Minute DO Publish device/%deviceid%/Status {"clientId":"%deviceid%","connected":1,"version":"v1.0 ALPHA","services":{"wr":"pvinverter"}} ENDON

freakent commented 2 years ago

Perhaps a retained MQTT message or QOS would help. However in my client code I keep retrying the status connect =1 until I get a device Id from the driver. My Arduino just reconnects and I never have to touch it.

Hofyyy commented 2 years ago

Ok I will change the tasmota ruleset to cover that.

My usecase is not to reboot the CCGX often :-) But also if it is only one power off per year, I do not want to fix it afterwards with a manual step. #robustness. So great if their is a solution. I think I will add ON Time#Minute|15 ...

freakent commented 2 years ago

I would recommend to send the appropriate massage every minute from the Tasmota device if you expect your CCGX device to reboot often.

Is the Tasmota ruleset flexible enough to enable to do this only if you don't have a device Id? I'm not a fan of continuously polling, especially since MQTT is such a grey event based system.

freakent commented 2 years ago

Can you also check the dbus-Mqtt-devices log file as per the trouble shooting tips). Let's see if there is possible anything else causing this issue.

stundenblume commented 2 years ago

How does your Arduino device detect if a deviceID disappears?

Tasmota would have to check if W/<portalID>/[...]/pvinverter/<x> is present.

Most of the Tasmota devices are ESP8266 based. These devices can't subscribe to topics with the precompiled firmware.

1h polling would also be possible or I could provide a compiled version with the necessary features.

freakent commented 2 years ago

On Mqtt disconnect you could just set the device Id to "".

I looked at my Arduino code again just now. Despite what I said earlier,I realised I don't actually do anything to handle this situation. After I connect to Mqtt I just send the registration message.

There must only be microseconds between MQTT starting and the dbus-mqtt-driver starting. I am surprised you were able to see it in real time in Mqtt explorer. We need to look at the log file.

Hofyyy commented 2 years ago

Some news from my side.

I agree that the solution is not perfect, but working.

Some more insides:

Hofyyy commented 2 years ago

I can send every logfile you need, please describe how to find and I will send it to you.

freakent commented 2 years ago

/var/log/dbus-mqtt-devices/current

As per the trouble shooting section in the Readme.

Hofyyy commented 2 years ago

Sorry for not reading the trouble section in detail. You will find the log attached log.txt .

stundenblume commented 2 years ago
* Manuel reboot of tasmota -> Worst solution, even if it is only once a year.

@Hofyyy You mean: Manual reboot of CCGX?

Hofyyy commented 2 years ago

@stundenblume No: I mean a manual reboot of the "Tasmota" device to force a reregistration on MQTT Connect signal during CCGX is running fine.

stundenblume commented 2 years ago

Sorry, missunderstud the situation. I read "situation" instead of "solution". :-D

stundenblume commented 2 years ago

On Mqtt disconnect you could just set the device Id to "".

I looked at my Arduino code again just now. Despite what I said earlier,I realised I don't actually do anything to handle this situation. After I connect to Mqtt I just send the registration message.

There must only be microseconds between MQTT starting and the dbus-mqtt-driver starting. I am surprised you were able to see it in real time in Mqtt explorer. We need to look at the log file.

I think we have two situations here.

    1. CCGX and MQTT device (eg Tasmota) are working fine together.
      1. CCGX device reboots
      2. Tasmota don't realize that it has to resend the "init massage" (publishing to device/<client id>/Status).
    1. CCGX and MQTT device (eg Tasmota) are working fine together.
      1. There is a power failure and both, the CCGX and Tasmota device are booting new
      2. Can it be possible that the Tasmota device is connecting to the MQTT broker and sends the "init massage" faster than your driver is up an running?

I think both situations should be handled by the device, not the driver.

Hofyyy commented 2 years ago

Point 2. exact my situation Point 1. i am not sure if it is not realised. I think it is the same like point 2 or?

freakent commented 2 years ago

In situation 1,

Tasmota don't realize that it has to resend the "init massage"

The device will be disconnected from mqtt, therefore it will have to reconnect to continue sending readings. The golden rule is that the device must always send registration message whenever it connects to mqtt.

Hofyyy commented 2 years ago

I can test situation 1. but i think it works because of mqtt reconnect.

gharshaw commented 2 years ago

My situation is a bit different as I have node-red communicating with CCGX. Would there be any issue with sending the connect registration message say every minute on a timer as a connect keep-alive? And this way if CCGX reboots the registration connect will be re-established

stundenblume commented 2 years ago

Maybe we should 5sec after MQTT#Connect before we send the registration.

ON mqtt#connected DO RuleTimer1 5 ENDON
ON Rules#Timer=1 DO Publish device/%deviceid%/Status {"clientId":"%deviceid%","connected":1,"version":"v1.0 ALPHA","services":{"wr":"pvinverter"}} ENDON
Hofyyy commented 2 years ago

@stundenblume Thats sound perfect. I will test that in detail :-)

freakent commented 2 years ago

@gharshaw I'm not sure what version of NodeRed you are running, but should be able to configure a "birth" message, a disconnect message and an unexpected disconnect message.

image
stundenblume commented 2 years ago

@Hofyyy, did you had a chance to test it? Was it OK with a 5sec. delay? Sebastian

Hofyyy commented 2 years ago

I am on a business trip until monday. so i will test next week

Hofyyy commented 2 years ago

I do not understand completely whats going on, but here are my first results.

  1. 10 sec are working 5 seconds are not, see rules at the end.
  2. When CCGX is running, a reboot of the tasmota with 5 or 10 sec. Rule is working all the time!?!!!?!?
  3. Critical Test usecase -> Reboot CCGX and see if Tasmota automatically reregister and come back

If 5 sec:

Here are the current working ruleset, but I think this needs further research and log file research.

PV1

Rule1 ON System#Init DO Backlog mem1 ; mem2 ; mem3 1; PowerDelta 101; TelePeriod 10 ENDON

ON mqtt#connected DO RuleTimer1 10 ENDON ON Rules#Timer=1 DO Publish device/%deviceid%/Status {"clientId":"%deviceid%","connected":1,"version":"v1.0 ALPHA","services":{"wr":"pvinverter"}} ENDON

ON Tele-Energy#Power DO Backlog Publish W/%mem1%/pvinverter/%mem2%/Ac/Power {"value": %value%}; Publish W/%mem1%/pvinverter/%mem2%/Ac/L%mem3%/Power {"value": %value%} ENDON ON Tele-Energy#Current DO Publish W/%mem1%/pvinverter/%mem2%/Ac/L%mem3%/Current {"value": %value%} ENDON ON Tele-Energy#Voltage DO Publish W/%mem1%/pvinverter/%mem2%/Ac/L%mem3%/Voltage {"value": %value%} ENDON

PV2

Rule1 ON System#Init DO Backlog mem1 ; mem2 ; mem3 1; PowerDelta 101; TelePeriod 10 ENDON

ON mqtt#connected DO RuleTimer1 10 ENDON ON Rules#Timer=1 DO Publish device/%deviceid%/Status {"clientId":"%deviceid%","connected":1,"version":"v1.0 ALPHA","services":{"wr":"pvinverter"}} ENDON

ON Tele-Energy#Power DO Backlog Publish W/%mem1%/pvinverter/%mem2%/Ac/Power {"value": %value%}; Publish W/%mem1%/pvinverter/%mem2%/Ac/L%mem3%/Power {"value": %value%} ENDON ON Tele-Energy#Current DO Publish W/%mem1%/pvinverter/%mem2%/Ac/L%mem3%/Current {"value": %value%} ENDON ON Tele-Energy#Voltage DO Publish W/%mem1%/pvinverter/%mem2%/Ac/L%mem3%/Voltage {"value": %value%} ENDON

stundenblume commented 2 years ago

I suppose the driver needs more than 5 seconds longer than Mosquittto to start after a reboot of the CCGX. That's OK or not?

freakent commented 2 years ago

The best way to find out would be to check the log file to at precisely what time the driver started. There is command line utility that will print the log file with human readable time stamps.

https://cr.yp.to/daemontools/tai64nlocal.html

I haven't been able to try this yet myself.

Hofyyy commented 2 years ago

@freakent Hey,

i think i will be able to convert the timestamps.

Which logfile is the right one to compare the deamon starting points? Is it again: /var/log/dbus-mqtt-devices/current ?

Best regards Sebastian

freakent commented 2 years ago

@Hofyyy Yes, that's the one.

Hofyyy commented 2 years ago

@freakent @stundenblume You will find the converted logfile below ...

In the reboot from: 2022-05-16 14:19:24.580360500 the register after 10 sec. does not work After that I rebooted both Tasmota devices and they registered at: 14:19:35 test.txt

Hofyyy commented 2 years ago

Command was: cat /var/log/dbus-mqtt-devices/current | tai64nlocal

Hofyyy commented 2 years ago

If I see it correct it needs something smaler then 18 sec.

2022-05-16 14:18:06.783930500 CCGX booted (1) to 2022-05-16 14:19:35.851691500 INFO:device_manager:[Connected] Result code 0