emsesp / EMS-ESP

ESP8266 firmware to read and control EMS and Heatronic compatible equipment such as boilers, thermostats, solar modules, and heat pumps
https://emsesp.github.io/docs
GNU Lesser General Public License v3.0

Missing retained messages at reconnect after broker restart #631

Closed FredericMa closed 3 years ago

FredericMa commented 3 years ago

Bug description

If ems-esp reconnects after the broker has been restarted, not all messages that should be retained are being retained. I'm missing status, heating_active, tapwater_active and info. If I reboot ems-esp afterwards, the missing retained messages are available. I noticed this since the ems-esp status showed up as unknown after a server restart (including broker). I connected to the broker and saw that the retained status message was missing. I restarted ems-esp and all was fine afterwards.

Steps to reproduce

I can simulate this by restarting the broker and checking the retained messages. The mentioned messages above are missing. I restart ems-esp and check again and the missing messages will show up as retained.

Expected behavior

Retained messages should also be available after it reconnects to the broker if it is restarted.

Screenshots After broker restart: image

After ems-esp restart: image

Device information

{
  "System": {
    "version": "2.1.1b3",
    "uptime": "000+00:36:32.301",
    "freemem": 38,
    "fragmem": 10
  },
  "Settings": {
    "enabled": "on",
    "publish_time_boiler": 10,
    "publish_time_thermostat": 10,
    "publish_time_solar": 10,
    "publish_time_mixer": 10,
    "publish_time_other": 10,
    "publish_time_sensor": 10,
    "mqtt_format": 2,
    "mqtt_qos": 0,
    "mqtt_retain": "on",
    "tx_mode": 3,
    "ems_bus_id": 11,
    "master_thermostat": 0,
    "rx_gpio": 13,
    "tx_gpio": 15,
    "dallas_gpio": 14,
    "dallas_parasite": "off",
    "led_gpio": 2,
    "hide_led": "on",
    "api_enabled": "on",
    "bool_format": 1,
    "analog_enabled": "off"
  },
  "Status": {
    "bus": "connected",
    "bus protocol": "HT3",
    "#telegrams received": 2442,
    "#read requests sent": 386,
    "#write requests sent": 0,
    "#incomplete telegrams": 0,
    "#tx fails": 3,
    "rx line quality": 100,
    "tx line quality": 100,
    "#MQTT publish fails": 0,
    "#dallas sensors": 0
  },
  "Devices": [
    {
      "type": "Boiler",
      "name": "Condens 2500/Logamax/Logomatic/Cerapur Top/Greenstar/Generic HT3 (DeviceID:0x08, ProductID:95, Version:23.04)",
      "handlers": "0x10 0x11 0x14 0x15 0x16 0x18 0x19 0x1A 0x1C 0x2A 0x33 0x34 0x35 0xD1 0xE3 0xE4 0xE5 0xE6 0xE9 0xEA"
    },
    {
      "type": "Thermostat",
      "name": "Junkers FW200 (DeviceID:0x10 ProductID:106, Version:12.14)",
      "handlers": "0xA3 0x06 0xA2 0x16F 0x170 0x171 0x172 0x165 0x166 0x167 0x168"
    },
    {
      "type": "Mixer",
      "name": "Junkers IPM (DeviceID:0x20 ProductID:102, Version:20.08)",
      "handlers": "0x10C"
    },
    {
      "type": "Mixer",
      "name": "Junkers IPM (DeviceID:0x21 ProductID:102, Version:20.08)",
      "handlers": "0x10C"
    },
    {
      "type": "Solar",
      "name": "Junkers ISM1 (DeviceID:0x30 ProductID:101, Version:23.04)",
      "handlers": "0x103 0x101"
    },
    {}
  ]
}

proddy commented 3 years ago

I'll try and reproduce here. Thanks for reporting.

MichaelDvP commented 3 years ago

It's by design. status and info are only published on first connect, not on reconnects. heating_active and tapwater_active are only published on change. status is always published with the retain flag because it is used as the last-will; the others follow the retain setting configured in the user settings.

AFAIK retain means that the broker should keep the message. If the broker is stopped, the message is lost.

We can generate a new info message with "event": "reconnect".

proddy commented 3 years ago

I can't reproduce. I set the MQTT retain to true in EMS-ESP. MQTT Clean session off. Start EMS-ESP. All the topics coming in have the retained flag. Re-start the MQTT broker and the messages are still there. This is what retain does, it makes them sticky.

FredericMa commented 3 years ago

It's specifically the status message that triggered my attention since the ems-esp information was displayed as unavailable. I'm using regular MQTT, not the HA discovery. So if the broker is restarted, entities like this will remain unavailable forever until ems-esp is restarted:

  - platform: mqtt
    state_topic: 'ems-esp/heartbeat'
    name: 'heating_ems_esp_wifi'
    unit_of_measurement: '%'
    value_template: '{{ value_json.rssi }}'
    availability_topic: 'ems-esp/status'
    payload_available: "online"
    payload_not_available: "offline"

The exact order of events in my case is like this:

The only way to solve this in my case is to reboot ems-esp after every server reboot.
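As a rough illustration of why the entity above stays unavailable, here is a simplified sketch of availability handling. The `Entity` class and its starting state are assumptions made for illustration, not Home Assistant code; only the payload names come from the YAML above.

```python
class Entity:
    """Hypothetical model of an entity that has an availability_topic."""

    def __init__(self, payload_available="online", payload_not_available="offline"):
        self.payload_available = payload_available
        self.payload_not_available = payload_not_available
        # Assumption: the entity counts as unavailable until told otherwise.
        self.available = False

    def on_availability(self, payload):
        # Only the two configured payloads change the state.
        if payload == self.payload_available:
            self.available = True
        elif payload == self.payload_not_available:
            self.available = False


def availability_after_subscribe(retained_status):
    """Subscribe to ems-esp/status; the broker delivers a retained message only if it has one."""
    entity = Entity()
    if retained_status is not None:
        entity.on_availability(retained_status)
    return entity.available


print(availability_after_subscribe("online"))  # True
print(availability_after_subscribe(None))      # False
```

With no retained status on the broker and no new status publish from ems-esp, nothing ever arrives on the availability topic, so the entity never leaves its unavailable state.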

proddy commented 3 years ago

the status topic is always retained. It'll never disappear unless you purge the MQTT broker.

FredericMa commented 3 years ago

Do you have persistence to disk enabled on your broker? I'm using mosquitto and by default persistence to disk is disabled. I never changed it so persistence is disabled. That's probably the reason why you are seeing the messages after a broker restart since they are loaded from disk and in my case mosquitto is starting clean.
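The difference between the two setups can be sketched with a toy model of a broker's retained-message store. This is not mosquitto code: the `Broker` class, its `_disk` dictionary and the `restart` method are hypothetical stand-ins, and writing to "disk" on every publish is a simplification.

```python
class Broker:
    """Toy stand-in for a broker's retained-message store."""

    def __init__(self, persistence):
        self.persistence = persistence
        self.retained = {}  # topic -> payload, held in memory
        self._disk = {}     # stands in for the on-disk persistence store

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload
            if self.persistence:
                self._disk[topic] = payload

    def restart(self):
        # In-memory state is lost; only persisted messages come back.
        self.retained = dict(self._disk) if self.persistence else {}


def status_after_restart(persistence):
    broker = Broker(persistence)
    broker.publish("ems-esp/status", "online", retain=True)
    broker.restart()
    return broker.retained.get("ems-esp/status")


print(status_after_restart(persistence=True))   # online
print(status_after_restart(persistence=False))  # None
```

Under these assumptions the retained status survives a restart only when persistence is enabled, which matches the two observations in this thread.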

proddy commented 3 years ago

that's correct. my mosquitto config has

persistence true
persistence_location /mosquitto/data/

FredericMa commented 3 years ago

Is there any reason why status (and maybe others like info) should not be published on reconnect?

Another reason why I'm not a big fan of enabling persistence on the broker is the fact that you don't know if the data is still accurate. Let's say the broker goes offline and comes up again after 2 hours but it's possible that in the meantime the ems-esp went offline. In this case you get a false positive that the device is online while it is actually not. Eventually the will message will kick in I suppose. So it will solve itself after a while but it might have already triggered automations in your home automation system which might not give the expected result.

proddy commented 3 years ago

Actually no reason. I'll move status and info to the MQTT reconnect code.

FredericMa commented 3 years ago

Great, thanks! At the moment I'm not using heating_active or tapwater_active, so it's no issue for me, but won't they suffer from the same issue? If they do, it is possible that, for example, HA thinks tapwater is active while it is not, and won't notice until it changes. So in that case you won't even notice it has not been active.

MichaelDvP commented 3 years ago

Let's say the broker goes offline and comes up again after 2 hours but it's possible that in the meantime the ems-esp went offline. In this case you get a false positive that the device is online while it is actually not.

For this case there is the last-will: the broker sets the retained status to offline if ems-esp is not connected.

Without saving, a lot of other publishes are lost too: the HA config is published once, with retain, when a device registers. The device messages seem to be published on reconnect, but that is only the messages queued while offline being sent out. If you have large intervals you have to wait for the next message.
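The general last-will mechanism mentioned above can be sketched as follows. This is a toy model of the MQTT behaviour, not broker code: the `Broker` class and the `simulate` helper are hypothetical, and the scenario shown is a client dropping off a running broker rather than a broker restart.

```python
class Broker:
    """Toy model: stores retained messages and one will per connected client."""

    def __init__(self):
        self.retained = {}  # topic -> payload
        self.wills = {}     # client_id -> (topic, payload, retain)

    def connect(self, client_id, will=None):
        # The will is registered at connect time and held by the broker.
        if will is not None:
            self.wills[client_id] = will

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload

    def disconnect(self, client_id, clean):
        # The will is published only when the connection ends uncleanly.
        will = self.wills.pop(client_id, None)
        if will is not None and not clean:
            topic, payload, retain = will
            self.publish(topic, payload, retain)


def simulate(clean):
    broker = Broker()
    broker.connect("ems-esp", will=("ems-esp/status", "offline", True))
    broker.publish("ems-esp/status", "online", retain=True)
    broker.disconnect("ems-esp", clean=clean)
    return broker.retained["ems-esp/status"]


print(simulate(clean=False))  # offline
print(simulate(clean=True))   # online
```

Because the will is itself published with the retain flag, it overwrites the retained online status, so later subscribers see offline.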

proddy commented 3 years ago

also true, having the retained flag set and telling the broker not to persist the messages sounds counter-intuitive?

FredericMa commented 3 years ago

Yes, you're right, in that case it is covered if you configure availability for all entities.

I enabled the retain flag so when HA starts it immediately uses the latest received data, otherwise they show up as unavailable until the next update. OK, it will only take 10 seconds before they get updated to the correct state instead of unavailable but I think it is nicer to use the latest retained messages.

proddy commented 3 years ago

the status topic is published on MQTT re-connect. The others don't make sense to add.

FredericMa commented 3 years ago

Thanks! Would it make sense for the other topics to follow the retain flag setting? I can imagine that it would be confusing for people that have the retain flag enabled but not all topics are actually retained. Like I said, no issue for me since I don't use them at the moment.

proddy commented 3 years ago

if the retain flag is set, all topics are sent with the retained flag set. It's always been like this. It's up to the broker how it retains them, so if it doesn't persist to disk it'll never be able to recall the messages when the broker restarts. The change I made is to send out the status topic on every MQTT re-connect.
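The effect of that change can be sketched with a small simulation. This is not the EMS-ESP source: `Device`, `Broker`, `on_connect` and the `status_on_reconnect` switch are hypothetical names, and the broker is modelled without persistence, as in the original report.

```python
class Broker:
    """Toy broker without persistence: a restart wipes retained messages."""

    def __init__(self):
        self.retained = {}

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload

    def restart(self):
        self.retained = {}


class Device:
    """Hypothetical device that publishes its status when it connects."""

    def __init__(self, broker, status_on_reconnect):
        self.broker = broker
        self.status_on_reconnect = status_on_reconnect
        self.connected_before = False

    def on_connect(self):
        first_connect = not self.connected_before
        self.connected_before = True
        # Old behaviour: first connect only. New behaviour: every connect.
        if first_connect or self.status_on_reconnect:
            self.broker.publish("ems-esp/status", "online", retain=True)


def status_after_broker_restart(status_on_reconnect):
    broker = Broker()
    device = Device(broker, status_on_reconnect)
    device.on_connect()  # initial connect
    broker.restart()     # retained messages are gone
    device.on_connect()  # device reconnects
    return broker.retained.get("ems-esp/status")


print(status_after_broker_restart(status_on_reconnect=False))  # None
print(status_after_broker_restart(status_on_reconnect=True))   # online
```

With the status published on every connect, the retained status is restored after a broker restart even when the broker does not persist to disk.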

FredericMa commented 3 years ago

Correct, sorry, brainfart from my side.