emsesp / EMS-ESP

ESP8266 firmware to read and control EMS and Heatronic compatible equipment such as boilers, thermostats, solar modules, and heat pumps
https://emsesp.github.io/docs
GNU Lesser General Public License v3.0

EMS entities not available after reboot HA platform #567

Closed stravinci closed 3 years ago

stravinci commented 3 years ago

Hi, as you can see in the screenshot, my EMS entities are unavailable; only the EMS status is available (connected) and updated every few seconds. image

This issue occurs on every full reboot of the platform running HA and MQTT. It does not reproduce after restarting only HA, and it does not reproduce after restarting only MQTT. Workaround to get the entities back: reboot EMS-ESP. There are no problems with any other MQTT devices/entities. Any idea where the issue is and how to solve it?

proddy commented 3 years ago

Which version are you on? Can you send me the output of http://ems-esp/api?device=system&cmd=report ?
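For anyone who prefers to fetch this report from a script, here is a minimal sketch. It assumes the device answers on the default hostname `ems-esp` and returns JSON, as the b8 output later in this thread suggests. The helper names `report_url` and `fetch_report` are made up for illustration and are not part of EMS-ESP.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def report_url(host: str = "ems-esp") -> str:
    """Build the system report URL quoted above for a given host or IP."""
    query = urlencode({"device": "system", "cmd": "report"})
    return f"http://{host}/api?{query}"


def fetch_report(host: str = "ems-esp", timeout: float = 5.0) -> dict:
    """Fetch and decode the report (assumes the firmware replies with JSON)."""
    with urlopen(report_url(host), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_report("192.168.2.210")` would query the device by IP instead of hostname. Older builds may not support the command at all.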

The HA config looks old, probably from a previous v2 release. The "EMS-ESP status" shouldn't be there. Which EMS-ESP devices do you see in HA? Should look like:

Screenshot 2020-10-18 225918

stravinci commented 3 years ago

I'm on version 2.1.0b6. image

The address http://192.168.2.210/ems-esp/api?device=system&cmd=report redirects me to http://192.168.2.210/ems-esp/devices
The address http://192.168.2.210/api?device=system&cmd=report returns "Invalid cmd"

proddy commented 3 years ago

Upgrade to b8 and try again please

stravinci commented 3 years ago

Reproduced, output from API:

{
  "System": {
    "version": "2.1.0b8",
    "uptime": "000+00:09:14.009",
    "freemem": 45,
    "fragmem": 12
  },
  "Settings": {
    "publish_time_boiler": 10,
    "publish_time_thermostat": 0,
    "publish_time_solar": 0,
    "publish_time_mixing": 10,
    "publish_time_other": 10,
    "publish_time_sensor": 10,
    "mqtt_format": 3,
    "mqtt_qos": 0,
    "mqtt_retain": "on",
    "tx_mode": 1,
    "ems_bus_id": 11,
    "master_thermostat": 0,
    "rx_gpio": 13,
    "tx_gpio": 15,
    "dallas_gpio": 14,
    "dallas_parasite": "off",
    "led_gpio": 2,
    "hide_led": "off",
    "api_enabled": "on",
    "bool_format": 1,
    "analog_enabled": false
  },
  "Status": {
    "bus": "connected",
    "bus protocol": "HT3",
    "#telegrams received": 384,
    "#read requests sent": 85,
    "#write requests sent": 0,
    "#incomplete telegrams": 5,
    "#tx fails": 3,
    "rx line quality": 100,
    "tx line quality": 100,
    "#MQTT publish fails": 0,
    "#dallas sensors": 0
  },
  "Devices": [
    {
      "type": "Boiler",
      "name": "HT3 (DeviceID:0x08, ProductID:95, Version:23.12)",
      "handlers": "0x10 0x11 0x14 0x15 0x16 0x18 0x19 0x1A 0x1C 0x2A 0x33 0x34 0x35 0xD1 0xE3 0xE4 0xE5 0xE6 0xE9 0xEA"
    },
    {
      "type": "Thermostat",
      "name": "FW120 (DeviceID:0x10, ProductID:192, Version:53.02)",
      "handlers": "0xA3 0x06 0xA2 0x16F 0x170 0x171 0x172 0x165 0x166 0x167 0x168"
    },
    {
      "type": "Controller",
      "name": "HT3 (DeviceID:0x09, ProductID:95, Version:23.12)",
      "handlers": ""
    }
  ]
}

proddy commented 3 years ago

are you just getting these errors with the Boiler in HA, or also the Thermostat?

stravinci commented 3 years ago

Boiler and Thermostat. The 'EMS-ESP' device with the 'EMS-ESP status' entity is online and gets updates.

proddy commented 3 years ago

It looks like HA discovery is working but the topics can't be found. Did you change the hostname? Could you use MQTTExplorer and take a look at the topics under both homeassistant/sensor/ems-esp/* and ems-esp/*, like on my system:

1 2

stravinci commented 3 years ago

Before restart: image

After restart: image

But in HA entities are unavailable.

stravinci commented 3 years ago

image

proddy commented 3 years ago

no topics in homeassistant/sensor/ems-esp/* ?

stravinci commented 3 years ago

Only this one: image

proddy commented 3 years ago

for some reason the mqtt messages are not getting through. can you try with the latest b9 build from https://github.com/proddy/EMS-ESP/tree/firmware/firmware ?

stravinci commented 3 years ago

image

proddy commented 3 years ago

any errors in the HA logs? any errors in the MQTT broker?

because I'm baffled on why it's not working

proddy commented 3 years ago

I made a b10 for you to try. It has some more debugging so we can see what is happening. In the Console do

% su
% log err
% publish ha

and see if you get any red error text appearing. hope not!

stravinci commented 3 years ago

Hah, I was trying to set up a second MQTT broker for testing, but now I will test your new build.

stravinci commented 3 years ago

image After that command I see all entities in HA.

proddy commented 3 years ago

is this on your local mqtt broker, like mosquitto.exe ? I'm still not sure why it's not working for you.

stravinci commented 3 years ago

Yes, it is a local MQTT broker on the same machine as HA. I think this machine is slow during startup and that is why you get those errors. I don't remember where, but I read that HA integrations try to reconnect after a 30 s delay. From the log above I understand that you make 3 retries with a 10 s delay, with 3 attempts in each retry? And then you stop communicating with MQTT?

proddy commented 3 years ago

the retries are actually after 200ms, very short. What I would try next is

we'll get it fixed I'm sure!

stravinci commented 3 years ago

ems_log.txt mqtt_log.txt

proddy commented 3 years ago

from the logs looks like it works?

stravinci commented 3 years ago

Yes, as I wrote in the first post ("Not reproduce after restart only MQTT."), so I think this is a performance/timing issue. Can you create a build with a longer delay between retries?

proddy commented 3 years ago

are you able to compile and upload the code yourself?

stravinci commented 3 years ago

Hi, yes, I built it a minute ago and uploaded it to my EMS-ESP; it works fine.

proddy commented 3 years ago

You can change the delay time here: https://github.com/proddy/EMS-ESP/blob/f39daa1d89a001bcf64a744fd6487c28f69b097a/src/mqtt.h#L179

proddy commented 3 years ago

I also noticed in your config you have the Publish Time for thermostat and solar set to 0, which will probably cause flooding on your network. Try putting that back to 10.
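A quick way to spot these settings in the report output is sketched below. It assumes the report has already been parsed into a dict with the same shape as the JSON posted earlier in this thread. The function name `zero_publish_times` is only illustrative.

```python
from typing import List


def zero_publish_times(report: dict) -> List[str]:
    """Return the publish_time_* settings that are set to 0, sorted by name."""
    settings = report.get("Settings", {})
    return sorted(
        key
        for key, value in settings.items()
        if key.startswith("publish_time_") and value == 0
    )


# Subset of the "Settings" block from the report posted above.
sample_report = {
    "Settings": {
        "publish_time_boiler": 10,
        "publish_time_thermostat": 0,
        "publish_time_solar": 0,
        "publish_time_mixing": 10,
        "publish_time_other": 10,
        "publish_time_sensor": 10,
        "mqtt_qos": 0,
    }
}

flagged = zero_publish_times(sample_report)
```

With the sample above, `flagged` contains the solar and thermostat entries, which are the two mentioned in this comment. Unrelated settings that happen to be 0, such as `mqtt_qos`, are ignored.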

stravinci commented 3 years ago

Ok, so I change 0 to 10 as you suggest. But it does not help. Soon I will try to change delay in code. But please look at this log. In line 146 I see that connection with MQTT back, can you verify that after back connection proper HA configurations was send? ems2_log.txt

proddy commented 3 years ago

the MQTT messages sent to HA are with the 'retain' flag so they never die, even if HA or EMS-ESP reboots. What is more concerning is line 127 with "[mqtt] MQTT disconnected: TCP". The connection is breaking a lot. I'm not sure why - you will need to check the mosquitto logs. Are you running the MQTT broker on a slow rPi, or far away from a strong WiFi signal, for example?
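To make the 'retain' behaviour concrete, here is a toy in-memory model; it is not a real MQTT client or broker. It shows the broker keeping the last retained message per topic and handing it to a subscriber that connects later. The `persistent` flag encodes an assumption that is not established in this thread: that a broker which keeps no state on disk forgets its retained messages when the broker itself restarts. The class name and the topic are hypothetical.

```python
from typing import Dict


class ToyBroker:
    """In-memory stand-in for an MQTT broker, for illustration only."""

    def __init__(self, persistent: bool) -> None:
        self.persistent = persistent
        self.retained: Dict[str, str] = {}

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        # Only retained messages are stored; others would just be forwarded.
        if retain:
            self.retained[topic] = payload

    def subscribe(self, prefix: str) -> Dict[str, str]:
        # A new subscriber immediately receives the matching retained messages.
        return {t: p for t, p in self.retained.items() if t.startswith(prefix)}

    def restart(self) -> None:
        # Assumption: without persistence the retained store does not survive.
        if not self.persistent:
            self.retained.clear()


topic = "homeassistant/sensor/ems-esp/example/config"  # hypothetical topic

volatile = ToyBroker(persistent=False)
volatile.publish(topic, "{}", retain=True)
seen_before_restart = volatile.subscribe("homeassistant/")
volatile.restart()
seen_after_restart = volatile.subscribe("homeassistant/")

durable = ToyBroker(persistent=True)
durable.publish(topic, "{}", retain=True)
durable.restart()
seen_after_durable_restart = durable.subscribe("homeassistant/")
```

In this model a reboot of HA or EMS-ESP never touches the retained store. Only a restart of a non-persistent broker empties it, which would be one scenario worth ruling out by checking the broker's configuration and logs.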

stravinci commented 3 years ago

That disconnect was triggered by me: I was trying to reproduce my issue by rebooting the platform with HA and MQTT.

proddy commented 3 years ago

ok so looks like we're dealing with a slow network here. perhaps just slowing down the publishes will help. Did you say everything worked fine before in 2.0.1? It seems to work fine on a local mosquitto broker but not the one running next to your HA server. Is that correct?

stravinci commented 3 years ago

I had no issues like that on 1.9 and later versions. Partially correct: I can restart the current MQTT service and everything still works fine, and I can restart the Windows MQTT service and everything works fine, but when I restart the machine (Android) running the HA and MQTT services, I have this issue.

proddy commented 3 years ago

then I think it's an issue with your MQTT broker running on your android box. I'll close this and re-open if others are experiencing issues.

stravinci commented 3 years ago

Hi, sorry for the delayed response. Of course I can add a reset of EMS-ESP after each reset of HA, but in my opinion that would only be a workaround. I tried to find the issue in your code and I think I found it.

With my little knowledge of C++ I tried to develop a fix, and I attach a patch with my work. With this patch the boiler data comes back after restarting my Android machine. The issue still exists for the thermostat entities, but I didn't analyze it because this code is not ready to merge and I think you can implement it faster than me.

Basically, this patch resets the mqtt_haconfig private property in each device class after a disconnect from MQTT.

0001-draft-with-working-boiler-but-not-working-thermostat.zip
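The idea behind the patch can be sketched roughly as follows. This is a Python model of the logic only, not the firmware's C++ code and not the contents of the attached patch. The class, function, and topic names are hypothetical; only the notion of a per-device "HA config already sent" flag comes from the description above.

```python
from typing import Callable, List, Tuple

Send = Callable[[str, str], None]


class Device:
    """Model of a device that sends its HA discovery config only once."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mqtt_ha_config = False  # plays the role of mqtt_haconfig

    def publish(self, send: Send) -> None:
        if not self.mqtt_ha_config:
            # Hypothetical discovery topic and payload.
            send(f"homeassistant/sensor/ems-esp/{self.name}/config", "{discovery}")
            self.mqtt_ha_config = True
        send(f"ems-esp/{self.name}_data", "{values}")


def on_mqtt_disconnect(devices: List[Device]) -> None:
    """Forget that the config was sent, so it is re-sent after reconnect."""
    for device in devices:
        device.mqtt_ha_config = False


sent: List[Tuple[str, str]] = []
devices = [Device("boiler"), Device("thermostat")]


def send(topic: str, payload: str) -> None:
    sent.append((topic, payload))


def config_count() -> int:
    return sum(1 for topic, _ in sent if topic.endswith("/config"))


for device in devices:
    device.publish(send)
for device in devices:
    device.publish(send)  # second cycle: data only, no config
configs_before = config_count()

on_mqtt_disconnect(devices)
for device in devices:
    device.publish(send)  # config is published again
configs_after = config_count()
```

Without the reset in `on_mqtt_disconnect`, the flag would stay set for the lifetime of the device, so the discovery config would only be re-sent after a reboot of EMS-ESP, which matches the workaround described at the top of this issue.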

proddy commented 3 years ago

if I understand correctly, your proposed solution is to re-publish all the HA topics when the MQTT broker connection is made? So if the broker is restarted, EMS-ESP will get a TCP Disconnected error and re-connect.

The thing is, all these MQTT messages to HA have the 'retained' flag set so this shouldn't be necessary ?

stravinci commented 3 years ago

I'm not sure what exactly the 'retain' flag does during/after a restart of the MQTT broker. But here is a screenshot from MQTT Explorer before restarting the broker: image

And after restarting the broker, not all messages have the RETAINED flag; in the homeassistant topic none of them has that flag. Maybe this is the issue? image

proddy commented 3 years ago

it looks ok to me. After the broker restart all the homeassistant/* topics are still there. This is because they are 'retained' messages.

I'm not sure what is not working?

stravinci commented 3 years ago

There shouldn't exist topics under homeassistant?

stravinci commented 3 years ago

Hi, on version v2.1.1b1 the device entities do not come back after restarting EMS-ESP. After executing the command "publish ha", EMS-ESP restarted...

proddy commented 3 years ago

ok, we need to figure out what is going on again. The publish ha indeed breaks and causes a restart. I'll fix that

stravinci commented 3 years ago

I got an error after updating to v2.1.1b1: image

proddy commented 3 years ago

I'll look into it. lots of things are not working as well as I want with the HA integration.

proddy commented 3 years ago

Michael made some improvements to loading the HA MQTT messages which should address this issue. It's in the latest dev build 2.1.1b2

stravinci commented 3 years ago

Yes, the errors did not come back, thanks! The issue with rebooting the HA+MQTT platform still exists.

proddy commented 3 years ago

I'm closing this. I can't reproduce or understand what is happening on your system. From the screenshots everything looks fine. It may be an MQTT or HA config problem. If others experience similar problems we'll reopen and take a closer look.