emsesp / EMS-ESP32

ESP32 firmware to read and control EMS and Heatronic compatible equipment such as boilers, thermostats, solar modules, and heat pumps
https://emsesp.github.io/docs
GNU Lesser General Public License v3.0

MQTT Configuration vanished without any reason #1930

Open zaphood1967 opened 1 month ago

zaphood1967 commented 1 month ago

log.txt

PROBLEM DESCRIPTION

While I was on holiday, the MQTT configuration was removed without any obvious reason. MQTT was disabled when I checked. After I re-entered the broker address, username and password and enabled MQTT Discovery, everything went back to normal. Because I was away, I am absolutely sure nothing changed during the last two weeks. This is the second time the problem has occurred on my system.

REQUESTED INFORMATION

This is the same effect as I was describing here: https://github.com/emsesp/EMS-ESP32/issues/1882

EMS is on 3.6.5.

Attached you can find the log, not sure it helps.

proddy commented 1 month ago

I've had this once. I need to look into the code to find why it's happening - it seems to happen after a restart

zaphood1967 commented 1 month ago

Yes, you said so already. But I haven't been on site, and according to the log the device seems to have been up the entire time, so I am not sure what happened this time (again). Last time there was no reboot before the problem either; I rebooted only while analysing why the device no longer responded. Furthermore, I have been using this device for roughly one year and have only seen this happen twice, both within the last few weeks. So might something have changed in the last firmware update?

zaphood1967 commented 1 month ago

And it went away again just now. Log attached. [Uploading log (1).txt…]()

proddy commented 1 month ago

Are you building the firmware manually? The 3.6.5 from the GitHub releases page hasn't changed since it was created back in March. So I don't expect it's a firmware update.

I think it must be something in the code, like the MQTT settings are not loaded for some reason. I'll take a look with Michael.

zaphood1967 commented 1 month ago

Nope, no self building involved ;-) After I purchased the device I think there was just one update being offered. Downloaded the compiled file from Github then and installed it, that's it. Must have been already a while ago.

proddy commented 1 month ago

There are some nice new features coming in the next version (3.7) you may be interested in. There's a demo at https://demo.emsesp.org

zaphood1967 commented 1 month ago

Looks nice and modern, and there is even Modbus now? Wow... ;-) What I am missing for a Buderus RC310 is mainly the option to modify heating (and ww) programs. Currently I can only select the program; changing the schedule is only possible at the physical controller. Not sure if this is a restriction of the EMS interface or just not implemented in EMS-ESP, but it would be a very welcome feature.

proddy commented 1 month ago

We've had this feature request a few times, also for heat pumps. Look at the discussion in https://github.com/emsesp/EMS-ESP32/discussions/1918

@MichaelDvP would it be an idea to write the switch times in a .csv/.txt file and then upload it using the Upload/Download page? The data could be written to an internal LittleFS file and executed as a Command.

MichaelDvP commented 1 month ago

Don't know why the MQTT config was deleted. Are other configs affected as well? Please post the support info.

2024-08-08 23:41:33.000 INFO 2: [emsesp] Last system reset reason Core0: RTC watch dog reset: CPU+RTC, Core1: APP CPU reset by PRO CPU

Seems there was something blocking, a WDT reset is very rare.

2024-08-09 06:01:27.715 WARNING 17: [emsesp] WiFi disconnected. Reason: assoc leave (8)

There was also a WiFi disconnect initiated by the AP. Does this happen often?

For the switch programs. Maybe we should start a discussion about this topic, #1918 went too much OT.

The RC300 switch programs have 6 switchpoints per day of the week, each with a time and a temperature. This gives 84 (6 × 7 × 2) entities per program. Each hc has two programs and each dhw has a program and a circulation program. This gives about 1000 entities: too many for the given data structure, and impossible to include in the thermostat_data MQTT topic.
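A quick back-of-the-envelope check of these numbers, as a sketch only. The per-program figure comes from the comment above; the number of heating circuits and hot-water (dhw) circuits is an assumption chosen for illustration and will differ per installation.

```python
# Rough entity count for RC300 switch programs (illustrative only).
SWITCHPOINTS_PER_DAY = 6
DAYS_PER_WEEK = 7
VALUES_PER_SWITCHPOINT = 2  # a time and a temperature


def entities_per_program() -> int:
    """Entities needed to expose one complete switch program."""
    return SWITCHPOINTS_PER_DAY * DAYS_PER_WEEK * VALUES_PER_SWITCHPOINT


def total_entities(heating_circuits: int, dhw_circuits: int) -> int:
    """Two programs per heating circuit, plus a program and a
    circulation program per hot-water circuit."""
    programs = 2 * heating_circuits + 2 * dhw_circuits
    return programs * entities_per_program()


if __name__ == "__main__":
    print(entities_per_program())   # 84
    # Assumed system size: 4 heating circuits and 2 hot-water circuits.
    print(total_entities(4, 2))     # 1008
```

With the assumed 4 heating circuits and 2 hot-water circuits this lands at roughly the 1000 entities mentioned; smaller systems need fewer, but still several hundred.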

We can read the telegram data and store it internally as raw bytes. This costs about 1k of RAM. A JSON (or CSV) can then be generated on demand.

We need a concept that fits in web/console/mqtt/api and does not blow the memory of the ESP.
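The "store raw, render on demand" idea could look roughly like the sketch below. The byte layout is hypothetical (one byte for the time in quarter-hour steps, one byte for the temperature in half degrees) and is not the actual EMS telegram format; the function name and JSON keys are made up for illustration as well.

```python
import json

SWITCHPOINTS_PER_DAY = 6
DAYS = ["mo", "tu", "we", "th", "fr", "sa", "su"]
BYTES_PER_SWITCHPOINT = 2  # hypothetical: 1 byte time, 1 byte temperature
PROGRAM_SIZE = SWITCHPOINTS_PER_DAY * len(DAYS) * BYTES_PER_SWITCHPOINT  # 84 bytes


def program_to_json(raw: bytes) -> str:
    """Render one raw switch program as JSON, only when it is asked for."""
    if len(raw) != PROGRAM_SIZE:
        raise ValueError(f"expected {PROGRAM_SIZE} bytes, got {len(raw)}")
    result = {}
    for d, day in enumerate(DAYS):
        points = []
        for s in range(SWITCHPOINTS_PER_DAY):
            offset = (d * SWITCHPOINTS_PER_DAY + s) * BYTES_PER_SWITCHPOINT
            quarter, half_degrees = raw[offset], raw[offset + 1]
            points.append({
                "time": f"{quarter // 4:02d}:{(quarter % 4) * 15:02d}",
                "temp": half_degrees / 2,
            })
        result[day] = points
    return json.dumps(result)


if __name__ == "__main__":
    # Dummy program: every switchpoint at 06:00 with 21.0 degrees.
    raw = bytes([24, 42] * (SWITCHPOINTS_PER_DAY * len(DAYS)))
    print(len(raw))                   # 84
    print(program_to_json(raw)[:60])
```

Under this assumed layout one program takes 84 bytes, so a dozen programs stay at around 1k, and the much larger JSON text only exists while a request is being answered.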

zaphood1967 commented 1 month ago

Don't know why the MQTT config was deleted. Are other configs affected as well? Please post the support info.

Not 100% sure, maybe the language setting was also reverted to English... but I am not sure if it was set to German before. No other settings seemed to be affected.

2024-08-08 23:41:33.000 INFO 2: [emsesp] Last system reset reason Core0: RTC watch dog reset: CPU+RTC, Core1: APP CPU reset by PRO CPU

Seems there was something blocking, a WDT reset is very rare.

2024-08-09 06:01:27.715 WARNING 17: [emsesp] WiFi disconnected. Reason: assoc leave (8)

There was also a WiFi disconnect initiated by the AP. Does this happen often?

You mean a disconnect from the WLAN? Not that I am aware of, at least. Usually, if that happens, devices just reconnect, so it would not be noticed easily. I am using an AVM Fritz Repeater 1700 in AP mode (LAN input, acting as an AP rather than a repeater). The AP for the basement is literally on the other side of the wall from where the ESP is located. As this is quite an old building, the wall is built of bricks, not concrete and steel. So connectivity should be very solid.

Please find the requested logs attached.

zaphood1967 commented 1 month ago

For the switch programs. Maybe we should start a discussion about this topic, #1918 went too much OT.

Agree

The RC300 switch programs have 6 switchpoints per day of the week, each with a time and a temperature. This gives 84 (6 × 7 × 2) entities per program. Each hc has two programs and each dhw has a program and a circulation program. This gives about 1000 entities: too many for the given data structure, and impossible to include in the thermostat_data MQTT topic.

Yikes... understandable...

We need a concept that fits in web/console/mqtt/api and does not blow the memory of the ESP.

If the problem is that "large", I would totally understand if this is not going to be implemented at all.

MichaelDvP commented 1 month ago

Please find the requested logs attached.

Never post the settings file (ok, you have removed personal info). There is the support info as the first button on the help and download pages. It has no personal info, but many more states and measurements that help with debugging.

I would totally understand if this is not going to be implemented at all.

There is always a way, but not easy and a lot of work.

zaphood1967 commented 1 month ago

Please find the requested logs attached.

Never post the settings file (ok, you have removed personal info). There is the support info as the first button on the help and download pages. It has no personal info, but many more states and measurements that help with debugging.

Did I post the wrong one ? Maybe the attached file is the correct one? (I'd always clean logs before posting, but thanks for the reminder anyways ;-))

emsesp_system_info.txt

MichaelDvP commented 1 month ago

Thanks for the system info. It's an S32, 16M with 2 large OTA partitions. Heap/free alloc space looks ok. Maybe reduce the log buffer, but this should happen automatically if memory runs low.

I have no idea what happened.

  1. A watchdog reset: it seems something was blocking. Maybe something happened in the (sync) mqtt.loop? @proddy Should we make the MQTT async for systems with PSRAM? But this does not help here.
  2. MQTT settings reset: I cannot find anything in the code that removes only these settings.

Sorry.

zaphood1967 commented 1 month ago

1) I increased the log buffer only after we started to debug. Will dial it back to 50 though. The MQTT server is a Mosquitto that runs as an add-on on my Home Assistant machine. Not sure whether this causes any problems, as HA runs the add-ons as containers, so there is NAT involved on the server. Before this, the Mosquitto was even in a remote location and the ESP talked to it via a VPN, which did not cause any hiccups.

2) Weird. Maybe a part of the flash is defective, so not all data gets saved in a clean manner?

MichaelDvP commented 1 month ago

Maybe a part of the flash is defective, so not all data gets saved in a clean manner?

Don't think so. The filesystem has wear leveling: on every change the data is stored in a different place. MQTT settings are a single file and are written only if there is a change via the web interface.

On startup the file is read, and on a read error or if the JSON cannot be deserialized, the defaults are set. Maybe something (a special character) in the settings can cause a deserialization error? But in that case there would be a reset to defaults on every reboot.
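The load-with-fallback behaviour described here can be sketched as follows. This is not the EMS-ESP code; the function name, the file names and the default keys are hypothetical and only illustrate the pattern of "read the file, fall back to defaults on any read or parse error".

```python
import json
import os
import tempfile

# Hypothetical defaults; the real EMS-ESP keys and values may differ.
DEFAULT_MQTT_SETTINGS = {"enabled": False, "host": "", "port": 1883}


def load_mqtt_settings(path):
    """Return (settings, loaded_ok); fall back to defaults on any error."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Missing or unreadable file, or JSON that cannot be deserialized.
        return dict(DEFAULT_MQTT_SETTINGS), False
    if not isinstance(data, dict):
        return dict(DEFAULT_MQTT_SETTINGS), False
    # Stored values override the defaults; missing keys keep their default.
    return {**DEFAULT_MQTT_SETTINGS, **data}, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.json")
        bad = os.path.join(tmp, "bad.json")
        with open(good, "w", encoding="utf-8") as f:
            json.dump({"enabled": True, "host": "broker.local"}, f)
        with open(bad, "w", encoding="utf-8") as f:
            f.write('{"enabled": true, "host": ')  # truncated JSON
        print(load_mqtt_settings(good))
        print(load_mqtt_settings(bad))
        print(load_mqtt_settings(os.path.join(tmp, "missing.json")))
```

In a scheme like this, one failed read at startup is enough to come up with MQTT disabled, while a permanently unparsable file would produce defaults on every boot, which is the distinction made above.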

proddy commented 1 month ago

I've seen this behaviour when I was flashing/restarting an ESP32 during development - some of the MQTT settings would be reset. But it's happened like twice in 2 years and I can't reproduce it. Neither can I see anything in the code that would suggest why. I think we need to wait until it happens again.

zaphood1967 commented 1 month ago

I've seen this behaviour when I was flashing/restarting an ESP32 during development - some of the MQTT settings would be reset. But it's happened like twice in 2 years and I can't reproduce it. Neither can I see anything in the code that would suggest why. I think we need to wait until it happens again.

Is there anything I can do to prepare, assuming this occurs again? Log level or anything like that? I have experienced this 3 times now, and the logs I provided do not seem to have helped. So if I can prepare anything that would help you guys, just let me know.

proddy commented 1 month ago

From what I understand, something is causing your EMS-ESP to restart, and when it does, some of the MQTT settings get reset. So, possibly two separate issues. A few ideas:

zaphood1967 commented 1 month ago
  • turning on SysLog to capture the logs would help, since the logs are not persisted in EMS-ESP after it restarts

Ok, I will need to set up a SysLog Target then.

  • do you know if HA or anything around the MQTT broker is changed around the same time?

No, nothing happened. As I said, I have been on vacation when it happened. So definitely no update or change anywhere

  • I assume the EMS-ESP is powered adequately and has a strong WiFi signal (as that can cause restarts)

It is powered by the connection to the Buderus Controller.

  • do you think you can easily reproduce it? Like restarting HA, if you're using HA's embedded Mosquitto broker

Did a reboot of the entire HA VM just now (including Mosquitto) -> no effect. Then I rebooted the ESP -> no effect either, the MQTT settings are still there. Weird.

  • I would go to 3.7.0. We're not doing fixes on 3.6.5 and there's a chance it works in the latest dev version

As I am not on site, this is something I can only do in approx. 2 weeks' time.

MichaelDvP commented 1 month ago

BTW: When going to 3.7.0-dev, use the ESP32 version, NOT the ESP32-16M! I know your S32 has 16M flash, but no PSRAM. The 16M version defaults to E32V2 hardware with PSRAM. @proddy Also for the update page, check PSRAM.

proddy commented 1 month ago

BTW: When going to 3.7.0-dev, use the ESP32 version, NOT the ESP32-16M! I know your S32 has 16M flash, but no PSRAM. The 16M version defaults to E32V2 hardware with PSRAM. @proddy Also for the update page, check PSRAM.

Think it's fixed in the latest PR (https://github.com/emsesp/EMS-ESP32/pull/1931). See getPlatform() in UploadDownload.tsx. I just remove 16M if it's an S3.

MichaelDvP commented 1 month ago

I just remove 16M if it's an S3.

Yes, I've seen it, but the S32 gateways (without S3) have 16M flash without PSRAM. They can't handle the 16M file. The 16M file is compiled with -DEMSESP_DEFAULT_BOARD_PROFILE="E32V2" and defaults on factory reset to the wrong board pins when flashed to an S32.
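The PSRAM check suggested here could be expressed roughly as below. This is a hypothetical sketch, not the getPlatform() code from the PR: the function name and parameters are made up, the returned labels are simply the variant names used in this thread, and the S3 case is deliberately left out.

```python
def suggest_firmware_variant(flash_mb: int, has_psram: bool) -> str:
    """Suggest a firmware variant label for a non-S3 ESP32 board.

    Assumption: the 16M build is only suitable when the board has
    both 16 MB of flash and PSRAM (as on the E32V2).
    """
    if flash_mb >= 16 and has_psram:
        return "ESP32-16M"
    # Boards like the S32 (16 MB flash, no PSRAM) fall back to the plain build.
    return "ESP32"


if __name__ == "__main__":
    print(suggest_firmware_variant(16, False))  # S32-like board
    print(suggest_firmware_variant(16, True))   # E32V2-like board
```

Keying the decision on PSRAM rather than on flash size alone is what keeps a 16M-flash S32 from being offered the E32V2-profile build.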