emsesp / EMS-ESP

ESP8266 firmware to read and control EMS and Heatronic compatible equipment such as boilers, thermostats, solar modules, and heat pumps
https://emsesp.github.io/docs
GNU Lesser General Public License v3.0

Error when publishing thermostat values with multiple heating circuits #264

Closed vanbogaertetienne closed 4 years ago

vanbogaertetienne commented 4 years ago

Bug description
After upgrading to 1.9.4 today, I don't get any thermostat_data published to my MQTT server anymore. This was an intermittent error on 1.9.2, but since this upgrade it seems to happen on every publish (manual or automatic). Is it possible that this is because the string becomes too long to be sent?

[MQTT] Error publishing to ems-esp/thermostat_data with payload {"hc1":{"seltemp":19.5,"currtemp":19.4,"daytemp":19.5,"nighttemp":16.5,"holidayttemp":14,"heatingtype":1,"circuitcalctemp":5,"mode":"auto"},"hc2":{"seltemp":19.5,"currtemp":19.4,"daytemp":19.5,"nighttemp":16.5,"holidayttemp":14,"heatingtype":1,"circuitcalctemp":5,"mode":"auto"},"hc3":{"seltemp":19.5,"currtemp":19.4,"daytemp":19.5,"nighttemp":16.5,"holidayttemp":14,"heatingtype":3,"circuitcalctemp":5,"mode":"auto"}} [error 0]

Steps to reproduce
Upgrade to 1.9.4 with three heating circuits (hc's).

Expected behavior
The publish should succeed; it currently fails.

Device information
system ESP8266

System stats:
[APP] EMS-ESP version: 1.9.4
[APP] MyESP version: 1.2.22
[APP] Build timestamp: 2019-12-15 23:13:29
[APP] Uptime: 0 days 0 hours 7 minutes 6 seconds
[APP] System Load: 1%
[WIFI] WiFi Hostname: ems-esp
[WIFI] WiFi IP: 10.0.1.132
[WIFI] WiFi signal strength: 100%
[WIFI] WiFi MAC: DC:4F:22:5E:95:89
[MQTT] is connected (heartbeat disabled)
[SYSTEM] System is Stable
[SYSTEM] Board: PLATFORMIO_D1_MINI
[SYSTEM] CPU frequency: 80 MHz
[SYSTEM] SDK version: 2.2.2-dev(38a443e)
[SYSTEM] CPU chip ID: 0x5E9589
[SYSTEM] Core version: 2_6_2
[SYSTEM] Boot version: 31
[SYSTEM] Boot mode: 1
[SYSTEM] Last reset reason: Restart from terminal
[SYSTEM] Restart count: 0
[SYSTEM] # TCP disconnects: 0
[SYSTEM] rtcmem status: blocks:2 addr:0x60001280
[SYSTEM] rtcmem 00: 1163087990
[SYSTEM] rtcmem 01: 65536
[FLASH] Flash chip ID: 0x16400E
[FLASH] Flash speed: 40000000 Hz
[FLASH] Flash mode: DIO
[FLASH] Flash size (CHIP): 4194304
[FLASH] Flash size (SDK): 4194304
[FLASH] Flash Reserved: 4096
[MEM] Firmware size: 614912
[MEM] Max OTA size: 2523136
[MEM] OTA Reserved: 16384
[MEM] Free Heap: 24128 bytes initially | 14832 bytes used (61%) | 9296 bytes free (38%)

Info: EMS-ESP system stats:
System logging set to None
LED: on, Listen mode: off
Boiler: enabled, Thermostat: enabled, Solar Module: disabled, Mixing Module: enabled
Shower Timer: disabled, Shower Alert: disabled

EMS Bus stats:
Bus is connected, protocol: Buderus
Rx: # successful read requests=21, # CRC errors=0
Tx: Last poll=2.141 seconds ago, # successful write requests=0

Boiler stats:
Boiler: Nefit Topline/Buderus GB162 (DeviceID:0x08 ProductID:115 Version:03.06)
Hot tap water: off
Central heating: off
Warm Water activated: on
Warm Water circulation pump available: off
Warm Water comfort setting: Hot
Warm Water selected temperature: 50 C
Warm Water desired temperature: 60 C
Warm Water current temperature: 54.6 C
Warm Water current tap water flow: 0.0 l/min
Warm Water # starts: 5917 times
Warm Water active time: 70 days 1 hours 42 minutes
Warm Water 3-way valve: off
Selected flow temperature: 7 C
Current flow temperature: 29.8 C
Return temperature: 31.5 C
Gas: off
Boiler pump: off
Fan: off
Ignition: off
Circulation pump: off
Burner selected max power: 0 %
Burner current power: 0 %
Flame current: 0.0 uA
System pressure: 1.0 bar
System service code: 0H (203)
Heating temperature setting on the boiler: 90 C
Boiler circuit pump modulation max power: 100 %
Boiler circuit pump modulation min power: 30 %
Outside temperature: 7.5 C
Boiler temperature: 54.6 C
Pump modulation: 0 %
Burner # starts: 43296 times
Total burner operating time: 726 days 4 hours 5 minutes
Total heat operating time: 656 days 2 hours 23 minutes
Total UBA working time: 3365 days 22 hours 56 minutes

Thermostat stats:
Thermostat: RC35 (DeviceID:0x10 ProductID:86 Version:21.08)
Thermostat time is 21:47:26 22/12/2019
Heating Circuit 1
  Current room temperature: ? C
  Setpoint room temperature: ? C
  Program is set to Summer mode
  Day temperature: 19.5 C
  Night temperature: 16.5 C
  Vacation temperature: 14.0 C
  Mode is set to auto
Heating Circuit 2
  Current room temperature: 19.3 C
  Setpoint room temperature: 19.5 C
  Day temperature: 19.5 C
  Night temperature: 16.5 C
  Vacation temperature: 14.0 C
  Mode is set to auto
  Day Mode is set to day
Heating Circuit 3
  Current room temperature: 19.3 C
  Setpoint room temperature: 19.5 C
  Day temperature: 19.5 C
  Night temperature: 16.5 C
  Vacation temperature: 14.0 C
  Mode is set to auto
  Day Mode is set to day

Mixing module stats:
Switch temperature: 31.6 C

proddy commented 4 years ago

Could you try the latest 1.9.5 from the dev branch? I made some changes to stop MQTT flooding. Make sure you set ‘publish_time’ to something like 10 seconds.

vanbogaertetienne commented 4 years ago

Hi Proddy,

I tried the latest dev build. I get values for the heating circuits individually when the EMS bus sends its values, but when I ask to publish manually via telnet, it spews out the error. In other words: when one heating circuit gets published it is OK, but when all heating circuits get published together in one message, I get the error.

For me this behaviour is okay, but the error might need fixing. Is it possible that the message is too long for the ESP8266's memory? With three heating circuits the payload is 418 characters.
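To put a number on the size question, here is a rough sketch that estimates how many bytes one MQTT PUBLISH packet takes on the wire, following the MQTT 3.1.1 packet layout (fixed header byte, variable-length "remaining length" field, 2-byte topic length, topic, optional 2-byte packet ID for QoS 1 or 2, payload). It is plain C++ rather than firmware code, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Bytes needed by the MQTT "remaining length" field, which stores
// 7 bits of the value per byte.
std::size_t remaining_length_bytes(std::size_t remaining) {
    std::size_t n = 1;
    while (remaining >= 128) {
        remaining /= 128;
        ++n;
    }
    return n;
}

// Estimated on-the-wire size of one PUBLISH packet (hypothetical helper).
std::size_t publish_packet_size(const std::string& topic,
                                std::size_t payload_len,
                                int qos) {
    assert(qos >= 0 && qos <= 2);
    // Variable header: 2-byte topic length + topic (+ 2-byte packet ID if QoS > 0).
    std::size_t remaining = 2 + topic.size() + (qos > 0 ? 2 : 0) + payload_len;
    // Fixed header: 1 control byte + the remaining-length field.
    return 1 + remaining_length_bytes(remaining) + remaining;
}
```

Calling `publish_packet_size("ems-esp/thermostat_data", 418, 0)` gives 446 bytes. QoS 0 is an assumption here, since the QoS in use is not shown in the log.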

proddy commented 4 years ago

Yes, it may be memory related. I'll need to simulate this and debug what is causing the failure. I may need to put the MQTT messages on a queue and forget about using the asynchronous features of the library.
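As an illustration of the queue idea, a minimal sketch might look like the following. This is not the EMS-ESP implementation: `Message`, `PublishQueue`, and the `send` callback are hypothetical names, and the callback stands in for whatever function actually hands a message to the MQTT client and reports whether it was accepted.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// One pending MQTT message (hypothetical type).
struct Message {
    std::string topic;
    std::string payload;
};

// Bounded FIFO: messages wait here until the sender accepts them.
class PublishQueue {
public:
    explicit PublishQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns false (message not stored) when the queue is full,
    // so memory use stays bounded.
    bool push(Message m) {
        if (queue_.size() >= capacity_) return false;
        queue_.push_back(std::move(m));
        return true;
    }

    // Sends messages in order. Stops at the first failure and leaves
    // that message at the front for the next call. Returns the number sent.
    std::size_t flush(const std::function<bool(const Message&)>& send) {
        std::size_t sent = 0;
        while (!queue_.empty() && send(queue_.front())) {
            queue_.pop_front();
            ++sent;
        }
        return sent;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t capacity_;
    std::deque<Message> queue_;
};
```

The main loop would call `flush` periodically. Publishing one heating circuit per message, which already works according to the report above, fits this pattern naturally.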

proddy commented 4 years ago

From the MQTT project page:

You can send data as long as you stay below the available TCP window (which is about 3-4kB on the ESP8266). The data is indeed held in memory by the async TCP code until ACK is received. If the TCP window was sufficient to send your packet, the publish method will return a packet ID indicating the packet was sent. Otherwise, a 0 will be returned, and it's your responsibility to resend the packet with publish.

I've added an MQTT failure counter under the system command to see how often it fails. I'll also add a retry option.
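A failure counter plus retry could be sketched along these lines. This is only an illustration of the idea, not the actual patch: `PublishStats` and `publish_with_retry` are hypothetical names, and the `publish` callable stands in for the library call, relying only on the behaviour quoted above (a non-zero packet ID on success, 0 on failure).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Running count of failed publish attempts (hypothetical type).
struct PublishStats {
    unsigned failures = 0;
};

// Calls publish() up to max_attempts times. Returns the packet ID of the
// first successful attempt, or 0 if every attempt failed. Each failed
// attempt increments stats.failures.
std::uint16_t publish_with_retry(const std::function<std::uint16_t()>& publish,
                                 unsigned max_attempts,
                                 PublishStats& stats) {
    assert(max_attempts > 0);
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        std::uint16_t packet_id = publish();
        if (packet_id != 0) return packet_id;
        ++stats.failures;
    }
    return 0;
}
```

On real hardware the attempts would presumably need to be spaced out (for example, one per main-loop pass) so the TCP stack has a chance to free up window space between tries. The tight loop here just keeps the sketch short.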

proddy commented 4 years ago

That last patch should fix this issue.