letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com

MQTT disconnects every minute, re-publishes all data on connect #2840

Open jefft4 opened 4 years ago

jefft4 commented 4 years ago

Summary of the problem/feature request

MQTT disconnects every minute, then immediately reconnects and subscribes to the topics (in this case, domoticz). The ESP publishes all sensor data after MQTT connects, so every sensor reports once a minute, which is excessive.

I think the MQTT timeout needs to be user-configurable, or at least much longer. Why is it so short? MQTT can stay connected for hours without problems if it sends occasional heartbeats. It should only disconnect if told to, or if it loses the connection.

System configuration

Wifi AP 3m from the ESP, good signal. Zero wifi reconnects shown on status page. MQTT broker is mosquitto, server (cabled) on same subnet as ESP.

ESP Easy version:

Build: 20104 - Mega
System Libraries: ESP82xx Core 2_5_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.1.2 PUYA support
Git Build: mega-20191103
Plugins: 46 [Normal]
Build Time: Nov  3 2019 03:18:10
Binary Filename: ESP_Easy_mega-20191103_normal_ESP8266_4M1M.bin

Replicated also on build 20191208; exactly the same behaviour. Replicated on three different ESP modules.

Rules

Rules on one ESP are minimal - just a timer to reboot if wifi is disconnected for 5 mins. Rules on the other include 7-segment display of sensor values.

Log data

15371459: EVENT: Rules#Timer=1
15371671: ACT  : 7dt,                
15371677: 7DGT : Show Temperature=0.00
15371691: ACT  : TaskValueSet,12,1,4+1             
15371703: Command: taskvalueset
15371716: ACT  : TaskValueSet,12,1,1
15371728: Command: taskvalueset
15371736: ACT  : timerSet,1,5
15371748: Command: timerset
15377463: EVENT: Rules#Timer=1
15377554: ACT  : 7dt,23.3             
15377560: 7DGT : Show Temperature=23.30
15377696: ACT  : TaskValueSet,12,1,1+1             
15377709: Command: taskvalueset
15377727: ACT  : timerSet,1,5
15377738: Command: timerset
15377778: MQTT : Connection lost, state: Connection timeout
15377779: EVENT: MQTT#Disconnected
15377896: MQTT : Connected to broker with client ID: GymClimate_12
15377897: Subscribed to: domoticz/out
15377900: EVENT: MQTT#Connected
15378123: DHT  : Temperature: 25.10
15378123: DHT  : Humidity: 42.20
15378127: EVENT: DHT#Temperature=23.34
15378205: EVENT: DHT#Humidity=42.20
15378306:  Domoticz: Sensortype: 2 idx: 15 values: 23.3;42.2;1
15379749: DS   : Temperature: 32.44 (28-ff-c4-43-91-16-4-b9)
15379752: EVENT: DA#Temperature=32.44
15379853:  Domoticz: Sensortype: 1 idx: 13 values: 32.4
15379885: DS   : Temperature: 36.63 (28-ff-83-a6-c4-17-4-43)
15379888: EVENT: DB#Temperature=36.63
15379988:  Domoticz: Sensortype: 1 idx: 14 values: 36.6
15383459: EVENT: Rules#Timer=1
15383612: ACT  : 7dn,42.2          
15383617: 7DGT : Show Number=42
15383754: ACT  : TaskValueSet,12,1,2+1             
15383769: Command: taskvalueset
15383845: ACT  : timerSet,1,5
15383857: Command: timerset
15388459: EVENT: Clock#Time=Thu,11:58
15389459: EVENT: Rules#Timer=1
15389663: ACT  : 7dt,32.4            
15389668: 7DGT : Show Temperature=32.40
15389751: ACT  : TaskValueSet,12,1,3+1             
15389764: Command: taskvalueset
15389809: ACT  : timerSet,1,5
15389823: Command: timerset
15392730: WD   : Uptime 257 ConnectFailures 0 FreeMem 18104 WiFiStatus 3
15395466: EVENT: Rules#Timer=1
15395739: ACT  : 7dt,36.6            
15395744: 7DGT : Show Temperature=36.60
15395787: ACT  : TaskValueSet,12,1,4+1             
15395800: Command: taskvalueset
15395839: ACT  : TaskValueSet,12,1,1
15395852: Command: taskvalueset
15395860: ACT  : timerSet,1,5
15395872: Command: timerset
15401459: EVENT: Rules#Timer=1
15401559: ACT  : 7dt,23.3             
15401565: 7DGT : Show Temperature=23.30
15401747: ACT  : TaskValueSet,12,1,1+1             
15401760: Command: taskvalueset
15401805: ACT  : timerSet,1,5
15401818: Command: timerset
15407459: EVENT: Rules#Timer=1
15407610: ACT  : 7dn,42.2          
15407616: 7DGT : Show Number=42
15407749: ACT  : TaskValueSet,12,1,2+1             
15407762: Command: taskvalueset
15407807: ACT  : timerSet,1,5
15407819: Command: timerset
15413463: EVENT: Rules#Timer=1
15413664: ACT  : 7dt,32.4            
15413670: 7DGT : Show Temperature=32.40
15413759: ACT  : TaskValueSet,12,1,3+1             
15413772: Command: taskvalueset
15413819: ACT  : timerSet,1,5
15413831: Command: timerset
15419459: EVENT: Rules#Timer=1
15419705: ACT  : 7dt,36.6            
15419710: 7DGT : Show Temperature=36.60
15419752: ACT  : TaskValueSet,12,1,4+1             
15419765: Command: taskvalueset
15419806: ACT  : TaskValueSet,12,1,1
15419818: Command: taskvalueset
15419827: ACT  : timerSet,1,5
15419840: Command: timerset
15421163: Dummy: value 1: 1.00
15421163: Dummy: value 2: 0.00
15421163: Dummy: value 3: 0.00
15421163: Dummy: value 4: 0.00
15421165: EVENT: MyVars#cycle=1.00
15421243: EVENT: MyVars#=0.00
15421321: EVENT: MyVars#=0.00
15421399: EVENT: MyVars#=0.00
15422729: WD   : Uptime 257 ConnectFailures 0 FreeMem 19064 WiFiStatus 3
15425459: EVENT: Rules#Timer=1
15425552: ACT  : 7dt,23.3             
15425558: 7DGT : Show Temperature=23.30
15425727: ACT  : TaskValueSet,12,1,1+1             
15425739: Command: taskvalueset
15425757: ACT  : timerSet,1,5
15425768: Command: timerset
15431459: EVENT: Rules#Timer=1
15431589: ACT  : 7dn,42.2          
15431594: 7DGT : Show Number=42
15431693: ACT  : TaskValueSet,12,1,2+1             
15431706: Command: taskvalueset
15431725: ACT  : timerSet,1,5
15431737: Command: timerset
15437459: EVENT: Rules#Timer=1
15437630: ACT  : 7dt,32.4            
15437635: 7DGT : Show Temperature=32.40
15437720: ACT  : TaskValueSet,12,1,3+1             
15437734: Command: taskvalueset
15437752: ACT  : timerSet,1,5
15437764: Command: timerset
15440916: MQTT : Connection lost, state: Connection timeout
15440917: EVENT: MQTT#Disconnected
15441032: MQTT : Connected to broker with client ID: GymClimate_12
15441034: Subscribed to: domoticz/out
15441036: EVENT: MQTT#Connected
15441260: DHT  : Temperature: 25.10
15441260: DHT  : Humidity: 42.20
15441264: EVENT: DHT#Temperature=23.34
15441342: EVENT: DHT#Humidity=42.20
15441442:  Domoticz: Sensortype: 2 idx: 15 values: 23.3;42.2;1

sincze commented 5 months ago

[image]

I post the RSSI to Domoticz every minute; would that be sufficient? I can do every 10 seconds, no problem.

82523: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'

863082: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
863820: WD : Uptime 14 ConnectFailures 0 FreeMem 14048 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
873076: EVENT: Rules#Timer=1,1
873105: ACT : TimerSet,1,10
873111: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
878313: EVENT: Clock#Time=Mon,19:03
883106: EVENT: Rules#Timer=1,1
883136: ACT : TimerSet,1,10
883142: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'

Mosquitto.conf

persistence true
persistence_location /mosquitto/data/
log_dest file /mosquitto/log/mosquitto.log

# Port to use for the default listener.
listener 1883

listener 9001
protocol websockets

log_timestamp true

# If the password.mqtt file is empty, this has no effect.
allow_anonymous true

# Store info on disk every 30 sec
autosave_interval 30
autosave_on_changes false

sincze commented 5 months ago

Going through the other tickets, I guess I still have some leeway. [image]

TD-er commented 5 months ago

Well, you are sending every minute and get disconnected every minute, so I wonder whether sending this message is what actually disconnects your client.

I know that Mosquitto's reply to any error is to disconnect the client. For example, if you announce N bytes but send a different amount of data, the response is a disconnect.

The same happens if you publish (or subscribe) to a topic the user is not allowed to use.

You also have the publish retain flag set. Why? Publishing with the retain flag set should not be the default.

sincze commented 5 months ago

Updated: [image]

I now have a 10 sec timer:

873105: ACT : TimerSet,1,10

TD-er commented 5 months ago

Can you also post your rules?

sincze commented 5 months ago

No Problem.

On System#Boot do
  TimerSet,1,10
Endon

On MQTT#Connected do
  PublishR,homeassistant/binary_sensor/%sysname%_door/config,"{"name": "%sysname%_door","device_class": "opening","state_topic": "ESP_Easy/%sysname%/sensor/door/status","unique_id": "%sysname%_door_%mac%","device": {"identifiers": ["%mac%_1"],"name": "Door_Sensor"}}"  
  PublishR,homeassistant/binary_sensor/%sysname%_contact/config,"{"name": "%sysname%_alarm","device_class": "safety","state_topic": "ESP_Easy/%sysname%/sensor/contact/status","unique_id":"%sysname%_alarm_%mac%","device": {"identifiers": ["%mac%_2"],"name": "Alarm_Sensor"}}"
  PublishR,espeasy-discovery/binary_sensor/%sysname%_door/config,"{"name": "%sysname%_door","device_class": "opening","state_topic": "ESP_Easy/%sysname%/sensor/door/status","unique_id": "%sysname%_door_%mac%","device": {"identifiers": ["%mac%_1"],"name": "Door_Sensor"}}"  
  PublishR,espeasy-discovery/binary_sensor/%sysname%_contact/config,"{"name": "%sysname%_alarm","device_class": "safety","state_topic": "ESP_Easy/%sysname%/sensor/contact/status","unique_id":"%sysname%_alarm_%mac%","device": {"identifiers": ["%mac%_2"],"name": "Alarm_Sensor"}}"
endon

On Deur#State=1 do
  PublishR,ESP_Easy/%sysname%/sensor/door/status,ON
  Publish domoticz/in,'{"command":"switchlight","idx":2786,"switchcmd":"On"}'
Endon

On Deur#State=0 do
  PublishR,ESP_Easy/%sysname%/sensor/door/status,OFF
  Publish domoticz/in,'{"command":"switchlight","idx":2786,"switchcmd":"Off"}'
Endon

On Alarm#State=1 do
  PublishR,ESP_Easy/%sysname%/sensor/contact/status,OFF
  Publish domoticz/in,'{"command":"switchlight","idx":2787,"switchcmd":"On"}'
Endon

On Alarm#State=0 do
  PublishR,ESP_Easy/%sysname%/sensor/contact/status,ON
  Publish domoticz/in,'{"command":"switchlight","idx":2787,"switchcmd":"Off"}'
Endon

On Rules#Timer=1 do
  TimerSet,1,10                             
  Publish domoticz/in,'{"command":"udevice", "idx":1526, "svalue":"[Jablotron_RSSI#rssi]"}'
Endon

sincze commented 5 months ago

At the moment it runs like this:

1204325: EVENT: Rules#Timer=1,1
1204354: ACT : TimerSet,1,10
1204360: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1214435: EVENT: Rules#Timer=1,1
1214465: ACT : TimerSet,1,10
1214471: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1222526: EVENT: Jablotron_RSSI#rssi=-56.00
1223820: WD : Uptime 20 ConnectFailures 0 FreeMem 14472 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1224466: EVENT: Rules#Timer=1,1
1224500: ACT : TimerSet,1,10
1224506: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1234501: EVENT: Rules#Timer=1,1
1234532: ACT : TimerSet,1,10
1234538: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1238265: EVENT: Clock#Time=Mon,19:09
1244533: EVENT: Rules#Timer=1,1
1244565: ACT : TimerSet,1,10
1244571: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1253820: WD : Uptime 21 ConnectFailures 0 FreeMem 14376 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1254566: EVENT: Rules#Timer=1,1
1254595: ACT : TimerSet,1,10
1254602: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1264634: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1264650: MQTT : Connection lost, state: Disconnected
1264680: MQTT : Connected to broker with client ID: Jablotron
1264685: Subscribed to: ESP_Easy/Jablotron/in/#
1264689: EVENT: MQTT#Disconnected
1264733: EVENT: MQTT#Connected
1264749: ACT : PublishR,homeassistant/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','state_t
1264769: ACT : PublishR,homeassistant/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','stat
1264785: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','sta
1264800: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','
1274629: EVENT: Rules#Timer=1,1
1274658: ACT : TimerSet,1,10
1274664: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1282530: EVENT: Jablotron_RSSI#rssi=-57.00
1283820: WD : Uptime 21 ConnectFailures 0 FreeMem 16984 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1284660: EVENT: Rules#Timer=1,1
1284689: ACT : TimerSet,1,10
1284695: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
1294691: EVENT: Rules#Timer=1,1
1294719: ACT : TimerSet,1,10
1294726: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
1298265: EVENT: Clock#Time=Mon,19:10
1304720: EVENT: Rules#Timer=1,1
1304750: ACT : TimerSet,1,10
1304756: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
1313821: WD : Uptime 22 ConnectFailures 0 FreeMem 14480 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1314751: EVENT: Rules#Timer=1,1
1314780: ACT : TimerSet,1,10
1314786: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
1324817: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
1324833: MQTT : Connection lost, state: Disconnected
1324890: MQTT : Connected to broker with client ID: Jablotron
1324895: Subscribed to: ESP_Easy/Jablotron/in/#
1324899: EVENT: MQTT#Disconnected
1324943: EVENT: MQTT#Connected
1324959: ACT : PublishR,homeassistant/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','state_t
1324978: ACT : PublishR,homeassistant/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','stat
1324996: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','sta
1325012: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','

TD-er commented 5 months ago

So right before the connection is dropped, the publish command is attempted twice. Can you try setting the timer interval to 5 seconds? I have a feeling your Mosquitto is configured with a keep-alive that is too short, and that is what I want to check.
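
To put a number on how regular the disconnects are, the log can be scanned for "Connection lost" lines. This is only a rough sketch: it assumes the leading number on each ESPEasy log line is milliseconds since boot (which is consistent with the Uptime values in the WD lines), and `disconnect_intervals` is an illustrative helper, not part of ESPEasy.

```python
import re

# Matches ESPEasy log lines such as
# "1264650: MQTT : Connection lost, state: Disconnected"
LOST_RE = re.compile(r"^(\d+):\s*MQTT\s*:\s*Connection lost")


def disconnect_intervals(log_text):
    """Return the seconds between consecutive 'Connection lost' lines."""
    stamps = []
    for line in log_text.splitlines():
        match = LOST_RE.match(line.strip())
        if match:
            # Assumption: the timestamp is milliseconds since boot.
            stamps.append(int(match.group(1)))
    return [(later - earlier) / 1000.0
            for earlier, later in zip(stamps, stamps[1:])]


# Four lines copied from the log posted above.
sample = """\
1264650: MQTT : Connection lost, state: Disconnected
1264680: MQTT : Connected to broker with client ID: Jablotron
1324833: MQTT : Connection lost, state: Disconnected
1324890: MQTT : Connected to broker with client ID: Jablotron
"""

print(disconnect_intervals(sample))
```

For the two disconnects in the sample the gap comes out at roughly 60 seconds, which matches the "every minute" pattern reported in this issue.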

sincze commented 5 months ago

I tried changing the subscription to domoticz/out:

1565850: MQTT : Connection lost, state: Disconnected
1565893: MQTT : Connected to broker with client ID: Jablotron
1565894: Subscribed to: domoticz/out
1565898: EVENT: MQTT#Disconnected
1565940: EVENT: MQTT#Connected
1565956: ACT : PublishR,homeassistant/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','state_t
1565976: ACT : PublishR,homeassistant/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','stat
1565993: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','sta
1566008: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','
1575830: EVENT: Rules#Timer=1,1
1575860: ACT : TimerSet,1,10
1575871: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1582526: EVENT: Jablotron_RSSI#rssi=-56.00
1583820: WD : Uptime 26 ConnectFailures 0 FreeMem 20520 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1585862: EVENT: Rules#Timer=1,1
1585892: ACT : TimerSet,1,10
1585898: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1595893: EVENT: Rules#Timer=1,1
1595923: ACT : TimerSet,1,10
1595930: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1598264: EVENT: Clock#Time=Mon,19:15
1605925: EVENT: Rules#Timer=1,1
1605954: ACT : TimerSet,1,10
1605961: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1613837: WD : Uptime 27 ConnectFailures 0 FreeMem 13920 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1615956: EVENT: Rules#Timer=1,1
1615988: ACT : TimerSet,1,10
1615994: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1625989: EVENT: Rules#Timer=1,1
1626018: ACT : TimerSet,1,10
1626024: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1636019: EVENT: Rules#Timer=1,1
1636048: ACT : TimerSet,1,10
1636054: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
1642750: EVENT: Jablotron_RSSI#rssi=-58.00
1643900: WD : Uptime 27 ConnectFailures 0 FreeMem 15768 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1646049: EVENT: Rules#Timer=1,1
1646078: ACT : TimerSet,1,10
1646085: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1656080: EVENT: Rules#Timer=1,1
1656109: ACT : TimerSet,1,10
1656115: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1658264: EVENT: Clock#Time=Mon,19:16
1666111: EVENT: Rules#Timer=1,1
1666140: ACT : TimerSet,1,10
1666147: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1673820: WD : Uptime 28 ConnectFailures 0 FreeMem 14384 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1676142: EVENT: Rules#Timer=1,1
1676172: ACT : TimerSet,1,10
1676180: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1686252: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
1686265: MQTT : Connection lost, state: Disconnected
1686299: MQTT : Connected to broker with client ID: Jablotron
1686300: Subscribed to: domoticz/out
1686304: EVENT: MQTT#Disconnected
1686353: EVENT: MQTT#Connected

sincze commented 5 months ago

> So right before the connection is dropped, the publish command is attempted twice. Can you try setting the timer interval to 5 seconds? I have a feeling your Mosquitto is configured with a keep-alive that is too short, and that is what I want to check.

OK, one moment. I noticed the double ACT as well.

sincze commented 5 months ago

So the client (Jablotron) needs to send the keepAlive to the MQTT broker?

According to what I could find online: with keepAlive=300, the client tells the MQTT broker that the keep-alive interval is 300 seconds. http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Keep_Alive
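
For reference, the keep-alive is a 16-bit field that the client sends in its CONNECT packet. The sketch below builds such a packet by hand, assuming MQTT 3.1.1 with no username, password or will message. `build_connect` and `encode_remaining_length` are illustrative names and not functions from ESPEasy or PubSubClient.

```python
import struct


def encode_remaining_length(n):
    """MQTT variable-length encoding of the Remaining Length field."""
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        # Set the continuation bit while more length bytes follow.
        out.append(digit | (0x80 if n else 0))
        if not n:
            return bytes(out)


def build_connect(client_id, keepalive_s, clean_session=True):
    """Build an MQTT 3.1.1 CONNECT packet without credentials or a will."""
    flags = 0x02 if clean_session else 0x00
    variable_header = (
        struct.pack("!H", 4) + b"MQTT"    # protocol name, length-prefixed
        + bytes([0x04])                   # protocol level 4 = MQTT 3.1.1
        + bytes([flags])                  # connect flags
        + struct.pack("!H", keepalive_s)  # keep-alive in seconds (0..65535)
    )
    cid = client_id.encode("utf-8")
    payload = struct.pack("!H", len(cid)) + cid
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


packet = build_connect("Jablotron", 10)
# For a short packet like this one the keep-alive sits at bytes 10 and 11.
keepalive = struct.unpack("!H", packet[10:12])[0]
print(packet.hex(), keepalive)
```

Because the field is two bytes wide, the largest keep-alive a client can request is 65535 seconds, and 0 switches the mechanism off.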

Interesting, no disconnects:

1756552: MQTT : Connection lost, state: Disconnected
1756583: MQTT : Connected to broker with client ID: Jablotron
....
2153843: WD : Uptime 36 ConnectFailures 0 FreeMem 15544 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2154105: EVENT: Rules#Timer=1,1
2154134: ACT : TimerSet,1,5
2154140: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
2159137: EVENT: Rules#Timer=1,1
2159165: ACT : TimerSet,1,5
2159171: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
....
2204455: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
2209450: EVENT: Rules#Timer=1,1
2209480: ACT : TimerSet,1,5
2209487: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
....
2393820: WD : Uptime 40 ConnectFailures 0 FreeMem 14656 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2395829: EVENT: Rules#Timer=1,1
2395860: ACT : TimerSet,1,5
2395867: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
....
2857228: ACT : TimerSet,1,8
2857236: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'

Your theory seems to be correct. I have a different ESP8266 that monitors the !Serial event (causing a lot of messages), running build ESP_Easy_mega_20240229_normal_ESP8266_4M1M (Feb 29 2024).

944740530: EVENT: !Serial#/CTA5ZIV-METER^^^^1-3:0.2.8(50)^^0-0:1.0.0(240311183711W)^^0-0:96.1.1(4530303930303030373331343231333233)^^1-0:1
944740552: ACT : PublishR,ESP_Easy/Watermeter/P1,'!Serial#/CTA5ZIV-METER^^^^1-3:0.2.8(50)^^0-0:1.0.0(240311183711W)^^0-0:96.1.1(453030393

This device does not seem to experience these disconnects.

TD-er commented 5 months ago

The MQTT broker uses a keep-alive interval of N seconds. If it receives no message from the client for 1.5 x N seconds, it considers the client disconnected. If my memory serves me well, the MQTT client in ESPEasy has a hard-coded keep-alive interval of 10 seconds. So perhaps your MQTT broker config somehow uses a much shorter interval? (I'm not sure the exact parameter name is "keep-alive".)
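
The 1.5x rule comes from the MQTT 3.1.1 specification and can be written down as a small calculation. This sketch only covers the broker side of the rule; it does not model what the client library does internally, and the function names are made up for illustration.

```python
def broker_deadline_s(keepalive_s):
    """Seconds of client silence after which an MQTT 3.1.1 broker must
    drop the connection. Returns None when keep-alive is disabled (0)."""
    if keepalive_s == 0:
        return None
    return 1.5 * keepalive_s


def broker_would_drop(keepalive_s, silence_s):
    """True if the broker is required to disconnect after silence_s seconds."""
    deadline = broker_deadline_s(keepalive_s)
    return deadline is not None and silence_s > deadline


# With the 10 second keep-alive mentioned above, the deadline is 15 seconds.
print(broker_deadline_s(10))
print(broker_would_drop(10, 10), broker_would_drop(10, 16))
```

So with a keep-alive of 10 seconds, a client that sends anything at least every 15 seconds should not be dropped by a broker that follows the specification.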

sincze commented 5 months ago

Ok. For the moment the disconnects have disappeared. Something else you want me to test?

sincze commented 5 months ago

I am running the latest version of https://hub.docker.com/_/eclipse-mosquitto/tags in a Docker environment.

Version: mosquitto version 2.0.18 Uptime: 306196 seconds

[image]

So:

TimerSet,1,9 -> OK
TimerSet,1,10 -> NOK Reconnect

sincze commented 5 months ago

https://github.com/TD-er/ESPEasy/commit/77deb3d8b3f689e5fa28aa8bba322cc5c386dfb0
https://github.com/knolleary/pubsubclient/issues/239

Reading through some tickets to find clues.

Mosquitto.log claims:

1710187049: Client Jablotron closed its connection.
1710187049: New connection from 192.168.xx.xx:63783 on port 1883.
1710187049: New client connected from 192.168.xx.xx:63783 as Jablotron (p2, c0, k10).

TD-er commented 5 months ago

Yep, the keep-alive is set to 10 sec; however, this was never a problem, so I wonder what may have changed. And as stated in the discussion you linked, the broker should use 1.5x the keep-alive interval as the threshold for disconnecting the client.
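
The keep-alive each client announces can be read from the "New client connected" lines in mosquitto.log. The sketch below pulls out the values in parentheses. Treat the field meanings as an assumption: as far as I know `p` is the protocol version, `c` the clean-session flag and `k` the keep-alive in seconds. `parse_connect` is an illustrative helper.

```python
import re

CONNECT_RE = re.compile(
    r"^(?P<ts>\d+): New client connected from (?P<addr>\S+) "
    r"as (?P<client>\S+) \(p(?P<p>\d+), c(?P<c>\d+), k(?P<k>\d+)"
)


def parse_connect(line):
    """Extract client ID and connect parameters from a mosquitto log line.

    Returns None for lines that are not 'New client connected' entries.
    """
    match = CONNECT_RE.match(line.strip())
    if match is None:
        return None
    return {
        "client": match["client"],
        "protocol": int(match["p"]),     # assumed: protocol version code
        "clean": int(match["c"]),        # assumed: clean-session flag
        "keepalive_s": int(match["k"]),  # assumed: keep-alive in seconds
    }


# Line copied from the mosquitto.log excerpt above.
line = ("1710187049: New client connected from 192.168.xx.xx:63783 "
        "as Jablotron (p2, c0, k10).")
print(parse_connect(line))
```

Running this over the whole log would show which clients connect with which keep-alive, which helps when comparing the ESPEasy nodes with the Tasmota devices.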

sincze commented 5 months ago

I'll try to capture a .pcap to see what is in the actual packets.

Another thing I can try is putting the MQTT server on the same subnet. (Currently the device is on 192.168.x.40 and the MQTT broker is at 192.168.y.88. That should not be an issue, though, as other devices do not reconnect; of course, I do not know their keep-alive.)

Within Tasmota I see one Shelly 1PM showing the same behaviour with the latest release, but the other 12 (different devices, same Tasmota version) do not show up in mosquitto.log. Did you notice anything strange in my mosquitto.conf?

TD-er commented 5 months ago

Well, I do notice you don't explicitly set the keep-alive, so it is using the default. Did you recently update the package? Or are you using the latest available?

Maybe the default has changed for Mosquitto?

TD-er commented 5 months ago

It is stated here that the default keepalive_interval is 60 (seconds). But there is also some mention of MQTT v5 clients, which can negotiate a keep-alive period that is limited by max_keepalive. It is unclear to me whether this also applies to MQTT v3.x clients, as it is mentioned that it is used to refuse any higher keepalive value suggested by the client.

max_keepalive value

For MQTT v5 clients, it is possible to have the server send a "server keepalive" value that will override the keepalive value set by the client. This is intended to be used as a mechanism to say that the server will disconnect the client earlier than it anticipated, and that the client should use the new keepalive value. The max_keepalive option allows you to specify that clients may only connect with keepalive less than or equal to this value, otherwise they will be sent a server keepalive telling them to use max_keepalive. This only applies to MQTT v5 clients. The maximum value allowable, and default value, is 65535.

Set to 0 to allow clients to set keepalive = 0, which means no keepalive checks are made and the client will never be disconnected by the broker if no messages are received. You should be very sure this is the behaviour that you want.

For MQTT v3.1.1 and v3.1 clients, there is no mechanism to tell the client what keepalive value they should use. If an MQTT v3.1.1 or v3.1 client specifies a keepalive time greater than max_keepalive they will be sent a CONNACK message with the "identifier rejected" reason code, and disconnected.

This option applies globally.

Reloaded on reload signal.

So maybe the max_keepalive is now set to 10?

sincze commented 5 months ago

Adding:

# 12-03-2024 Test for EspEasy TD-er
max_keepalive 30

That results in a flood of:

1710236046: New client connected from 10.0.2.4:40770 as mqttjs_bae5881f (p2, c1, k60).
1710236046: Bad socket read/write on client mqttjs_bae5881f: Invalid arguments provided.

So I removed it.

TD-er commented 5 months ago

Maybe also define the keepalive_interval explicitly?

sincze commented 5 months ago

I changed

# Port to use for the default listener.
#listener 1883
listener 1883 0.0.0.0

The ESPEasy interface no longer shows a "MQTT : Connection lost, state: Disconnected" line.

61999825: MQTT : Connected to broker with client ID: Jablotron
61999830: Subscribed to: ESP_Easy/Jablotron/in/#
61999839: EVENT: MQTT#Disconnected
61999958: EVENT: MQTT#Connected
61999973: ACT : PublishR,homeassistant/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','state_t
61999990: ACT : PublishR,homeassistant/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','stat
62000005: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_door/config,'{'name': 'Jablotron_door','device_class': 'opening','sta
62000021: ACT : PublishR,espeasy-discovery/binary_sensor/Jablotron_contact/config,'{'name': 'Jablotron_alarm','device_class': 'safety','
62033820: WD : Uptime 1034 ConnectFailures 0 FreeMem 14856 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
62039888: EVENT: Rules#Timer=1,1
62039917: ACT : TimerSet,1,30
62039923: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
62078264: EVENT: Clock#Time=Tue,12:03
62093993: WD : Uptime 1035 ConnectFailures 0 FreeMem 14160 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
62100245: EVENT: Rules#Timer=1,1
62100274: ACT : TimerSet,1,30
62100281: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-59.00'}'
62123821: WD : Uptime 1035 ConnectFailures 0 FreeMem 15472 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
62153837: WD : Uptime 1036 ConnectFailures 0 FreeMem 14320 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
62182563: EVENT: Jablotron_RSSI#rssi=-60.00
62183926: WD : Uptime 1036 ConnectFailures 0 FreeMem 14208 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
62190339: EVENT: Rules#Timer=1,1
62190369: ACT : TimerSet,1,30
62190377: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'

Mosquitto.log only mentions:

1710238223: Client Jablotron closed its connection.
1710238223: New connection from 192.168.*.*:62795 on port 1883.
1710238223: New client connected from 192.168.*.*:62795 as Jablotron (p2, c0, k10).
1710238223: New client connected from 192.168.*.*:62795 as Jablotron (p2, c0, k10).
1710238293: Client Watermeter closed its connection.
1710238293: New connection from 192.168.*.*:63095 on port 1883.
1710238293: New client connected from 192.168.*.*:63095 as Watermeter (p2, c0, k10).

tonhuisman commented 5 months ago

That looks like the ESPEasy web log. Can you (also) get the logging from USB serial? That's usually more complete; the JavaScript seems to 'lose' some of the log lines 😞

sincze commented 5 months ago

> That looks like the ESPEasy web log. Can you (also) get the logging from USB serial? That's usually more complete; the JavaScript seems to 'lose' some of the log lines 😞

If you would be so kind as to point me in the right direction on how to achieve that ;-)

TD-er commented 5 months ago

You can use just about any serial program. PuTTY is a good one to use. The default baud rate is 115200.

Or you can use the web flasher to connect to the ESP and click to use the console instead of flashing.

sincze commented 5 months ago

> You can use just about any serial program. PuTTY is a good one to use. The default baud rate is 115200.
>
> Or you can use the web flasher to connect to the ESP and click to use the console instead of flashing.

Ah, now I understand; that would require a physical connection to get a COM port. Currently that is not possible for me.

Let me see if I can set this up with the "communication gateway" to do it via LAN and PuTTY.

TD-er commented 5 months ago

You can also try sending to a syslog server. Or, if you want to keep using the web log, keep the tab open and in focus so the browser does not put it to sleep (Chrome especially does this).
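
If no syslog server is at hand, a few lines of Python can stand in for one. This is a minimal sketch, assuming the ESP sends plain UDP syslog datagrams to port 514 and that messages start with the usual `<PRI>` prefix. `split_priority` and `serve` are illustrative names, and the sample datagram is made up.

```python
import re
import socket

PRI_RE = re.compile(rb"^<(\d{1,3})>")


def split_priority(datagram):
    """Split a syslog datagram into (facility, severity, text).

    The PRI value encodes facility * 8 + severity. Datagrams without a
    PRI prefix are returned with facility and severity set to None.
    """
    match = PRI_RE.match(datagram)
    if match is None:
        return None, None, datagram.decode("utf-8", "replace")
    pri = int(match.group(1))
    text = datagram[match.end():].decode("utf-8", "replace")
    return pri // 8, pri % 8, text


def serve(host="0.0.0.0", port=514):
    """Print every UDP syslog datagram received (blocks until Ctrl+C)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, (ip, _port) = sock.recvfrom(4096)
            _facility, severity, text = split_priority(data)
            print(f"{ip} sev={severity} {text.rstrip()}")


# Hypothetical datagram, only to show the parsing step.
print(split_priority(b"<14>Jablotron EspEasy: MQTT : Connected"))
```

Calling `serve()` starts the listener. Binding to port 514 usually needs elevated privileges on Linux, so a higher port can be passed instead if the syslog port on the ESP side can be changed to match.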

sincze commented 5 months ago

> You can also try sending to a syslog server. Or, if you want to keep using the web log, keep the tab open and in focus so the browser does not put it to sleep (Chrome especially does this).

OK, good tips; I can work with that.

sincze commented 5 months ago

[image] OK, syslog is running.

TD-er commented 5 months ago

Did you already see any keep-alive packets being sent by ESPEasy when you don't send a message every N seconds? A few others have also reported MQTT disconnects, but those reports did not get extra attention as a possible change in ESPEasy, because those users had also had WiFi stability issues before.

Did you get Mosquitto installed along with some other package like Domoticz or did you install it yourself?

sincze commented 5 months ago

Mosquitto is installed as a docker container. It communicates with other dockers like Node-Red / Homeassistant / ZWAVE-JS-UI / ZIGBEE2MQTT and hardware Tasmota / EspEasy. Also Domoticz is installed (bare-metal and in a docker).

TD-er commented 5 months ago

You didn't build this Docker image yourself, I assume? So maybe this is a Docker container that uses a different default?

sincze commented 5 months ago

> You didn't build this Docker image yourself, I assume? So maybe this is a Docker container that uses a different default?

Correct: image: eclipse-mosquitto Docker Official Image 500M+ Downloads

[image]

Much the same as the web interface: no keep-alive packets being sent by ESPEasy.

TD-er commented 5 months ago

Keep-alive packets will not be logged. So to see whether ESPEasy actually sends them, you might need to look into the logs from the broker or capture network traffic.
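
When looking at captured traffic, the keep-alive packets are easy to recognise: in MQTT 3.1.1 a PINGREQ is the two bytes `C0 00` and the broker answers with a PINGRESP, `D0 00`. A small sketch for classifying a TCP payload taken from a capture; the helper names are made up, and it only inspects the first MQTT packet in the payload.

```python
# MQTT 3.1.1 control packet types (upper four bits of the first byte).
PACKET_TYPES = {
    1: "CONNECT", 2: "CONNACK", 3: "PUBLISH", 4: "PUBACK",
    5: "PUBREC", 6: "PUBREL", 7: "PUBCOMP", 8: "SUBSCRIBE",
    9: "SUBACK", 10: "UNSUBSCRIBE", 11: "UNSUBACK",
    12: "PINGREQ", 13: "PINGRESP", 14: "DISCONNECT",
}


def packet_type(payload):
    """Name the first MQTT control packet in a TCP payload."""
    if not payload:
        return None
    return PACKET_TYPES.get(payload[0] >> 4, "RESERVED")


def is_keepalive_ping(payload):
    """True if the payload starts with a PINGREQ (0xC0 0x00)."""
    return bytes(payload[:2]) == b"\xc0\x00"


print(packet_type(b"\xc0\x00"), packet_type(b"\xd0\x00"))
print(is_keepalive_ping(b"\xc0\x00"))
```

A single TCP segment can carry more than one MQTT packet, so a real analysis would also walk the Remaining Length field to step through the payload.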

sincze commented 5 months ago

I do find it fascinating that the ESP needs to deal with: [image]

domoticz/out messages it has nothing to do with. [image]

Will go through the settings here as well: https://github.com/eclipse/mosquitto/blob/master/mosquitto.conf

In addition, I rolled out a new Mosquitto Docker container on a different Pi and have 'Jablotron' communicate with that one (mosquitto version 2.0.18).

Well wiki how about that:

2761645: EVENT: Jablotron_RSSI#rssi=-58.00
2763980: WD : Uptime 46 ConnectFailures 0 FreeMem 19488 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2768829: EVENT: Rules#Timer=1,1
2768858: ACT : TimerSet,1,60
2768867: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
2793980: WD : Uptime 47 ConnectFailures 0 FreeMem 19760 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2794425: EVENT: Clock#Time=Tue,21:54
2821645: EVENT: Jablotron_RSSI#rssi=-57.00
2823981: WD : Uptime 47 ConnectFailures 0 FreeMem 20088 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2828860: EVENT: Rules#Timer=1,1
2828890: ACT : TimerSet,1,60
2828896: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
2853980: WD : Uptime 48 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2854547: EVENT: Clock#Time=Tue,21:55
2881644: EVENT: Jablotron_RSSI#rssi=-58.00
2883980: WD : Uptime 48 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2888891: EVENT: Rules#Timer=1,1
2888923: ACT : TimerSet,1,60
2888929: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
2913980: WD : Uptime 49 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2914425: EVENT: Clock#Time=Tue,21:56
2941643: EVENT: Jablotron_RSSI#rssi=-59.00
2943980: WD : Uptime 49 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2948924: EVENT: Rules#Timer=1,1
2948954: ACT : TimerSet,1,60
2948960: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-59.00'}'
2973980: WD : Uptime 50 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
2974425: EVENT: Clock#Time=Tue,21:57
3001643: EVENT: Jablotron_RSSI#rssi=-57.00
3003980: WD : Uptime 50 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3008955: EVENT: Rules#Timer=1,1
3008985: ACT : TimerSet,1,60
3008991: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
3033980: WD : Uptime 51 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3034526: EVENT: Clock#Time=Tue,21:58
3061644: EVENT: Jablotron_RSSI#rssi=-58.00
3063980: WD : Uptime 51 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3068986: EVENT: Rules#Timer=1,1
3069017: ACT : TimerSet,1,60
3069024: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-58.00'}'
3093980: WD : Uptime 52 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3094425: EVENT: Clock#Time=Tue,21:59
3121644: EVENT: Jablotron_RSSI#rssi=-56.00
3123980: WD : Uptime 52 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3129019: EVENT: Rules#Timer=1,1
3129049: ACT : TimerSet,1,60
3129055: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
3153980: WD : Uptime 53 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3154425: EVENT: Clock#Time=Tue,22:00
3181643: EVENT: Jablotron_RSSI#rssi=-56.00
3183980: WD : Uptime 53 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3189051: EVENT: Rules#Timer=1,1
3189082: ACT : TimerSet,1,60
3189089: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
3214015: WD : Uptime 54 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3214425: EVENT: Clock#Time=Tue,22:01
3241646: EVENT: Jablotron_RSSI#rssi=-57.00
3243980: WD : Uptime 54 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3249084: EVENT: Rules#Timer=1,1
3249114: ACT : TimerSet,1,60
3249122: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
3273981: WD : Uptime 55 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3274425: EVENT: Clock#Time=Tue,22:02
3301643: EVENT: Jablotron_RSSI#rssi=-57.00
3303980: WD : Uptime 55 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3309117: EVENT: Rules#Timer=1,1
3309147: ACT : TimerSet,1,60
3309153: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-57.00'}'
3333980: WD : Uptime 56 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3334425: EVENT: Clock#Time=Tue,22:03
3361645: EVENT: Jablotron_RSSI#rssi=-56.00
3363980: WD : Uptime 56 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3369148: EVENT: Rules#Timer=1,1
3369180: ACT : TimerSet,1,60
3369186: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
3393980: WD : Uptime 57 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3394425: EVENT: Clock#Time=Tue,22:04
3421644: EVENT: Jablotron_RSSI#rssi=-56.00
3423980: WD : Uptime 57 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3429181: EVENT: Rules#Timer=1,1
3429210: ACT : TimerSet,1,60
3429217: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-56.00'}'
3453980: WD : Uptime 58 ConnectFailures 0 FreeMem 20232 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
3454425: EVENT: Clock#Time=Tue,22:05

image

I applied the config of the working mosquitto to the mosquitto with the time-outs, but that did not resolve the issue.

TD-er commented 5 months ago

Just for the record, which version of Mosquitto did you run on the fetched Docker?

sincze commented 5 months ago

Just for the record, which version of Mosquitto did you run on the fetched Docker?

mosquitto version 2.0.18 on both (latest version).

sincze commented 5 months ago

What interests me most is why this only affects ESPEasy and not the other devices (like Tasmota).

Log from 00:00 - 09:00 (publish timer at 9 sec)

1710316040: New client connected from 192.168.*.*:53556 as SHELLY_P1_B1D3FA (p2, c1, k30, u'DVES_USER').
1710316194: New connection from 192.168.*.*:33481 on port 1883.
1710316194: New client connected from 192.168.*.*:33481 as auto-7CB139D0-AAC7-A83C-1A7B-37302542CC96 (p2, c1, k60).
1710316194: Client auto-7CB139D0-AAC7-A83C-1A7B-37302542CC96 closed its connection.
1710316282: Client Watermeter closed its connection.
1710316282: New connection from 192.168.2.25:52151 on port 1883.
1710316282: New client connected from 192.168.2.25:52151 as Watermeter (p2, c0, k10).
1710316673: Client SHELLY_P1_B1D3FA closed its connection.
1710316674: New connection from 192.168.*.*:50465 on port 1883.
1710316674: New client connected from 192.168.*.*:50465 as SHELLY_P1_B1D3FA (p2, c1, k30, u'DVES_USER').

Log from 09:00 - xx (publish timer at 30 sec)

1710316040: New client connected from 192.168.*.*:53556 as SHELLY_P1_B1D3FA (p2, c1, k30, u'DVES_USER').
1710316194: New connection from 192.168.*.*:33481 on port 1883.
1710316194: New client connected from 192.168.*.*:33481 as auto-7CB139D0-AAC7-A83C-1A7B-37302542CC96 (p2, c1, k60).
1710316194: Client auto-7CB139D0-AAC7-A83C-1A7B-37302542CC96 closed its connection.
1710316282: Client Watermeter closed its connection.
1710316282: New connection from 192.168.2.25:52151 on port 1883.
1710316282: New client connected from 192.168.2.25:52151 as Watermeter (p2, c0, k10).
1710316673: Client SHELLY_P1_B1D3FA closed its connection.
1710316674: New connection from 192.168.*.*:50465 on port 1883.
1710316674: New client connected from 192.168.*.*:50465 as SHELLY_P1_B1D3FA (p2, c1, k30, u'DVES_USER').
1710316791: Client Jablotron closed its connection.
1710316791: New connection from 192.168.*.*:55779 on port 1883.
1710316791: New client connected from 192.168.*.*:55779 as Jablotron (p2, c0, k10).
1710316942: Client Jablotron closed its connection.
1710316942: New connection from 192.168.*.*:53805 on port 1883.
1710316942: New client connected from 192.168.*.*:53805 as Jablotron (p2, c0, k10).
1710316992: Saving in-memory database to /mosquitto/data//mosquitto.db.
1710317002: Client Jablotron closed its connection.
1710317002: New connection from 192.168.*.*:61543 on port 1883.
1710317002: New client connected from 192.168.*.*:61543 as Jablotron (p2, c0, k10).
1710317087: Client Watermeter closed its connection.
1710317087: New connection from 192.168.*.*:58307 on port 1883.
1710317087: New client connected from 192.168.*.*:58307 as Watermeter (p2, c0, k10).
1710317141: Client Jablotron closed its connection.
1710317142: New connection from 192.168.*.*:61544 on port 1883.
1710317142: New client connected from 192.168.*.*:61544 as Jablotron (p2, c0, k10).
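
To compare clients in a log like this, a small Python sketch can pull out the connect lines and compute the time between reconnects per client. It assumes the mosquitto log format shown above, and that `c` is the clean-session flag and `k` the keep-alive in seconds (that is my reading of the log fields, not something confirmed in this thread). The function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g.:
# 1710316791: New client connected from 192.168.*.*:55779 as Jablotron (p2, c0, k10).
CONNECT_RE = re.compile(
    r"^(?P<ts>\d+): New client connected from \S+ as (?P<client>\S+) "
    r"\(p(?P<proto>\d+), c(?P<clean>\d), k(?P<keepalive>\d+)"
)


def summarize_connects(lines):
    """Collect keep-alive, clean-session flag and connect times per client."""
    summary = defaultdict(
        lambda: {"keepalive": None, "clean": None, "connects": []}
    )
    for line in lines:
        match = CONNECT_RE.match(line)
        if not match:
            continue  # not a "New client connected" line
        entry = summary[match["client"]]
        entry["keepalive"] = int(match["keepalive"])
        entry["clean"] = match["clean"] == "1"
        entry["connects"].append(int(match["ts"]))
    return dict(summary)


def reconnect_intervals(timestamps):
    """Seconds between consecutive connects of one client."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# The four Jablotron connect lines from the log above.
log_lines = [
    "1710316791: New client connected from 192.168.*.*:55779 as Jablotron (p2, c0, k10).",
    "1710316942: New client connected from 192.168.*.*:53805 as Jablotron (p2, c0, k10).",
    "1710317002: New client connected from 192.168.*.*:61543 as Jablotron (p2, c0, k10).",
    "1710317142: New client connected from 192.168.*.*:61544 as Jablotron (p2, c0, k10).",
]
info = summarize_connects(log_lines)["Jablotron"]
print(info["keepalive"], info["clean"], reconnect_intervals(info["connects"]))
# 10 False [151, 60, 140]
```

Under that reading, Jablotron and Watermeter both connect with `c0, k10`, while the Shelly connects with `c1, k30`.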

TD-er commented 5 months ago

Maybe Tasmota is using a shorter keep-alive setting?

sincze commented 5 months ago

Maybe Tasmota is using a shorter keep-alive setting?

Mmm 15?

https://github.com/arendst/Tasmota/blob/26a3eacbd63c04d05dbabd83a2189b2da39b5f11/lib/default/pubsubclient-2.8.13/src/PubSubClient.h

image


// MQTT_KEEPALIVE : keepAlive interval in Seconds. Override with setKeepAlive()
#ifndef MQTT_KEEPALIVE
#define MQTT_KEEPALIVE 15
#endif

// MQTT_SOCKET_TIMEOUT: socket timeout interval in Seconds. Override with setSocketTimeout()
#ifndef MQTT_SOCKET_TIMEOUT
#define MQTT_SOCKET_TIMEOUT 15
#endif
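
As a rough illustration of what the keep-alive value means on the broker side: per the MQTT 3.1.1 spec, a broker must drop a client when it has not received any control packet from it within 1.5 times the keep-alive interval. The Python sketch below applies that rule to a list of send times. The function names and the sample times are made up, and it ignores network delay.

```python
def broker_timeout(keepalive_s):
    """Seconds of client silence after which the broker must disconnect."""
    return 1.5 * keepalive_s


def first_timeout(send_times, keepalive_s):
    """Return the moment the broker would drop the client, or None.

    send_times: sorted times (in seconds) at which the client sent any
    MQTT control packet (PUBLISH, PINGREQ, ...).
    A keep-alive of 0 disables the mechanism.
    """
    if keepalive_s == 0:
        return None
    limit = broker_timeout(keepalive_s)
    for previous, current in zip(send_times, send_times[1:]):
        if current - previous > limit:
            return previous + limit
    return None


# A client with keep-alive 10 that only publishes every 30 s and sends no
# PINGREQ in between would be dropped 15 s after its first packet.
print(first_timeout([0, 30, 60], keepalive_s=10))      # 15.0
# The same client sending something every 10 s stays connected.
print(first_timeout([0, 10, 20, 30], keepalive_s=10))  # None
```

So with a publish timer of 30 or 60 seconds, the connection only survives if the client sends PINGREQ packets in between, whatever the exact keep-alive default is.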

v2.8 Tasmota https://github.com/arendst/Tasmota/blob/26a3eacbd63c04d05dbabd83a2189b2da39b5f11/lib/default/pubsubclient-2.8.13/CHANGES.txt

v2.7 ESPEasy https://github.com/letscontrolit/ESPEasy/blob/8cf90505608dd0e79299f55457e6252dec2998fe/lib/pubsubclient/CHANGES.txt

tonhuisman commented 5 months ago

It might be interesting to follow this PubSubClient issue: https://github.com/knolleary/pubsubclient/issues/1045 🤔

sincze commented 5 months ago

I noticed a lot of traffic (domoticz/out) just before I see the reconnect.

image

A "TCP window full" condition could mean that the receiving end of the TCP connection (ESPEasy) has filled its buffer and cannot accept any more data at the moment. This situation typically occurs when the receiver is unable to process data quickly enough, causing its buffer to become full.

ESPEasy is responding to the MQTT broker with a "TCP ZeroWindow", which could mean that its buffer for incoming data is completely full and it cannot accept any more data at the moment.

image

No clue what the link with domoticz/out would be.

2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (392 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (286 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (286 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (286 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (280 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (280 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (280 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (450 bytes))
2024-03-13T19:14:37: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (330 bytes))
2024-03-13T19:14:37: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (334 bytes))
2024-03-13T19:14:38: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (279 bytes))
2024-03-13T19:14:39: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (319 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (392 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (286 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (286 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (286 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (280 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (280 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (280 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (450 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (364 bytes))
2024-03-13T19:14:41: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (334 bytes))
2024-03-13T19:14:42: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (330 bytes))
2024-03-13T19:14:42: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (293 bytes))
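
To put a number on the burst, here is a small Python sketch that adds up the payload sizes the broker sends to one client per second. It assumes the log format shown above; the function name is hypothetical and the three sample lines are taken from that log.

```python
import re
from collections import defaultdict

# Matches e.g.:
# 2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))
PUBLISH_RE = re.compile(
    r"^(?P<ts>[^ ]+): Sending PUBLISH to (?P<client>\S+) \(.*\((?P<size>\d+) bytes\)\)"
)


def bytes_per_second(lines, client):
    """Sum the PUBLISH sizes sent to one client, grouped by log timestamp."""
    totals = defaultdict(int)
    for line in lines:
        match = PUBLISH_RE.match(line)
        if match and match["client"] == client:
            totals[match["ts"]] += int(match["size"])
    return dict(totals)


sample = [
    "2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (281 bytes))",
    "2024-03-13T19:14:36: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (392 bytes))",
    "2024-03-13T19:14:37: Sending PUBLISH to Jablotron (d0, q0, r0, m0, 'domoticz/out', ... (330 bytes))",
]
print(bytes_per_second(sample, "Jablotron"))
# {'2024-03-13T19:14:36': 673, '2024-03-13T19:14:37': 330}
```

Running it over the full log gives the burst size per second, which can then be compared with what the ESP is able to buffer.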

TD-er commented 5 months ago

Maybe related to some QoS flags? MQTT QoS flags define guarantees about subscribed clients receiving published messages. I could have a look to see if something has recently changed in the controller queue handling, where it might spend more time pushing out messages than processing received ones. Do you also subscribe to topics to which your devices publish? Or only to the topic(s) Domoticz publishes to?

Maybe Domoticz recently changed its QoS flags when sending messages?

Maybe there is more memory reserved on the 'working' docker image for delivering messages to subscribers? Maybe saving the state every 30 seconds is a bit too much? (in Mosquitto)

sincze commented 5 months ago

Maybe related to some QoS flags? MQTT QoS flags define guarantees about subscribed clients receiving published messages. I could have a look to see if something has recently changed in the controller queue handling, where it might spend more time pushing out messages than processing received ones. Do you also subscribe to topics to which your devices publish? Or only to the topic(s) Domoticz publishes to?

Maybe Domoticz recently changed its QoS flags when sending messages?

Maybe there is more memory reserved on the 'working' docker image for delivering messages to subscribers? Maybe saving the state every 30 seconds is a bit too much? (in Mosquitto)

The working docker has no active domoticz attached to it :)

But the question is: why would my ESPEasy (Jablotron) be interested in domoticz/out?

It only publishes to different topics and domoticz/in.

TD-er commented 5 months ago

The Domoticz MQTT controller subscribes to it so it can receive messages sent by Domoticz.

However it was a rather poor scaling design decision of Domoticz as you all publish to the same topic and subscribe to the same topic.
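
To illustrate the scaling problem: with one shared topic, every subscribed device receives every message and has to parse it before it can decide to throw it away. A minimal Python sketch, assuming the domoticz/out payloads are JSON objects carrying an `idx` field like the domoticz/in messages in the logs; the function name and the sample payloads are hypothetical.

```python
import json


def relevant_messages(payloads, own_idx):
    """Keep only the messages whose idx belongs to this device.

    Every payload still has to be received and parsed first; that is the
    cost each subscriber pays on a shared topic.
    """
    kept = []
    for raw in payloads:
        try:
            message = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if isinstance(message, dict) and message.get("idx") in own_idx:
            kept.append(message)
    return kept


# Hypothetical traffic on the shared topic: three updates, one for this device.
incoming = [
    '{"idx": 7, "nvalue": 0, "svalue1": "21.5"}',
    '{"idx": 1526, "nvalue": 0, "svalue1": "-58.00"}',
    '{"idx": 8, "nvalue": 1}',
]
mine = relevant_messages(incoming, own_idx={1526})
print(len(incoming), "received,", len(mine), "relevant")  # 3 received, 1 relevant
```

The more devices Domoticz knows about, the more irrelevant messages each subscriber has to take in.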

sincze commented 5 months ago

The Domoticz MQTT controller subscribes to it so it can receive messages sent by Domoticz.

However it was a rather poor scaling design decision of Domoticz as you all publish to the same topic and subscribe to the same topic.

You mean the MQTT controller in ESPEasy? I am using the OpenHAB one.

Screenshot_20240313_214832_Chrome.jpg

I used to use the Domoticz one in ESPEasy, but with my rules I stopped using it.

sincze commented 5 months ago

Screenshot_20240313_215539_Chrome.jpg

If my assumption is right, there is a dormant subscription to domoticz/out...

I added a Domoticz MQTT controller again and set it to subscribe to domoticz2/out.

Disabled the OpenHAB MQTT controller. Rebooted the ESP.

Now it should continue to work...

TD-er commented 5 months ago

Or you can check the checkbox "Unique Client ID on Reconnect:" or if your setup doesn't allow for unique client IDs, you could try "Clean Session:"
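
Why these two options could help, as a toy Python model of MQTT 3.1.1 session handling (not mosquitto's actual implementation): the broker stores subscriptions per client ID, and a client that reconnects with the same ID and without the clean-session flag gets its old subscriptions back, even if it no longer asks for them. The class and method names are made up, and the model ignores that a clean session is also discarded on disconnect.

```python
class ToyBroker:
    """Keeps one set of topic filters per client ID, like a persistent session."""

    def __init__(self):
        self._sessions = {}  # client_id -> set of subscribed topic filters

    def connect(self, client_id, clean_session):
        """Connect a client and return the subscriptions active right after."""
        if clean_session or client_id not in self._sessions:
            # Clean session (or unknown client ID): start from scratch.
            self._sessions[client_id] = set()
        return sorted(self._sessions[client_id])

    def subscribe(self, client_id, topic_filter):
        self._sessions[client_id].add(topic_filter)


broker = ToyBroker()
broker.connect("Jablotron", clean_session=False)
broker.subscribe("Jablotron", "domoticz/out")

# Same client ID, no clean session: the old subscription is still there.
print(broker.connect("Jablotron", clean_session=False))  # ['domoticz/out']
# Clean session: the stored subscriptions are discarded.
print(broker.connect("Jablotron", clean_session=True))   # []
# A new, unique client ID has no stored session at all.
print(broker.connect("Jablotron_2", clean_session=False))  # []
```

In this model both a clean session and a fresh client ID leave the reconnecting client without the old domoticz/out subscription.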

sincze commented 5 months ago

image

The theory still stands: no timeouts after the modification. (The modification is to re-add the Domoticz MQTT controller and subscribe it to the non-existent topic.)

So it could be the issue that:

Log after modification! No Timeouts

942720: EVENT: Rules#Timer=1,1
942749: ACT : TimerSet,1,30
942755: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
961665: EVENT: Jablotron_RSSI#rssi=-59.00
963969: WD : Uptime 16 ConnectFailures 0 FreeMem 20448 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
972750: EVENT: Rules#Timer=1,1
972779: ACT : TimerSet,1,30
972786: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-59.00'}'
982414: EVENT: Clock#Time=Wed,23:08
993969: WD : Uptime 17 ConnectFailures 0 FreeMem 20448 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1002781: EVENT: Rules#Timer=1,1
1002813: ACT : TimerSet,1,30
1002819: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-59.00'}'
1021668: EVENT: Jablotron_RSSI#rssi=-61.00
1023969: WD : Uptime 17 ConnectFailures 0 FreeMem 20448 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1032814: EVENT: Rules#Timer=1,1
1032843: ACT : TimerSet,1,30
1032850: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
1042414: EVENT: Clock#Time=Wed,23:09
1053969: WD : Uptime 18 ConnectFailures 0 FreeMem 20448 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1062845: EVENT: Rules#Timer=1,1
1062877: ACT : TimerSet,1,30
1062883: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
1081666: EVENT: Jablotron_RSSI#rssi=-60.00
1083969: WD : Uptime 18 ConnectFailures 0 FreeMem 20448 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1093008: EVENT: Rules#Timer=1,1
1093037: ACT : TimerSet,1,30
1093045: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1102414: EVENT: Clock#Time=Wed,23:10
1113969: WD : Uptime 19 ConnectFailures 0 FreeMem 20448 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1123038: EVENT: Rules#Timer=1,1
1123067: ACT : TimerSet,1,30
1123074: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1126199: ESPEasy console using ESPEasySerial
1141667: EVENT: Jablotron_RSSI#rssi=-60.00
1143969: WD : Uptime 19 ConnectFailures 0 FreeMem 21024 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1153069: EVENT: Rules#Timer=1,1
1153099: ACT : TimerSet,1,30
1153106: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1162414: EVENT: Clock#Time=Wed,23:11
1173969: WD : Uptime 20 ConnectFailures 0 FreeMem 20920 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1183100: EVENT: Rules#Timer=1,1
1183130: ACT : TimerSet,1,30
1183136: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1201664: EVENT: Jablotron_RSSI#rssi=-60.00
1203969: WD : Uptime 20 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1213131: EVENT: Rules#Timer=1,1
1213160: ACT : TimerSet,1,30
1213167: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1222414: EVENT: Clock#Time=Wed,23:12
1233969: WD : Uptime 21 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1243162: EVENT: Rules#Timer=1,1
1243194: ACT : TimerSet,1,30
1243200: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1261664: EVENT: Jablotron_RSSI#rssi=-60.00
1263996: WD : Uptime 21 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1273195: EVENT: Rules#Timer=1,1
1273224: ACT : TimerSet,1,30
1273231: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1282414: EVENT: Clock#Time=Wed,23:13
1293969: WD : Uptime 22 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1303226: EVENT: Rules#Timer=1,1
1303257: ACT : TimerSet,1,30
1303264: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-60.00'}'
1321662: EVENT: Jablotron_RSSI#rssi=-61.00
1323969: WD : Uptime 22 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1333259: EVENT: Rules#Timer=1,1
1333288: ACT : TimerSet,1,30
1333295: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
1342414: EVENT: Clock#Time=Wed,23:14
1353969: WD : Uptime 23 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1363290: EVENT: Rules#Timer=1,1
1363322: ACT : TimerSet,1,30
1363329: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
1381667: EVENT: Jablotron_RSSI#rssi=-61.00
1383969: WD : Uptime 23 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1393323: EVENT: Rules#Timer=1,1
1393352: ACT : TimerSet,1,30
1393359: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
1402521: EVENT: Clock#Time=Wed,23:15
1413969: WD : Uptime 24 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1423354: EVENT: Rules#Timer=1,1
1423385: ACT : TimerSet,1,30
1423392: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-61.00'}'
1441665: EVENT: Jablotron_RSSI#rssi=-59.00
1443970: WD : Uptime 24 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1453387: EVENT: Rules#Timer=1,1
1453417: ACT : TimerSet,1,30
1453423: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-59.00'}'
1462414: EVENT: Clock#Time=Wed,23:16
1473969: WD : Uptime 25 ConnectFailures 0 FreeMem 20632 WiFiStatus: 3 ESPeasy internal wifi status: Conn. IP Init
1483418: EVENT: Rules#Timer=1,1
1483449: ACT : TimerSet,1,30
1483456: ACT : Publish domoticz/in,'{'command':'udevice', 'idx':1526, 'svalue':'-59.00'}'

TD-er commented 5 months ago

And what if you check the checkbox to start with a clean session? I'm not sure how you can unsubscribe from a topic you're not subscribed to. Maybe for each incoming message we could check whether we are subscribed to its topic and, if not, actively unsubscribe? However, I think it should be the broker's responsibility to get rid of subscriptions when a client is disconnected or gets reconnected.