Closed: giig1967g closed this issue 5 years ago
Client IP is your own IP.
I have seen some routers (even data-center-grade ones) refusing routes when their clients did not perform regular DHCP requests. This was mainly an issue with IPv6, but maybe something similar is happening here at a lower level. Maybe the ESP core libraries don't send some needed low-level packets (like ARP packets) when DHCP is disabled.
One other user reported similar issues were not happening anymore since he upgraded the core libraries to some newer version than currently included in platformio.
hi, I will test with DHCP and report if the problem persists.
I may have the same issue: I set a static IP, but the address is always fetched from the DHCP server and used from there. I will try to find where the problem could be, but first I need to find some time...
Hi all, I have been testing the problem for a long time with several units and configurations.
The result is: a unit running 20180826 (the latest build) with MQTT disabled. After a couple of days of operation, it suddenly lost connection with the outside world (no ping, no web access). According to the router, the unit was still connected (no sign of a disconnection in the router's log). I waited a couple of hours and then switched the wireless radio in the router off and on, and all of a sudden the unit rebooted by itself. Reset reason:
Boot | Manual reboot (10)
Reset Reason | Exception
How can toggling the router wireless radio make the unit reboot?
I forgot: it's configured as DHCP. Router is: MIKROTIK
More test results, in this case with MQTT. Two other nodes with the same firmware version (but with MQTT enabled) behave as follows: they continuously disconnect from MQTT and reconnect. Sometimes reconnecting takes a few milliseconds, sometimes it takes minutes. Nothing appears in the router's log. When MQTT is disconnected I cannot access the web server, nor does the unit respond to ping. Again the router is a Mikrotik.
I have another similar setup with an ASUS router and I don't have the same problems.
See mosquitto.log file (I have literally thousands of those entries in my log):
1535636813: Client ESPT3_3 has exceeded timeout, disconnecting.
1535636813: Socket error on client ESPT3_3, disconnecting.
1535636814: New connection from 192.168.88.203 on port 1883.
1535636814: New client connected from 192.168.88.203 as ESPT3_3 (c1, k10, u'openhabian').
1535636964: Client ESPT3_3 has exceeded timeout, disconnecting.
1535636964: Socket error on client ESPT3_3, disconnecting.
1535636965: New connection from 192.168.88.203 on port 1883.
1535636965: New client connected from 192.168.88.203 as ESPT3_3 (c1, k10, u'openhabian').
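To put a number on how often these timeouts occur, the log lines can be summarized with a short script. This is only a sketch: it assumes the line format shown in the excerpt above, and the function name `timeout_gaps` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "1535636813: Client ESPT3_3 has exceeded timeout, disconnecting."
TIMEOUT_RE = re.compile(r"^(\d+): Client (\S+) has exceeded timeout, disconnecting\.")


def timeout_gaps(lines):
    """Return {client_id: [seconds between consecutive timeout disconnects]}."""
    stamps = defaultdict(list)
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            stamps[match.group(2)].append(int(match.group(1)))
    return {
        client: [later - earlier for earlier, later in zip(times, times[1:])]
        for client, times in stamps.items()
    }


# Lines copied from the mosquitto.log excerpt above.
sample = [
    "1535636813: Client ESPT3_3 has exceeded timeout, disconnecting.",
    "1535636813: Socket error on client ESPT3_3, disconnecting.",
    "1535636814: New connection from 192.168.88.203 on port 1883.",
    "1535636964: Client ESPT3_3 has exceeded timeout, disconnecting.",
]
print(timeout_gaps(sample))
```

For the two timeout entries in the excerpt this reports a gap of 151 seconds for ESPT3_3.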
Adding more clues: when the above situation happens, if I switch the wireless radio of the router off and on, the units that were disconnected reboot and start working again.
Another clue: in one single unit, in the last two hours I got 21 MQTT#Connect / Disconnect events and 2 Wifi#Connect/Disconnect events.
In case of WiFi disconnections, one was reported also by router's log with the following description:
XX:XX:XX:XX:XX:XX@wlan1: disconnected, received deauth: sending station leaving (3)
And the other was not tracked in the router's log.
In case of MQTT all errors were reported in the following way:
1535638792: Client ESPT3_3 has exceeded timeout, disconnecting.
1535638792: Socket error on client ESPT3_3, disconnecting.
1535638799: New connection from 192.168.88.203 on port 1883.
1535638799: New client connected from 192.168.88.203 as ESPT3_3 (c1, k10, u'openhabian').
What is the set timeout on the broker?
As I understand it, the timeout is set by the client during the connect, and it should be 10 seconds (k10 in the connection string: see log).
In April/May it was 15 seconds, but it has been reduced to 10 (I don't know why)
In version mega-20180330 it was 15 seconds.
No, ESPEasy is the client; I mean on the broker side (Mosquitto, for example). I set it to a lower value, because the Mosquitto default setting is 10 seconds and that may cause a lot of disconnects when ESPEasy is using 15 sec.
In the config file there is nothing set. So I assume it's using the defaults.
But from what I have read online, my understanding is that it is the client that defines the timeout:
You do not configure the keep alive on the broker, it is configured on the client side.
The value is passed in the connect packet from the client to the broker (http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Keep_Alive)
How you configure this will depend on which client library you are using, but most libraries take it as a configuration option.
E.g. for libmosquitto you pass the keep alive value in seconds to the mosquitto_connect function (https://mosquitto.org/man/libmosquitto-3.html#idm46181896216640)
int mosquitto_connect(struct mosquitto *mosq, const char *host, int port, int keepalive);
Also you normally will not have to publish a message; the client library should send ping packets if no messages have been sent/received in the keep alive period in order to keep the connection alive.
The Keep Alive is a time interval measured in seconds. Expressed as a 16-bit word, it is the maximum time interval that is permitted to elapse between the point at which the Client finishes transmitting one Control Packet and the point it starts sending the next. It is the responsibility of the Client to ensure that the interval between Control Packets being sent does not exceed the Keep Alive value. In the absence of sending any other Control Packets, the Client MUST send a PINGREQ Packet [MQTT-3.1.2-23].
The Client can send PINGREQ at any time, irrespective of the Keep Alive value, and use the PINGRESP to determine that the network and the Server are working.
In other words: we can set any value, but the client must make sure it sends a Control Packet within that interval.
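The broker side of this rule can be sketched too. The same MQTT 3.1.1 section also states that the server must drop a client it has not heard from within one and a half times the Keep Alive period, and that a Keep Alive of zero switches the mechanism off. The helper names below (`broker_deadline`, `is_timed_out`) are made up for illustration and are not part of mosquitto or ESPEasy.

```python
def broker_deadline(keep_alive_s, grace_factor=1.5):
    """Seconds of client silence after which the broker drops the connection.

    A keep alive of 0 disables the mechanism, so there is no deadline.
    """
    if keep_alive_s == 0:
        return None
    return keep_alive_s * grace_factor


def is_timed_out(last_packet_ts, now_ts, keep_alive_s):
    """True if the client has been silent for longer than the broker tolerates."""
    deadline = broker_deadline(keep_alive_s)
    return deadline is not None and (now_ts - last_packet_ts) > deadline


# k10 from the mosquitto log versus the 15 seconds used in mega-20180330.
print(broker_deadline(10), broker_deadline(15))
```

With k10 the client therefore has 15 seconds to get a packet (a PINGREQ if nothing else) to the broker before "has exceeded timeout" is logged.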
More tests with a lot of hooks and log analysis:
I have identified two cases that happen on two units:
1) The unit disconnects and reconnects almost immediately to MQTT. This can happen 10 times in 10 minutes in a row and then not at all for 2 or 3 hours. WiFi does not disconnect.
2) The unit disconnects from MQTT and does not try to reconnect (according to mosquitto.log). In this case WiFi disconnects too. This second case is the one where the unit works "internally" with its rules but cannot communicate with the outside. In this case, if I toggle the WiFi on the router, the unit reboots with an exception.
Router is a Mikrotik.
Unit 1 has the following plugins: switch, pcf8574, sysinfo, dummy.
Unit 2 has the following plugins: 2x pcf8574, sysinfo, dummy.
Both have long wire connections (15 meters of wire) from the unit itself to the physical buttons inside the walls. But when they freeze, no one is touching the buttons and there is no activity at all, also because the units are currently in a holiday home where no one is living (they control the automatic lights and heating).
Other possible ideas: interference coming from the wires... incompatibility with router...
Any other thoughts?
I just added this commit: https://github.com/letscontrolit/ESPEasy/pull/1669/commits/8350e09f5d4d6768f80f8b2b64aab4256eab7a33 I made it part of this PR: https://github.com/letscontrolit/ESPEasy/pull/1669 So maybe you could either run the code of this commit in your build, or try that PR. It may give a bit more insight in what's happening with MQTT.
you mean loading the firmware PR1669_test_ESP8266_4096_VCC.bin in my unit?
But how can I read the log when the unit disconnects? The web server does not work and neither does syslog...
No it was a new commit, I just finished at the moment you posted that. So you either have to build it, or wait until I made a build for it.
ok, I can build it.
no, I haven't been able to build it. Will wait for your build and test it.
I just tested a build script on my machine, so don't pay attention to the strange filenames. https://www.dropbox.com/s/dlvj7f1x3wlos7u/PR1669_test_ESP8266_4096_VCC-g8350e09.zip?dl=0 This is based on the current state of the PR, with this as last commit:
commit 8350e09f5d4d6768f80f8b2b64aab4256eab7a33 Author: TD-er gijs.noorlander@gmail.com Date: Thu Aug 30 23:17:18 2018 +0200
[MQTT] Give MQTT state in log when connect state changes.
Hi, I just flashed my unit 20 minutes ago. A disconnection has already happened twice, at log timestamps 1111863 and 1203752. The "ConnectFailures" counter is also increasing:
27078: Command: taskrun
27081: Dummy: value 1: 1.00
27081: Dummy: value 2: 1.00
27081: Dummy: value 3: 0.00
27081: Dummy: value 4: 0.00
27084: EVENT: Relay1#r1=1.00
27197: EVENT: Relay1#r2=1.00
27310: EVENT: Relay1#=0.00
27422: EVENT: Relay1#=0.00
40584: WD : Uptime 1 ConnectFailures 0 FreeMem 20888
49942: EVENT: Clock#Time=Fri,09:32
70584: WD : Uptime 1 ConnectFailures 0 FreeMem 20768
105709: WD : Uptime 2 ConnectFailures 0 FreeMem 20168
109709: EVENT: Clock#Time=Fri,09:33
130584: WD : Uptime 2 ConnectFailures 0 FreeMem 19272
160585: WD : Uptime 3 ConnectFailures 0 FreeMem 19024
170012: EVENT: Clock#Time=Fri,09:34
190585: WD : Uptime 3 ConnectFailures 0 FreeMem 19328
220584: WD : Uptime 4 ConnectFailures 0 FreeMem 20840
230012: EVENT: Clock#Time=Fri,09:35
252840: WD : Uptime 4 ConnectFailures 0 FreeMem 17984
280589: WD : Uptime 5 ConnectFailures 0 FreeMem 20840
289591: EVENT: Clock#Time=Fri,09:36
310586: WD : Uptime 5 ConnectFailures 0 FreeMem 19016
340586: WD : Uptime 6 ConnectFailures 0 FreeMem 20528
349590: EVENT: Clock#Time=Fri,09:37
370590: WD : Uptime 6 ConnectFailures 0 FreeMem 18960
400587: WD : Uptime 7 ConnectFailures 0 FreeMem 19272
409590: EVENT: Clock#Time=Fri,09:38
430587: WD : Uptime 7 ConnectFailures 0 FreeMem 20840
460585: WD : Uptime 8 ConnectFailures 0 FreeMem 20528
469590: EVENT: Clock#Time=Fri,09:39
490588: WD : Uptime 8 ConnectFailures 0 FreeMem 20528
520585: WD : Uptime 9 ConnectFailures 0 FreeMem 20840
529592: EVENT: Clock#Time=Fri,09:40
550585: WD : Uptime 9 ConnectFailures 0 FreeMem 20840
580586: WD : Uptime 10 ConnectFailures 0 FreeMem 20840
589590: EVENT: Clock#Time=Fri,09:41
610588: WD : Uptime 10 ConnectFailures 0 FreeMem 20840
614607: BMP280 : Address: 0x76
614608: BMP280 : Temperature: 32.84
614608: BMP280 : Barometric Pressure: 1012.02
614610: EVENT: bme280#Temperature=32.84
614719: EVENT: bme280#Humidity=0.00
614826: EVENT: bme280#Pressure=1012.02
614974: Dummy: value 1: 1.00
614974: Dummy: value 2: 1.00
614975: Dummy: value 3: 0.00
614975: Dummy: value 4: 0.00
614977: EVENT: Relay1#r1=1.00
615089: EVENT: Relay1#r2=1.00
615200: EVENT: Relay1#=0.00
615314: EVENT: Relay1#=0.00
640585: WD : Uptime 11 ConnectFailures 0 FreeMem 20840
649590: EVENT: Clock#Time=Fri,09:42
670584: WD : Uptime 11 ConnectFailures 0 FreeMem 20840
700584: WD : Uptime 12 ConnectFailures 0 FreeMem 19272
709590: EVENT: Clock#Time=Fri,09:43
730584: WD : Uptime 12 ConnectFailures 0 FreeMem 19272
760584: WD : Uptime 13 ConnectFailures 0 FreeMem 20840
769590: EVENT: Clock#Time=Fri,09:44
790584: WD : Uptime 13 ConnectFailures 0 FreeMem 20840
820586: WD : Uptime 14 ConnectFailures 0 FreeMem 19944
829590: EVENT: Clock#Time=Fri,09:45
850584: WD : Uptime 14 ConnectFailures 0 FreeMem 19272
880584: WD : Uptime 15 ConnectFailures 0 FreeMem 20824
889840: EVENT: Clock#Time=Fri,09:46
910762: WD : Uptime 15 ConnectFailures 0 FreeMem 17968
940584: WD : Uptime 16 ConnectFailures 0 FreeMem 20824
949760: EVENT: Clock#Time=Fri,09:47
970585: WD : Uptime 16 ConnectFailures 0 FreeMem 19704
1000584: WD : Uptime 17 ConnectFailures 0 FreeMem 17776
1009932: EVENT: Clock#Time=Fri,09:48
1014950: SYS : 17.00
1014954: EVENT: SysInfo#UptimeDays=0.01
1030584: WD : Uptime 17 ConnectFailures 0 FreeMem 19088
1060584: WD : Uptime 18 ConnectFailures 0 FreeMem 19312
1069931: EVENT: Clock#Time=Fri,09:49
1090584: WD : Uptime 18 ConnectFailures 0 FreeMem 19704
1111863: MQTT : Connection lost, state: Connection lost
1111864: EVENT: MQTT#Disconnected
1111888: ACT : timerSet,1,300
1114831: MQTT : Connected to broker with client ID: ESPT3_3
1114834: Subscribed to: /ESPT3/#
1114835: EVENT: MQTT#Connected
1114854: ACT : publish /Alarm/ESPT3/MQTTstatus,Connected
1114871: ACT : timerSet,1,0
1114884: ACT : timerSet,2,1
1115019: Command: timerset
1115020: Command: publish
1115026: Command: timerset
1115027: Command: timerset
1117010: EVENT: Rules#Timer=2
1117041: ACT : taskrun,1
1117152: Command: taskrun
1117156: Dummy: value 1: 1.00
1117157: Dummy: value 2: 1.00
1117157: Dummy: value 3: 0.00
1117157: Dummy: value 4: 0.00
1117160: EVENT: Relay1#r1=1.00
1117272: EVENT: Relay1#r2=1.00
1117383: EVENT: Relay1#=0.00
1117493: EVENT: Relay1#=0.00
1120584: WD : Uptime 19 ConnectFailures 2 FreeMem 20824
1130010: EVENT: Clock#Time=Fri,09:50
1150585: WD : Uptime 19 ConnectFailures 2 FreeMem 19312
1180586: WD : Uptime 20 ConnectFailures 2 FreeMem 20824
1190011: EVENT: Clock#Time=Fri,09:51
1203752: MQTT : Connection lost, state: Connection lost
1203753: EVENT: MQTT#Disconnected
1203776: ACT : timerSet,1,300
1208909: MQTT : Failed to connect to broker
1208944: Command: timerset
1214161: MQTT : Failed to connect to broker
1214192: WD : Uptime 20 ConnectFailures 6 FreeMem 19256
1214878: Dummy: value 1: 1.00
1214879: Dummy: value 2: 1.00
1214879: Dummy: value 3: 0.00
1214879: Dummy: value 4: 0.00
1214882: EVENT: Relay1#r1=1.00
1214998: EVENT: Relay1#r2=1.00
1215112: EVENT: Relay1#=0.00
1215224: EVENT: Relay1#=0.00
1222251: Command: publish
1222252: Command: timerset
1222253: Command: timerset
1224247: EVENT: Rules#Timer=2
1224279: ACT : taskrun,1
1224393: Command: taskrun
1224398: Dummy: value 1: 1.00
1224399: Dummy: value 2: 1.00
1224399: Dummy: value 3: 0.00
1224399: Dummy: value 4: 0.00
1224402: EVENT: Relay1#r1=1.00
1224521: EVENT: Relay1#r2=1.00
1224638: EVENT: Relay1#=0.00
1224754: EVENT: Relay1#=0.00
1240584: WD : Uptime 21 ConnectFailures 8 FreeMem 20288
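The growth of the ConnectFailures counter is easier to follow when only the changes are listed. The following is a rough sketch that assumes the "WD :" line format seen in this log; `failure_jumps` is a hypothetical helper name, not something provided by ESPEasy.

```python
import re

# Matches watchdog lines such as:
# "1120584: WD : Uptime 19 ConnectFailures 2 FreeMem 20824"
WD_RE = re.compile(
    r"^(\d+):\s*WD\s*:\s*Uptime\s+(\d+)\s+ConnectFailures\s+(\d+)\s+FreeMem\s+(\d+)"
)


def failure_jumps(lines):
    """Return (millis, previous_count, new_count) each time ConnectFailures changes."""
    jumps = []
    previous = None
    for line in lines:
        match = WD_RE.match(line.strip())
        if not match:
            continue
        millis, failures = int(match.group(1)), int(match.group(3))
        if previous is not None and failures != previous:
            jumps.append((millis, previous, failures))
        previous = failures
    return jumps


# Watchdog lines copied from the log above.
sample_log = [
    "1090584: WD : Uptime 18 ConnectFailures 0 FreeMem 19704",
    "1111863: MQTT : Connection lost, state: Connection lost",
    "1120584: WD : Uptime 19 ConnectFailures 2 FreeMem 20824",
    "1150585: WD : Uptime 19 ConnectFailures 2 FreeMem 19312",
    "1214192: WD : Uptime 20 ConnectFailures 6 FreeMem 19256",
    "1240584: WD : Uptime 21 ConnectFailures 8 FreeMem 20288",
]
for jump in failure_jumps(sample_log):
    print(jump)
```

On these lines the counter steps from 0 to 2 after the first disconnect and from 2 to 6 and then 8 around the second one.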
One thing I cannot understand: during the disconnection window I cannot open a new page of the webserver for the unit, but the log keeps updating...
After a few minutes the unit crashed with a Hardware Watchdog reset:
1264462: Command: publish
1264465: Command: timerset
1264469: Command: timerset
1266461: EVENT: Rules#Timer=2
1266491: ACT : taskrun,1
1266602: Command: taskrun
1266607: Dummy: value 1: 1.00
1266607: Dummy: value 2: 1.00
1266608: Dummy: value 3: 0.00
1266608: Dummy: value 4: 0.00
1266610: EVENT: Relay1#r1=1.00
1266729: EVENT: Relay1#r2=1.00
1266843: EVENT: Relay1#=0.00
1266958: EVENT: Relay1#=0.00
1270586: WD : Uptime 21 ConnectFailures 12 FreeMem 20824
1300584: WD : Uptime 22 ConnectFailures 12 FreeMem 20824
1309260: EVENT: Clock#Time=Fri,09:53
1330584: WD : Uptime 22 ConnectFailures 12 FreeMem 20824
1360597: WD : Uptime 23 ConnectFailures 12 FreeMem 19256
1369260: EVENT: Clock#Time=Fri,09:54
1390585: WD : Uptime 23 ConnectFailures 12 FreeMem 20600
1420584: WD : Uptime 24 ConnectFailures 12 FreeMem 19928
1429260: EVENT: Clock#Time=Fri,09:55
1450584: WD : Uptime 24 ConnectFailures 12 FreeMem 19256
1480586: WD : Uptime 25 ConnectFailures 12 FreeMem 20824
1489260: EVENT: Clock#Time=Fri,09:56
1510584: WD : Uptime 25 ConnectFailures 12 FreeMem 20600
1540586: WD : Uptime 26 ConnectFailures 12 FreeMem 20824
1549260: EVENT: Clock#Time=Fri,09:57
1570586: WD : Uptime 26 ConnectFailures 12 FreeMem 20512
1600585: WD : Uptime 27 ConnectFailures 12 FreeMem 20600
1609260: EVENT: Clock#Time=Fri,09:58
1630584: WD : Uptime 27 ConnectFailures 12 FreeMem 20824
1660587: WD : Uptime 28 ConnectFailures 12 FreeMem 19928
1669260: EVENT: Clock#Time=Fri,09:59
1690584: WD : Uptime 28 ConnectFailures 12 FreeMem 20824
1720587: WD : Uptime 29 ConnectFailures 12 FreeMem 20824
1729260: EVENT: Clock#Time=Fri,10:00
1750585: WD : Uptime 29 ConnectFailures 12 FreeMem 20824
5140: EVENT: WiFi#Connected
5159: ACT : publish /Alarm/ESPT3/WiFistatus,Connected
5257: Webserver: start
5675: Current Time Zone: DST time start: 2018-03-25 02:00:00 offset: 120 minSTD time start: 2018-10-28 03:00:00 offset: 60 min
5678: EVENT: Time#Initialized
5720: ACT : taskvalueset 11,1,1
5737: ACT : timerSet,5,10
5843: MQTT : Intentional reconnect
5874: MQTT : Connected to broker with client ID: ESPT3_3
5876: Subscribed to: /ESPT3/#
5878: EVENT: MQTT#Connected
5904: ACT : publish /Alarm/ESPT3/MQTTstatus,Connected
5923: ACT : timerSet,1,0
5941: ACT : timerSet,2,1
6306: Command: timerset
6306: Command: timerset
6378: Dummy: value 1: 1.00
6378: Dummy: value 2: 1.00
6378: Dummy: value 3: 0.00
6379: Dummy: value 4: 0.00
6381: EVENT: Relay1#r1=1.00
6501: EVENT: Relay1#r2=1.00
6617: EVENT: Relay1#=0.00
6733: EVENT: Relay1#=0.00
6891: SYS : 0.00
6895: EVENT: SysInfo#UptimeDays=0.00
7046: SW : State 0.00
7049: EVENT: p1#Switch=0.00
7311: SW : State 0.00
7313: EVENT: p2#Switch=0.00
7472: Dummy: value 1: 1.00
7472: Dummy: value 2: 1.00
7473: Dummy: value 3: 0.00
7473: Dummy: value 4: 0.00
7476: EVENT: config#ntpconnected=1.00
7599: EVENT: config#auto=1.00
7719: EVENT: config#rebootWD=0.00
7840: EVENT: config#aftertensec=0.00
7975: EVENT: Rules#Timer=2
8009: ACT : taskrun,1
8236: Command: taskrun
8242: Dummy: value 1: 1.00
8242: Dummy: value 2: 1.00
8242: Dummy: value 3: 0.00
8242: Dummy: value 4: 0.00
8245: EVENT: Relay1#r1=1.00
8371: EVENT: Relay1#r2=1.00
8494: EVENT: Relay1#=0.00
8613: EVENT: Relay1#=0.00
12818: EVENT: Rules#Timer=4
12851: ACT : taskvalueset,11,4,1
12958: Command: taskvalueset
17271: ACT : PCFGPIO,67,1
17278: PCF : GPIO 67 Set to 1
17281: ACT : taskvalueset 1,1,1
17295: ACT : timerSet,2,1
17341: Command: event
17342: EVENT: R2off
17449: ACT : PCFGPIO,68,1
17457: PCF : GPIO 68 Set to 1
17460: ACT : taskvalueset 1,2,1
17474: ACT : timerSet,2,1
17515: Command: taskvalueset
17516: Command: timerset
17517: Command: taskvalueset
17518: Command: timerset
18818: EVENT: Rules#Timer=2
18852: ACT : taskrun,1
18972: Command: taskrun
18975: Dummy: value 1: 1.00
18975: Dummy: value 2: 1.00
18975: Dummy: value 3: 0.00
18975: Dummy: value 4: 0.00
18978: EVENT: Relay1#r1=1.00
19096: EVENT: Relay1#r2=1.00
19213: EVENT: Relay1#=0.00
19330: EVENT: Relay1#=0.00
28817: EVENT: Clock#Time=Fri,10:01
32370: WD : Uptime 1 ConnectFailures 0 FreeMem 19288
62370: WD : Uptime 1 ConnectFailures 0 FreeMem 20856
88817: EVENT: Clock#Time=Fri,10:02
92370: WD : Uptime 2 ConnectFailures 0 FreeMem 20632
122370: WD : Uptime 2 ConnectFailures 0 FreeMem 20856
148817: EVENT: Clock#Time=Fri,10:03
152370: WD : Uptime 3 ConnectFailures 0 FreeMem 18976
182372: WD : Uptime 3 ConnectFailures 0 FreeMem 20776
208817: EVENT: Clock#Time=Fri,10:04
212370: WD : Uptime 4 ConnectFailures 0 FreeMem 20856
242372: WD : Uptime 4 ConnectFailures 0 FreeMem 20856
268817: EVENT: Clock#Time=Fri,10:05
272370: WD : Uptime 5 ConnectFailures 0 FreeMem 20856
302372: WD : Uptime 5 ConnectFailures 0 FreeMem 20856
328817: EVENT: Clock#Time=Fri,10:06
332370: WD : Uptime 6 ConnectFailures 0 FreeMem 19960
362370: WD : Uptime 6 ConnectFailures 0 FreeMem 20856
388818: EVENT: Clock#Time=Fri,10:07
392370: WD : Uptime 7 ConnectFailures 0 FreeMem 20856
422370: WD : Uptime 7 ConnectFailures 0 FreeMem 19736
448817: EVENT: Clock#Time=Fri,10:08
452370: WD : Uptime 8 ConnectFailures 0 FreeMem 20856
482372: WD : Uptime 8 ConnectFailures 0 FreeMem 20856
508817: EVENT: Clock#Time=Fri,10:09
512370: WD : Uptime 9 ConnectFailures 0 FreeMem 20856
This is the result from a second unit. It shows different behaviour (case 2 of my earlier email):
At 10:35 this unit stops responding and disconnects from MQTT and WiFi. But the internal rules still work, because my internal watchdog makes it reboot after 5 minutes of disconnection.
See log:
41913887: EVENT: Clock#Time=Fri,10:33
41942677: WD : Uptime 699 ConnectFailures 0 FreeMem 21128
41972677: WD : Uptime 700 ConnectFailures 0 FreeMem 21352
41973889: EVENT: Clock#Time=Fri,10:34
42002679: WD : Uptime 700 ConnectFailures 0 FreeMem 21352
42010405: Dummy: value 1: 1.00
42010405: Dummy: value 2: 1.00
42010406: Dummy: value 3: 1.00
42010406: Dummy: value 4: 1.00
42010409: EVENT: Relay1#r1=1.00
42011296: EVENT: Relay1#r2=1.00
42012183: EVENT: Relay1#r3=1.00
42013070: EVENT: Relay1#r4=1.00
42014026: Dummy: value 1: 1.00
42014027: Dummy: value 2: 1.00
42014027: Dummy: value 3: 1.00
42014027: Dummy: value 4: 1.00
42014030: EVENT: Relay2#r5=1.00
42014917: EVENT: Relay2#r6=1.00
42015804: EVENT: Relay2#r7=1.00
42016692: EVENT: Relay2#r8=1.00
42017617: SYS : 700.00
42017621: EVENT: SysInfo#UptimeDays=0.49
42032672: WD : Uptime 701 ConnectFailures 0 FreeMem 21352
42033724: EVENT: Clock#Time=Fri,10:35
42062672: WD : Uptime 701 ConnectFailures 0 FreeMem 21128
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
>> NetworkError when attempting to fetch resource. <<
7006: WIFI : Connected! AP: Varazze-2G (CC:2D:E0:2B:7E:D6) Ch: 7 Duration: 4448 ms
7006: EVENT: WiFi#ChangedAccesspoint
7882: WIFI : DHCP IP: 192.168.88.202 (ESPT2-2) GW: 192.168.88.1 SN: 255.255.255.0 duration: 51 ms
7894: EVENT: WiFi#Connected
7992: ACT : publish /Alarm/ESPT2/WiFistatus,Connected
8773: Webserver: start
8785: Dummy: value 1: 0.00
8785: Dummy: value 2: 0.00
8785: Dummy: value 3: 0.00
8785: Dummy: value 4: 0.00
8790: EVENT: Relay2#r5=0.00
9657: EVENT: Relay2#r6=0.00
10528: EVENT: Relay2#r7=0.00
11398: EVENT: Relay2#r8=0.00
12593: SYS : 0.00
12598: EVENT: SysInfo#UptimeDays=0.00
Adding more info: in the router's log there is no sign of disconnection. So it's connected but it's not reacting.
Hi Gijs, I am testing a different router. I swapped the Mikrotik for a Linksys AC1200. The disconnections seem to have disappeared, so maybe it's a protocol-related problem. How can it be analyzed further? Mikrotik is a leader in router OSes; their routers are used in data centers by telecom operators...
Now I see reboots due to hardware watchdog, but that's another problem.
During the last 2 days, I got no more disconnections. It's definitely a network issue between the router and ESP. How can this be traced further?
Maybe the beacon interval is not constant on the Mikrotik? Is there some setting regarding the beacon interval in that accesspoint?
I run Mikrotiks at home and I have not noticed any connection issues, but I really have not tweaked any of the bazillion options in the WiFi sections, let alone the rest of the router (https://wiki.mikrotik.com/wiki/Manual:Interface/Wireless)
I would vote that it's not the Mikrotik per se, but if you have changed any of the options, that could also be a possible problem.
@LeeNX : Since I swapped the Mikrotik with the Linksys, all problems have disappeared. AFAIK I haven't changed the default settings. Reading the Mikrotik forums, there seem to be issues with power-saving devices connecting to Mikrotik.
@TD-er : beacon interval and DTIM are not configurable on Mikrotik routers. On the Linksys, the values are: beacon interval = 100ms and DTIM=2. What DTIM and beacon interval values are compatible with the ESP8266? Also, as far as you know, is the ESP8266 using some power saving scheme? If so, is it possible to change its settings?
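For reference, the spacing between DTIM beacons follows directly from those two settings. The sketch below assumes that the "100ms" shown by the Linksys is really 100 time units (in 802.11 one TU is 1024 microseconds); `dtim_interval_ms` is a made-up helper name.

```python
TU_MICROSECONDS = 1024  # one 802.11 time unit (TU)


def dtim_interval_ms(beacon_interval_tu, dtim_period):
    """Milliseconds between DTIM beacons.

    beacon_interval_tu: beacon interval in TU (assumed to be what the router UI shows).
    dtim_period: number of beacons per DTIM beacon.
    """
    return beacon_interval_tu * TU_MICROSECONDS * dtim_period / 1000.0


# Linksys values quoted above: beacon interval 100, DTIM 2.
print(dtim_interval_ms(100, 2))
```

A station in power-save mode that only wakes for DTIM beacons depends on the access point keeping this spacing steady, which is why an irregular beacon interval would matter here.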
Hi, @LeeNX, which controller do you use (MQTT, Openhab, Domoticz, etc.)? And which plugins?
I use MQTT Openhab (V2.3).
Yes, the ESP8266 is using some power mode, which makes the consistency of the beacon interval more critical. I can try to add some checkbox to disable this power save mode.
@TD-er : good idea. It would be interesting to see whether some of the disconnection issues or hardware watchdog resets go away when the power saving mode is disabled.
@giig1967g - I am using the OpenHAB MQTT controller, sending data to Mosquitto, for an I2C BMP280 sensor/plugin. It only reports temp/pressure, WiFi RSSI, uptime and system load for a NodeMCU running ESP_Easy_mega-20180714_normal_ESP8266_4096.bin
Pretty old and stable. Have not noticed any data drop out from the weather app for pressure.
As a note, my WiFi RSSI is -64 dBm, and the unit has been connected for 8 days according to the system info.
Not sure if any of that helps?
@LeeNX : please find attached my Mikrotik configuration for the wireless and DHCP server. Can you share yours so I can check if there are any differences? I will try to set up a unit with only a BMP sensor (and without the switch and pcf8574 plugins) in order to match your setup. miktorik.pdf miktorik2.pdf
Hi all, I am testing the latest versions (August versions) and found the following problem: after a certain period of time (it could be a few hours or 3 days) the unit is not accessible through WiFi, doesn't respond to ping and doesn't send MQTT updates, but local switches work and rules work too.
I have the same issue: Thomson router from UPC cable TV, ESP8266 with 4MB
btw. For a year, it was Tasmota software on this module, there were no problems with wifi
Same problem as I was having. We suspect it's a power saving feature. As soon as a version with power saving disabled is ready, we will test to see if this is the reason.
Hi all, with the latest versions (20181002) I am also still experiencing this problem when connected to a Mikrotik router. After some time (this time it was 10 hours) the unit stopped responding and froze, but rules and buttons were still working. As soon as I rebooted the router, the unit rebooted too and was accessible again. Exactly the same symptoms and diagnosis as before.
@TD-er: is it feasible to add the power saving settings to see if this solves the issue?
@giig1967g I already planned to move WiFi Man to the main repo, but I couldn't find an issue for it. So I just made #1859 I plan to add it in the next days/week.
excellent thanks!
Hi all, I am testing the latest versions (August versions) and found the following problem: after a certain period of time (it could be a few hours or 3 days) the unit is not accessible through WiFi, doesn't respond to ping and doesn't send MQTT updates, but local switches work and rules work too.
In other words, it seems to be working "locally" but cannot communicate with the rest of the world. Also its own internal AP is not transmitting, so it seems to be stuck in some sort of WiFi connection loop. Serial log is disabled (I have seen reports of problems when leaving it active).
One thing that could help finding the issue is that if I reboot the router, the unit reboots.
My router is a Mikrotik. Nothing in the router log about any disconnection. I am using ESP8266, 4M, static IP, OpenHAB controller, switch plugin and dummy plugin. Nothing else.
One question: what is "Client IP" on the info page? It differs from the static IP that I set and is something that I have not defined (see screenshot).
thanks