letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com
Other
3.26k stars 2.21k forks source link

Reset via serial or Command via Tools - Broken #1351

Closed Oxyandy closed 6 years ago

Oxyandy commented 6 years ago

Summarize of the problem/feature request

ESP_Easy_mega-20180502_normal_ESP8266_1024.bin Reset via serial or Command (on tools page) does nothing

Expected behavior

Reset device to default

TD-er commented 6 years ago

I've noticed that also yesterday. Not one command worked. On my node it was during continuous reconnect to wifi, so I thought that may caused it. Was your node also busy, or just regular activity?

Oxyandy commented 6 years ago

Not busy at all, I have had several crashes & reboots with latest firmware :( Didn't want to say anything.. but there said.. When seemly idle, no rules, no timers.. just a switch, controller & LED and me doing other things... Don't ask about it, it will come once I have time to inspect it carefully

workgroupengineering commented 6 years ago

I think I have the same issue. All command and rule as ignored. The latest firmware not affected by this issue is mega-20180501

TD-er commented 6 years ago

OK, then we'll see this evening what was merged yesterday that may have broken these rules and commands.

Oxyandy commented 6 years ago

I am sure this "Issue" as above is an easy fix & as I closed the other one, and this relates to 0502 thought I'd dump this snippet here - Seeing a few 'recoverable' LmacRxBlk:1 errors.. never seen these before 0502. Note: a few too many EVENT: MQTT#Disconnected showing too, not used to seeing those either.

00:48:27: 751020 : WD   : Uptime 13 ConnectFailures 0 FreeMem 17120
00:48:57: 781020 : WD   : Uptime 13 ConnectFailures 0 FreeMem 17120
00:49:03: 787279 : EVENT: Clock#Time=Thu,00:49
787281 : EVENT: Clock#Time=Thu,00:49 Processing time:3 milliSeconds
00:49:27: 811020 : WD   : Uptime 14 ConnectFailures 0 FreeMem 15520
00:49:34: 818582 : MQTT : Connection lost
818582 : EVENT: MQTT#Disconnected
818584 : EVENT: MQTT#Disconnected Processing time:2 milliSeconds
LmacRxBlk:1
00:49:36: LmacRxBlk:1
LmacRxBlk:1
00:49:38: LmacRxBlk:1
823595 : MQTT : Failed to connect to broker
00:49:39: 823640 : MQTT : Connected to broker with client ID: ESPClient_5C:CF:7F:72:96:EC
823641 : Subscribed to: domoticz/out
823642 : EVENT: MQTT#Connected
823644 : EVENT: MQTT#Connected Processing time:2 milliSeconds
00:49:57: 841020 : WD   : Uptime 14 ConnectFailures 4 FreeMem 17064
workgroupengineering commented 6 years ago

I try revert this commits: [LWIP 1.4.0] Change to LWIP 1.4.0, to allow large http POST messages [LWIP] fix ESP32 build the command are back to work properly except reboot, while for the rules I have a strange behavior, ignore the elseif statement.

ghtester commented 6 years ago

I have the same issue but some rule commands works (TSOP4838 events), commands based on timer does not work.

BTW. when I turn on / reboot the device, I see a boot loop due to wdt until a stable WiFi connection is finally established (usually many attempts and as I have set a beep on boot, it's a bit annoying :-) ):

ets Jan 8 2013,rst cause:4, boot mode:(3,6)

wdt reset load 0x4010f000, len 1384, room 16 tail 8 chksum 0x2d csum 0x2d v614f7c32

Is that expected behavior or an issue please? To be tested on another device, this one has 2 APs configured and the primary AP is turned off.

Oxyandy commented 6 years ago

I've been getting various reboots while Node has been idle, included a few log lines before the reboot... NTP related ???

rst cause:4, boot mode:(3,7) wdt reset load 0x4010f000, len 1384, room 16 tail 8 chksum 0x2d csum 0x2d v614f7c32 ~ld ªU102 :

22:25:14: LmacRxBlk:1
ping 1, timeout 1, total payload 0 bytes, 999 ms
LmacRxBlk:1
22:25:15: ping 1, timeout 1, total payload 0 bytes, 949 ms
LmacRxBlk:1
ping 1, timeout 1, total payload 0 bytes, 949 ms
22:25:16: 18568419 : Host unreachable: 0.0.0.0
18568420 : Wifi  : Detected strange behavior, reconnect wifi.
state: 5 -> 0 (0)
rm 0
18568421 : WIFI  : Arduino wifi status: WL_IDLE_STATUS ESPeasy internal wifi status: ESPEASY_WIFI_SERVICES_INITIALIZED
18568429 : NTP  : NTP host au.pool.ntp.org (0.0.0.0) unreachable
18568436 : EVENT: Clock#Time=Wed,22:25
18568456 : EVENT: WiFi#Disconnected
18568458 : WIFI : Disconnected! Reason: '(8) Assoc leave' Connected for 14 s
18568467 : WIFI : Connecting MAD_IOT attempt #0
18568468 : IP   : Static IP : 192.168.0.225 GW: 192.168.0.254 SN: 255.255.255.0 DNS: 8.8.8.8
18568640 : WD   : Uptime 309 ConnectFailures 4 FreeMem 8912
scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 1
cnt 

connected with MAD_IOT, channel 13
ip:192.168.0.225,mask:255.255.255.0,gw:192.168.0.254
18569277 : WIFI : Connected! AP: MAD_IOT (F4:F2:6D:25:84:C6) Ch: 13 Duration: 807 ms
18569279 : IP   : Static IP : 192.168.0.225 GW: 192.168.0.254 SN: 255.255.255.0 DNS: 8.8.8.8
18569288 : WIFI : Static IP: 192.168.0.225 (ESP-Easy-0) GW: 192.168.0.254 SN: 255.255.255.0   duration: 11 ms
18569308 : EVENT: WiFi#Connected
18569311 : WIFI  : Arduino wifi status: WL_CONNECTED ESPeasy internal wifi status: ESPEASY_WIFI_SERVICES_INITIALIZED
22:25:20: LmacRxBlk:1
LmacRxBlk:1
22:25:22: LmacRxBlk:1
18574417 : MQTT : Failed to connect to broker
LmacRxBlk:1
22:25:24: LmacRxBlk:1
LmacRxBlk:1
22:25:26: LmacRxBlk:1
22:25:27: LmacRxBlk:1
LmacRxBlk:1
22:25:29: LmacRxBlk:1
ping 1, timeout 1, total payload 0 bytes, 1024 ms
LmacRxBlk:1
22:25:30: ping 1, timeout 1, total payload 0 bytes, 974 ms
22:25:32: LmacRxBlk:1
ping 1, timeout 1, total payload 0 bytes, 949 ms
18583568 : Host unreachable: 0.0.0.0
18583569 : Wifi  : Detected strange behavior, reconnect wifi.
state: 5 -> 0 (0)
rm 0
18583570 : WIFI  : Arduino wifi status: WL_IDLE_STATUS ESPeasy internal wifi status: ESPEASY_WIFI_SERVICES_INITIALIZED
18583578 : NTP  : NTP host au.pool.ntp.org (0.0.0.0) unreachable
18583598 : EVENT: WiFi#Disconnected
18583601 : WIFI : Disconnected! Reason: '(8) Assoc leave' Connected for 14 s
18583608 : WIFI : Connecting MAD_IOT attempt #0
18583609 : IP   : Static IP : 192.168.0.225 GW: 192.168.0.254 SN: 255.255.255.0 DNS: 8.8.8.8
scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 1
cnt 
22:25:40: 
 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x4010f000, len 1384, room 16 
tail 8
chksum 0x2d
csum 0x2d
v614f7c32
~ld
ªU102 : 

INIT : Booting version: mega-20180502 (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 1.4.0-RC2)
102 : INIT : Warm boot #1
4295585 : WD   : Uptime 16 ConnectFailures 0 FreeMem 15616
22:41:50: 4296872 : NTP  : NTP host au.pool.ntp.org (103.38.120.36) queried
4296924 : NTP  : NTP replied: 51 mSec
4296925 : EVENT: Time#Set
4296927 : EVENT: Time#Set Processing time:2 milliSeconds
ping 1, timeout 0, total payload 32 bytes, 1045 ms
22:41:56: 
 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x4010f000, len 1384, room 16 
tail 8
chksum 0x2d
csum 0x2d
v614f7c32
~ld
ªU102 : 

INIT : Booting version: mega-20180502 (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 1.4.0-RC2)
102 : INIT : Warm boot #2
22:41:50: 4296872 : NTP  : NTP host au.pool.ntp.org (103.38.120.36) queried
4296924 : NTP  : NTP replied: 51 mSec
4296925 : EVENT: Time#Set
4296927 : EVENT: Time#Set Processing time:2 milliSeconds
ping 1, timeout 0, total payload 32 bytes, 1045 ms
22:41:56: 
 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x4010f000, len 1384, room 16 
tail 8
chksum 0x2d
csum 0x2d
v614f7c32
~ld
ªU102 : 

INIT : Booting version: mega-20180502 (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 1.4.0-RC2)
TD-er commented 6 years ago

This sounds a bit like the timing issues I tried to fix. Reverting back to LWIP 2.0 will cause a lot of other issues, so that's better to leave at 1.4 And when you change that one, you have to do a clean build, or else you'll end up with something that may let you draw incorrect conclusions.

My daughter is at bed now, so I'll have the evening for debugging.

TD-er commented 6 years ago

This change broke the rules: https://github.com/letscontrolit/ESPEasy/commit/5bc2af0e4ceb9f9ba0642c5b95a11c4ab105e8c5

Whenever a plugin_write returns success = true, the execute command will no longer be called. It doesn't matter whether the plugin is enabled or present. It should only return success = true when the command was actually meant for this plugin.

TD-er commented 6 years ago

@Oxyandy Can you please try also to set the LWIP parameter in the PlatformIO.ini to the LWIP2_LOW_MEMORY ? Like this:

[common]
build_flags      = -D BUILD_GIT='"${env.TRAVIS_TAG}"'  ; ${compiler_warnings.build_flags}
                   -D PIO_FRAMEWORK_ARDUINO_LWIP2_LOW_MEMORY

The LWIP2_HIGH_BANDWIDTH is causing all kinds of issues like we've seen the last few days. Build 20180502 was using LWIP 1.4 (LWIP_HIGH_BANDWIDTH) I have done some tests here and the LWIP2_LOW_MEMORY at least does not mangle the rules when saving them and does appear to work fine, as far as I can tell by testing on my nodes that don't experience a lot of the reported issues :(

Oxyandy commented 6 years ago
INIT : Booting version:  (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3)
102 : INIT : Cold Boot
104 : FS   : Mounting...
109 : FS   : Mount successful, used 75802 bytes of 113201
112 : CRC  : No program memory checksum found. Check output of crc2.py
120 : CRC  : SecuritySettings CRC   ...OK 
135 : INIT : Free RAM:22872
136 : INIT : I2C
136 : INIT : SPI not enabled
151 : INFO : Plugins: 47 [Normal] (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3)
152 : EVENT: System#Wake
155 : WIFI : Set WiFi to STA
mode : sta(5c:cf:7f:72:96:ec)
add if0
07:24:04: 188 : WIFI : Connecting MAD_MOB attempt #0
201 : EVENT: System#Boot
205 : SW   : Switch state 1 Output value 1
207 : EVENT: #Switch=1.00
1020 : WD   : Uptime 0 ConnectFailures 0 FreeMem 22128
07:24:07: scandone
3049 : EVENT: WiFi#Disconnected
3052 : WIFI : Disconnected! Reason: '(201) No AP found' Connected for 2861 ms
3205 : WIFI : Connecting MAD_MOB attempt #1
07:24:10: scandone
6070 : EVENT: WiFi#Disconnected
6072 : WIFI : Disconnected! Reason: '(201) No AP found' Connected for 2864 ms
6205 : WIFI : Connecting MAD_MOB attempt #2
.
.
126206 : WIFI : Connecting MAD_MOB attempt #26
07:26:13: scandone
129071 : EVENT: WiFi#Disconnected
129074 : WIFI : Disconnected! Reason: '(201) No AP found' Connected for 2865 ms
TD-er commented 6 years ago

OK, that's a no-go. Thanks for testing :)

Oxyandy commented 6 years ago

How frustrating for you :( WIFI : Connecting MAD_MOB attempt #148 Tried cold booting etc

TD-er commented 6 years ago

Not really, now I know wat went wrong the last couple of days and I like computer stuff to be a bit predictable :)

Oxyandy commented 6 years ago

Before I compiled I did a fetch.. just now I noticed 2 updated files Command.ino & _P059_Encoder.ino Should they have been included..? meh, doing clean build

Oxyandy commented 6 years ago
104 : INIT : Cold Boot
105 : FS   : Mounting...
111 : FS   : Mount successful, used 75802 bytes of 113201
114 : CRC  : No program memory checksum found. Check output of crc2.py
121 : CRC  : SecuritySettings CRC   ...OK 
137 : INIT : Free RAM:22872
137 : INIT : I2C
137 : INIT : SPI not enabled
153 : INFO : Plugins: 47 [Normal] (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3)
153 : EVENT: System#Wake
156 : WIFI : Set WiFi to STA
mode : sta(5c:cf:7f:72:96:ec)
add if0
189 : WIFI : Connecting MAD_MOB attempt #0
08:23:02: 202 : EVENT: System#Boot
206 : SW   : Switch state 1 Output value 1
208 : EVENT: #Switch=1.00
1020 : WD   : Uptime 0 ConnectFailures 0 FreeMem 22128
08:23:05: scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 1
cnt 

connected with MAD_MOB, channel 7
dhcp client start...
3988 : WIFI : Connected! AP: MAD_MOB (18:90:D8:AC:0F:D8) Ch: 7 Duration: 3798 ms
3989 : EVENT: WiFi#ChangedAccesspoint
08:23:06: ip:192.168.0.103,mask:255.255.255.0,gw:192.168.0.254
4023 : WIFI : DHCP IP: 192.168.0.103 (ESP-Easy-0) GW: 192.168.0.254 SN: 255.255.255.0   duration: 34 ms
4034 : EVENT: WiFi#Connected
4039 : Webserver: start
4040 : WIFI  : Arduino wifi status: WL_CONNECTED ESPeasy internal wifi status: ESPEASY_WIFI_SERVICES_INITIALIZED
4135 : MQTT : Intentional reconnect
4184 : MQTT : Connected to broker with client ID: ESPClient_5C:CF:7F:72:96:EC
4184 : Subscribed to: domoticz/out
4186 : EVENT: MQTT#Connected
4189 : EVENT: MQTT#Connected Processing time:3 milliSeconds
4312 : NTP  : NTP host au.pool.ntp.org (103.38.120.36) queried
4364 : NTP  : NTP replied: 51 mSec
4366 : Current Time Zone: STD time start: 2018-10-28 01:00:00 offset: 600 min
4366 : EVENT: Time#Initialized
4368 : EVENT: Time#Initialized Processing time:2 milliSeconds
4373 : EVENT: Clock#Time=Thu,08:23
4376 : EVENT: Clock#Time=Thu,08:23 Processing time:3 milliSeconds
08:23:07: ping 1, timeout 0, total payload 32 bytes, 1046 ms
08:23:16: pm open,type:2 0
08:23:33: 31020 : WD   : Uptime 1 ConnectFailures 0 FreeMem 13200
Oxyandy commented 6 years ago

Connecting quickly 5 cold boots looks like doing a fetch in Github Desktop is delayed, versus realtime github server content ????

TD-er commented 6 years ago

Sigh...... ;) I guess we should experiment with both LWIP2 and 1.4 but leave the LWIP2.0 high bandwidth alone for a while....

And when switching between LWIP versions, you should do a clean build.

Oxyandy commented 6 years ago

Yeah I only build - normal 1024, so always do clean (should, habits will improve) [SUCCESS] Took 92.36 seconds

TD-er commented 6 years ago

There is something fishy when changing between LWIP versions. I've seen similar things here, that a second build (which was then after a clean) does behave different compared to the first compile.

Oxyandy commented 6 years ago

So I have tested and serial commands now work fine.. I have technically moved on past 0502, so Closed ? Any other issue that remains in next build, will be reported accordingly. The new build I compiled with your instruction has been working well, no more LmacRxBlk:1 errors in log, no reboots & uptime is stable. Tried the F5 abuse test - One word, PASSED (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3) A good solid 3-4 minutes worth on devices page. After that I got to try: Today's Release mega-20180503 (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 1.4.0-RC2) I was able to see LmacRxBlk:1 in log quickly when doing test, not much more to say - as limited time..

ghtester commented 6 years ago

Unfortunately the latest build 20180503 does not work for me - the ESP web does not open , instead it stops responding in serial console. Settings reset to defaults and set only WifiSSID and key through serial console. PING works. No reboots, no error message.

Oxyandy commented 6 years ago

Ok @ghtester - I had better try issue 'reset' command - Try this build from scratch.. Do you self compile or is this release ? Thanks

Oxyandy commented 6 years ago

@ghtester Yes, confirmed I took 2 tries to set SSID & PASS via http://192.168.4.1, after that http://192.168.0.103/config would never load Serial froze, I walked away After 5 minutes serial spat out 372381 : WIFI : Disconnected! Reason: '(3) Auth leave' Connected for 5 m 2 s And became alive again on serial console

TD-er commented 6 years ago

This is with LWIP 1.4? Thus the latest build?

Oxyandy commented 6 years ago

Yes @TD-er this is downloaded today mega-20180503 (ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 1.4.0-RC2) ESP_Easy_mega-20180502_normal_ESP8266_1024.bin

Oxyandy commented 6 years ago

ESP82xx Core 2_4_1, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3 was excellent, as I commented above, why LWIP 1.4, just to test... ? Is this the home of LWIP 1.4 ?

ghtester commented 6 years ago

@Oxyandy Yes that's what I did - reset settings to defaults through serial console. The official release mega-20180503 4096 dev, I don't build my own.

TD-er commented 6 years ago

@Oxyandy It was just that I did not want to change too much at once.