letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com

One WEMOS device crashes keeps rebooting with 20193005 #2393

Closed berbergh closed 4 years ago

berbergh commented 5 years ago

Summary of the problem/feature request

I have several WEMOS D1 mini devices. They all work fine with ESP_Easy_mega-20190311_test_core_242_ESP8266_4M.bin, except for one. It simply keeps rebooting with ESP_Easy_mega-20190305_test_core_242_ESP8266_4M.bin and later.

ESP_Easy_mega-20190227_test_core_242_ESP8266_4M.bin works okay.

Expected behavior

Update should work fine

Actual behavior

Device keeps rebooting

Steps to reproduce

  1. have ESP_Easy_mega-20190227_test_core_242_ESP8266_4M.bin
  2. upload ESP_Easy_mega-20190311_test_core_242_ESP8266_4M.bin

System configuration

Hardware: WeMos D1 with MHZ19 and OLED

ESP Easy version: ESP_Easy_mega-20190305_test_core_242_ESP8266_4M.bin

ESP Easy settings/screenshots:

Rules or log data


wdt reset
load 0x4010f000, len 1384, room 16 
tail 8
chksum 0x2d
csum 0x2d
vbb28d4a3
~ld
5⸮U80 : 

INIT : Booting version: mega-20190311 (ESP82xx Core 2_4_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3 PUYA support)
81 : INIT : Free RAM:25304
81 : INIT : Warm boot #32 - Restart Reason: Hardware Watchdog
84 : FS   : Mounting...
109 : FS   : Mount successful, used 76053 bytes of 957314
513 : CRC  : program checksum       ...OK

 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x4010f000, len 1384, room 16 
tail 8
chksum 0x2d

berbergh commented 5 years ago

Now that is interesting. I let one of the GPIO ports pulse for some 30 seconds. It seems that this blocks the entire device?

on System#Boot do
  gpio,16,0
  timerSet,1,60
endon

On Rules#Timer=1 do
  SendToHTTP URL,8080,/json.htm?type=command&param=switchlight&idx=531&switchcmd=Toggle
  Pulse,16,1,30000
  SendToHTTP URL,8080,/json.htm?type=command&param=switchlight&idx=531&switchcmd=Toggle
  timerSet,1,600
endon

on CO2#PPM do
  sendtohttp URL,8080,/json.htm?param=udevice&type=command&idx=624&nvalue=[CO2#PPM]&svalue=0
endon

TD-er commented 5 years ago

Yep, see the difference explained here

This pulse may yield strange behavior. I am also not sure whether it allows network traffic to be handled while it runs. I would split it into 2 parts, using an extra timer to enter the 2nd part of the sequence.
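
A minimal sketch of such a split, reusing the rules posted above. It assumes timer 2 is not used elsewhere and that the build accepts `//` comments in rules; `URL`, the port and the `idx` values are the placeholders from the original rules. Instead of the 30-second `Pulse`, the pin is switched on in the first part and switched off again when the second timer fires:

```
on System#Boot do
  gpio,16,0
  timerSet,1,60
endon

// Part 1: switch the pin on and start timer 2 (assumed free) instead of using Pulse
on Rules#Timer=1 do
  SendToHTTP URL,8080,/json.htm?type=command&param=switchlight&idx=531&switchcmd=Toggle
  gpio,16,1
  timerSet,2,30
endon

// Part 2: runs 30 seconds later, switches the pin off and re-arms timer 1
on Rules#Timer=2 do
  gpio,16,0
  SendToHTTP URL,8080,/json.htm?type=command&param=switchlight&idx=531&switchcmd=Toggle
  timerSet,1,600
endon
```

With this layout no single rule block has to wait for the 30 seconds to pass, so the rest of the device can keep running between the two timer events.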

berbergh commented 5 years ago

Okay ... I missed out on that. Still a bit weird though. At the time, there were problems with LongPulse, but that was a long time ago.

Anyhow, I changed the Rules.

TD-er commented 5 years ago

Maybe also have a look at the timeout set in the controllers. For local networks and based on your timing stats, I would say a 100 msec timeout is more than enough.

berbergh commented 5 years ago

Do you mean Client Timeout?

TD-er commented 5 years ago

Yep, the bottom one in this screenshot: image

You may also try to lower the Minimum Send Interval if the host is able to keep up.

berbergh commented 5 years ago

Last night, Nextion crashed. I will flash it with your second upload.

berbergh commented 5 years ago

Okay, I think something is wrong with LongPulse. When I send http://URL/control?cmd=longpulse,12,1,5 to a device, the GPIO pin stays 1 forever. On the other hand, when I send http://URL/control?cmd=pulse,12,1,5000, I get a beautiful 5 second 1 on that pin. That was the reason I used the Pulse command in the rules in the first place.

This same behaviour can be observed in the Rules.

I followed your recommendation and added another timer.

berbergh commented 5 years ago

Nextion also crashed with the latest upload, three times now.

TD-er commented 5 years ago

And the timing stats of the Nextion?

berbergh commented 5 years ago

| Description | Function | #calls | call/sec | min (ms) | Avg (ms) | max (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| P_37_Generic - MQTT Import | ONCE_A_SECOND | 640 | 2.00 | 0.013 | 0.041 | 0.096 |
| P_37_Generic - MQTT Import | TEN_PER_SECOND | 6388 | 19.96 | 0.028 | 0.142 | 12.976 |
| P_37_Generic - MQTT Import | WRITE | 20 | 0.06 | 0.010 | 0.021 | 0.039 |
| P_37_Generic - MQTT Import | FIFTY_PER_SECOND | 31538 | 98.55 | 0.010 | 0.011 | 0.058 |
| P_75_Display - Nextion [TESTING] | READ | 6 | 0.02 | 93.448 | 95.203 | 98.204 |
| P_75_Display - Nextion [TESTING] | ONCE_A_SECOND | 320 | 1.00 | 0.030 | 0.033 | 0.062 |
| P_75_Display - Nextion [TESTING] | TEN_PER_SECOND | 3194 | 9.98 | 0.108 | 0.156 | 0.210 |
| P_75_Display - Nextion [TESTING] | WRITE | 10 | 0.03 | 0.696 | 3.459 | 6.699 |
| P_75_Display - Nextion [TESTING] | FIFTY_PER_SECOND | 15769 | 49.27 | 0.026 | 0.039 | 0.080 |
| C_2_Domoticz MQTT | CPLUGIN_PROTOCOL_RECV | 621 | 1.94 | 0.513 | 0.807 | 1.022 |
| Load File | | 310 | 0.97 | 1.228 | 1.705 | 3.516 |
| Loop | | 986069 | 3081.21 | 0.247 | 0.319 | 269.732 |
| Plugin call 50 p/s | | 15769 | 49.27 | 1.412 | 1.585 | 4.974 |
| Plugin call 10 p/s | | 3194 | 9.98 | 1.653 | 2.011 | 24.064 |
| Plugin call 10 p/s U | | 3194 | 9.98 | 0.052 | 0.061 | 0.116 |
| Plugin call 1 p/s | | 320 | 1.00 | 1.953 | 5.506 | 136.517 |
| SensorSendTask() | | 6 | 0.02 | 96.593 | 100.936 | 104.984 |
| setNewTimerAt() | | 20656 | 64.54 | 0.136 | 0.152 | 0.277 |
| hostByName() | | 5 | 0.02 | 0.120 | 0.653 | 2.572 |
| connectClient() | | 5 | 0.02 | 3.829 | 5.323 | 7.701 |
| LoadCustomTaskSettings() | | 66 | 0.21 | 3.812 | 3.959 | 5.708 |
| WiFi.isConnected() | | 2943578 | 9197.91 | 0.018 | 0.025 | 0.200 |
| LoadTaskSettings() | | 244 | 0.76 | 3.426 | 4.080 | 8.159 |
| TryOpenFile() | | 396 | 1.24 | 0.298 | 0.637 | 2.357 |
| rulesProcessing() | | 43 | 0.13 | 78.489 | 87.463 | 134.020 |
| sendGratuitousARP() | | 64 | 0.20 | 0.436 | 0.592 | 0.683 |
| backgroundtasks() | | 1952944 | 6102.44 | 0.053 | 0.070 | 269.100 |
| handle_schedule() idle | | 965413 | 3016.66 | 0.095 | 0.115 | 269.186 |
| handle_schedule() task | | 20656 | 64.54 | 0.281 | 2.008 | 136.752 |

Start Period: 2019-08-25 20:56:26
Local Time: 2019-08-25 21:01:46
Time span: 320.03 sec

TD-er commented 5 years ago

The only thing I see is that the Nextion plugin needs about 100 msec every time it is called (which is not that often). The average time for rules processing is 84 msec. But nothing else stands out here, so I think the timing stats do not indicate anything really wrong.

berbergh commented 5 years ago

The Nextion plugin is not doing very much, only some updating. The rules contain a small timer that sends a JSON string to Domoticz once a minute.

berbergh commented 5 years ago

Status update.

Build:⋄ 20103 - Mega
System Libraries:⋄ ESP82xx Core 2_5_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.1.2 PUYA support
Git Build:⋄ mega-20190827
Plugins:⋄ 80 [Normal] [Testing]
Build Time:⋄ Aug 27 2019 02:24:15
Binary Filename:⋄ ESP_Easy_mega-20190827_test_core_252_ESP8266_4M.bin

This build has been running for more than 3 days on several devices without a flaw, unless I also activate MQTT.

TD-er commented 5 years ago

The latest build with core 2.6.0 (SDK 2.2.2) should be better with MQTT disconnects according to the guys from Tasmota.

berbergh commented 5 years ago

Hi, I am still a bit puzzled about the best core to use. Why all these flavours, why the various SDKs? The answer is probably too technical for me, but what can you recommend?

TD-er commented 5 years ago

Well, the main reason for all these core versions is that I don't know either.

Core 2.6.0 SDK 3 should not be used right now, since there were lots of reported issues (among them are a few from me) and that's why they backported SDK 2.2.2 to be used with core 2.6.0. Core 2.6.0 has made quite some steps in the right direction, but as always the non-released core versions may become unstable due to lack of testing.

Core 2.5.2 is one of the most stable ones I think. Core 2.4.2 is the least stable. Core 2.4.1 may also be one to try, since that one is using a different scheduler in the core version.

Core 2.3.0 was one of the more stable ones, and I regret a lot that we left it behind for some major speed improvements. It is currently impossible to build ESPEasy with that core version (I tried it a few times; you can build one, but it will crash).

uzi18 commented 5 years ago

Is it possible for you to disable rules and check if there is any change in behaviour?

berbergh commented 5 years ago

Is it possible for you to disable rules and check if there is any change in behaviour?

Yes, no problem. Unfortunately, with the suggested 2.6.0 SDK, my Nextion Wemos crashed within minutes ...

@TD-er, thanks.

berbergh commented 5 years ago

Well, disabling Rules in the Advanced settings hardly changes anything.

Update: even without MQTT, the device hangs.

berbergh commented 5 years ago

Does one have to erase memory and re-install devices after migrating from 2.5.2 to 2.6.0?

TD-er commented 5 years ago

That should not be necessary. But when in doubt you can backup your current files and try.

berbergh commented 5 years ago

Just an update. The Nextion WEMOS has been running the mega-20190830 2.5.2 core build for the last 5 days, without MQTT, without a problem. All other devices, except for one, run the same firmware. The one exception has been running the mega-20190830 260/222 firmware, also for 5 days. It is a Wemos with only a DHT22. When I enable MQTT, it crashes within an hour. Another WEMOS with a DS18b20 and a DHT22 crashed within the hour on 260/222 even without MQTT. It now runs on 252.

berbergh commented 4 years ago

So far, all devices are stable with ESP82xx Core 2_5_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.1.2 PUYA support.

Only the one with MQTT keeps crashing. I am starting to believe the original problem is no longer a problem. I just do not understand why I have a Wemos with MQTT that keeps crashing, while so many other users seem to have no issues at all.

TD-er commented 4 years ago

MQTT does generate a lot more network traffic, meaning that if the crashes are network related, the devices running MQTT have a higher chance of crashing.

Just make sure WiFi no sleep is checked and Gratuitous ARP is checked. And you can also try the last build (really one of last night or tomorrow's build) based on core 2.6.0. Yesterday some fix was merged to core 2.6.0 which doesn't even need Gratuitous ARP anymore.

uzi18 commented 4 years ago

@berbergh did you try to completely erase the flash and then flash the tested firmware once more?

berbergh commented 4 years ago

@TD-er what version should I then use? Now I use

Build:⋄ 20104 - Mega
System Libraries:⋄ ESP82xx Core 2_5_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.1.2 PUYA support
Git Build:⋄ mega-20190926
Plugins:⋄ 78 [Normal] [Testing]
Build Md5: 6ca0fb263874c1c7a5995cd1f797c1d
Md5 check: passed.
Build Time:⋄ Sep 26 2019 02:30:38
Binary Filename:⋄ ESP_Easy_mega-20190926_test_ESP8266_4M_VCC.bin

berbergh commented 4 years ago

@uzi18 No, not yet. I will try this later.

TD-er commented 4 years ago

You're now using core 2.5.2, so take one with core 2.6.0 in the name.

berbergh commented 4 years ago

@TD-er I am sorry to say that there aren't any files bearing 2.6.0 in their name in the bin section of Release mega-20190926.zip

TD-er commented 4 years ago

I will merge my bugfixes for the build I made yesterday, so there will be core 2.6.0 in tomorrow's build.