Closed: berbergh closed this issue 6 years ago
I've just tested 20180908 on a NodeMCU, updating from a previous version. No sensors or actuators connected (blank config).
Software reboot, reset button and power toggle all boot without any issue.
I'd suggest disconnecting the items, disabling the tasks and then trying to boot. Then reactivate the items one by one to see which one causes the boot loop.
@ShardanX What you suggest is exactly what I did. Please see Steps to reproduce. That is why I think MQTT causes the problem, but only during boot-up after installing 20180908.
Have you also tried the nightly build? In the start post you mention using Arduino 1.8.5.
I really don't get it. For about a month it was tested with numerous test builds and over 100 posts on the pull request before it was merged, and on the first day after the merge there are reports of infinite boot loops.
Same here (self-compiled), with a boot loop and Exception (28). Curious: only 1 of 16 nodes does this!? After a whole day of searching for the cause, it came down to this change: [Watchdog] Add watchdog feed to backgroundtasks() function
After removing it, all is ok again. But I must say, I did not implement that new controller stuff, because I have no issues with my controllers.... ;-)
@v-a-d-e-r Thanks for noticing this. I will remove that watchdog feed.
I reverted the mentioned change, so hopefully it is now working stable again.
I will test it tomorrow, but how does this relate to MQTT?
On Sun 9 Sep 2018 at 00:02, Gijs Noorlander notifications@github.com wrote:
I reverted the mentioned change, so hopefully it is now working stable again.
Maybe it doesn't. You had concluded it may be related to MQTT. I see also Nextion in the logs and this issue is all about Nextion and Exception 28: https://github.com/letscontrolit/ESPEasy/issues/1643 So maybe it isn't MQTT related.
I use this build, now 20180909: https://github.com/letscontrolit/ESPEasy/releases I download the ESPEasy_mega-20180909.zip file, unpack it in my Arduino IDE environment and compile it. After that I load it onto the ESP hardware via serial USB.
This newest version gives the same error.
I do not understand why you think it has to do with Nextion.
This is what I find for exception 28: https://arduino-esp8266.readthedocs.io/en/latest/exception_causes.html
28 | LoadProhibitedCause | A load referenced a page mapped with an attribute that does not permit loads | Region Protection or MMU | Yes
But you probably already know that.
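For anyone reading such a dump by hand: the excvaddr field is the address the failing load tried to read. The sketch below pulls the fields out of one "Exception (N): ..." line and flags a very low excvaddr, which often (not always) points to a read through a null pointer plus a small offset. The names `ExceptionInfo`, `parseExceptionLine` and `looksLikeNullDereference`, and the 0x100 limit, are assumptions made for this illustration; none of them come from ESPEasy or the ESP8266 core.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical container for the fields of one "Exception (N): ..." line.
struct ExceptionInfo {
  int cause;
  std::uint32_t epc1;      // program counter at the time of the fault
  std::uint32_t excvaddr;  // address the faulting instruction accessed
  bool valid;              // true only if the whole line was parsed
};

// Parses a line as printed in this thread. The line must start with
// "Exception"; leading serial garbage has to be stripped by the caller.
ExceptionInfo parseExceptionLine(const std::string& line) {
  ExceptionInfo info = {-1, 0, 0, false};
  int cause = 0;
  unsigned int epc1 = 0, epc2 = 0, epc3 = 0, excvaddr = 0;
  // %x accepts an optional "0x" prefix, like strtoul with base 16.
  const int matched = std::sscanf(
      line.c_str(),
      "Exception (%d): epc1=%x epc2=%x epc3=%x excvaddr=%x",
      &cause, &epc1, &epc2, &epc3, &excvaddr);
  if (matched == 5) {
    info.cause = cause;
    info.epc1 = epc1;
    info.excvaddr = excvaddr;
    info.valid = true;
  }
  return info;
}

// Heuristic only: cause 28 (LoadProhibited) with a tiny fault address.
bool looksLikeNullDereference(const ExceptionInfo& info) {
  const std::uint32_t kLowAddressLimit = 0x100;  // assumed cut-off
  return info.valid && info.cause == 28 && info.excvaddr < kLowAddressLimit;
}
```

Fed the exception line from this issue (excvaddr=0x00000003), the heuristic fires. It only narrows the search; the epc1 value still has to go through a stack decoder to find the actual function.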
When the MQTT is disabled, uploading goes fine.
Below is the serial output after uploading 20180909 with MQTT disabled:
⸮U13300 : WIFI : Set WiFi to STA
13333 : WIFI : Connecting Spoon3 attempt #0
13333 : IP : Static IP : 192.168.0.76 GW: 192.168.0.1 SN: 255.255.255.0 DNS: 192.168.0.1
13446 : EVENT: System#Boot
13452 : ACT : Nextion,page3.bo.txt="---"
13479 : NEXTION075 : WRITE, Command is page3.bo.txt="---"
13483 : ACT : Nextion,page3.b1.txt="---"
13508 : NEXTION075 : WRITE, Command is page3.b1.txt="---"
13513 : ACT : Nextion,page3.b5.txt="---"
13538 : NEXTION075 : WRITE, Command is page3.b5.txt="---"
13592 : NEXTION075 : Cmd Statement Line-1 Sent: page0.j0.val=0
13626 : NEXTION075 : Cmd Statement Line-2 Sent: page0.vTime.txt="00:00:13"
13652 : NEXTION075 : Cmd Statement Line-3 Sent: page0.vTO.txt=""
13674 : NEXTION075 : Cmd Statement Line-4 Sent: page0.vHO.txt=""
13700 : NEXTION075 : Cmd Statement Line-5 Sent: page0.vSold.txt=""
13725 : NEXTION075 : Cmd Statement Line-6 Sent: page0.vSolt.txt=""
14902 : WD : Uptime 0 ConnectFailures 0 FreeMem 22616
17379 : WIFI : Connected! AP: Spoon3 (DC:EF:09:CD:67:93) Ch: 12 Duration: 4043 ms
17380 : EVENT: WiFi#ChangedAccesspoint
17424 : IP : Static IP : 192.168.0.76 GW: 192.168.0.1 SN: 255.255.255.0 DNS: 192.168.0.1
17426 : WIFI : Static IP: 192.168.0.76 (E16Nextion-16) GW: 192.168.0.1 SN: 255.255.255.0 duration: 48 ms
17437 : EVENT: WiFi#Connected
17475 : Webserver: start
17538 : MQTT : Intentional reconnect
17571 : MQTT : Connected to broker with client ID: ESPClient_5C:CF:7F:C3:AC:5F
17572 : Subscribed to: domoticz/out
17574 : EVENT: MQTT#Connected
18360 : Current Time Zone: DST time start: 2018-03-25 02:00:00 offset: 120 min STD time start: 2018-10-28 03:00:00 offset: 60 min
18361 : EVENT: Time#Initialized
18417 : EVENT: Clock#Time=Sun,13:38
74902 : WD : Uptime 1 ConnectFailures 0 FreeMem 17944
134902 : WD : Uptime 2 ConnectFailures 0 FreeMem 17944
Below is the same with MQTT enabled:
⸮U
Exception (28): epc1=0x4020b58e epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000003 depc=0x00000000
ctx: cont sp: 3ffff830 end: 3fffffd0 offset: 01a0
>>>stack>>>
3ffff9d0: feefeffe feefeffe feefeffe feefeffe
3ffff9e0: feefeffe feefeffe feefeffe feefeffe
3ffff9f0: feefeffe feefeffe feefeffe feefeffe
3ffffa00: feefeffe 3ffffaf3 feefeffe feefeffe
3ffffa10: feefeffe feefeffe feefeffe 4026e7b8
3ffffa20: 3ffffb50 3ffe8308 3ffffab0 4026a2d5
After a while it hangs. After reset I get
3fffff90: 00000000 00000000 00000000 402677f0
3fffffa0: 3fff500c 0000005f 00000015 feefeffe
3fffffb0: 3fffdad0 00000000 3fff323a 402665e4
3fffffc0: feefeffe feefeffe 3ffe87e4 40100739
<<<stack<<<
ets Jan 8 2013,rst cause:2, boot mode:(3,6)
load 0x4010f000, len 1384, room 16 tail 8 chksum 0x2d csum 0x2d vbb28d4a3 ~ld
⸮U
Exception (28): epc1=0x4020b58e epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000003 depc=0x00000000
ctx: cont sp: 3ffff830 end: 3fffffd0 offset: 01a0
>>>stack>>>
3ffff9d0: feefeffe feefeffe feefeffe feefeffe
3ffff9e0: feefeffe feefeffe feefeffe feefeffe
3ffff9f0: feefeffe feefeffe feefeffe feefeffe
3ffffa00: feefeffe 3ffffaf3 feefeffe feefeffe
3ffffa10: feefeffe feefeffe feefeffe 4026e7b8
Please paste the output from the stack decoder here: https://github.com/me-no-dev/EspExceptionDecoder
When I click the Exception Decoder, I get this: ERROR: xtensa-lx106-elf-gdb.exe not found!
Same problem! That has already cost me half the day. Setting up ESPEasy on a Wemos D1 Mini with an LCD2004 works fine. When I add the MQTT import plugin, it works until I reconnect the Wemos to power. The display then flashes and "ESP Easy" appears on the display again and again. It has to be something with MQTT or the MQTT import plugin, because nothing goes wrong until that device is added. The only thing that solves the problem is re-flashing the Wemos D1 Mini. The serial monitor output is about the same as what the people before me posted. Thank you
Same problem! Build dev_090918 with only MQTT import added (and OpenHAB MQTT): everything works before reboot. After reboot it does not respond. First discovered in build 080918. Flashed blank.bin, then 090918_dev.bin.
Maybe I am missing it here, but has anyone also tried it with a nightly build from us? Or is it all on self-builds?
So, adding MQTT import is all that is needed (plus reboot) to make it crash repeatedly?
@TD-er Sorry, I forgot to say that. I have the newest software from GitHub without changes, and the error occurs when I add the MQTT import plugin and reboot.
Tested: ESP_Easy_mega-20180909_dev_ESP8266_4096.bin and: ESP_Easy_mega-20180909_normal_ESP8266_4096.bin
OK, I will have a look at it now.
@berbergh
I do not understand why you think it has to do with Nextion.
You have Nextion in the logs and I already linked to another issue, which is all about issues with Nextion and Exception 28 reboots. Also the Nextion line is the last one in your logs right before the crash. So I guess it isn't so strange to think of the Nextion, is it?
Also, when you write "disable MQTT" or "enable MQTT", I don't know what you mean. We have several controllers using MQTT, and those (all controllers) are the ones I changed in the last big merge. There is also an "MQTT import" plugin, in which I hardly changed anything in the source code. The controller changes have been tested very thoroughly over the last month.
So please be a little bit more elaborate in the issue report. For example, I have now on my test board a number of controllers active, with one of them Domoticz MQTT. And also the MQTT import is loaded (but not yet active) and after reboot it is still running. I will now continue to extend the tests, to see when it will break. But can you please make a bit more clear what it is you have configured and how it is configured?
In my case (Sonoff with 4MB SPI flash, replaced):
1) I flash 4MB blank, repower
2) I flash ESP_Easy_mega-20180910_dev_ESP8266_4096.bin, repower
3) Connect to ESP_Easy_0, enter my SSID and password (pass > 32 characters), repower, reconnect in my wifi infrastructure
4) Add generic-MQTT import, add name "MQTTin", checkbox "Enabled", Submit, reboot (from interface).
5) Device is not responding.
The same steps on 20180904_dev work properly. From 20180908 on, they do not.
p.s. on esp8285 (with hardware 1MB) - same as 4MB.
Sorry for the confusion. I thought I gave you all the details by following the guidelines while opening an issue. I will elaborate more on this later today. But the ESPEasy I am referring to is a WeMos D1 mini with a Nextion and an MQTT import device active. It had Release mega-20180904 on it and was running stable for several days. When I updated to Release mega-20180908, the problems showed up. So I started experimenting and opened an issue. When I write 'enable MQTT' I mean that I check the enable box in the MQTT import device. When I write 'disable MQTT' I mean that I uncheck the enable box in the MQTT import device.
Tonight I will test it with the precompiled firmware. If you have other test ideas, I am willing to help.
I also got it to reproduce. Since it is crashing before anything is sent to output, it looks like saving of that specific plugin might overwrite something else. This evening I will try to test if saving anything else after adding (and enabling) the MQTT import will change behavior.
I guess we could do a dump of the settings file before and after the save and then compare the two.
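A minimal sketch of that before/after comparison, assuming both dumps have already been read into memory as raw bytes. It reports every contiguous range where the two dumps differ, so an unexpected overwrite outside the plugin's own settings area would show up as an extra range. `DiffRange` and `diffDumps` are hypothetical names for this illustration, not ESPEasy functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// One contiguous run of bytes that differs between the two dumps.
struct DiffRange {
  std::size_t offset;
  std::size_t length;
};

// Compares two settings dumps byte by byte and returns the differing ranges.
// If the dumps have different sizes, the extra tail is reported as one range.
std::vector<DiffRange> diffDumps(const std::vector<std::uint8_t>& before,
                                 const std::vector<std::uint8_t>& after) {
  std::vector<DiffRange> ranges;
  const std::size_t common = std::min(before.size(), after.size());
  std::size_t i = 0;
  while (i < common) {
    if (before[i] == after[i]) {
      ++i;
      continue;
    }
    const std::size_t start = i;
    while (i < common && before[i] != after[i]) {
      ++i;
    }
    ranges.push_back({start, i - start});
  }
  if (before.size() != after.size()) {
    const std::size_t longer = std::max(before.size(), after.size());
    ranges.push_back({common, longer - common});
  }
  return ranges;
}
```

Comparing the reported offsets against the known layout of the settings file would then show whether saving the plugin touched anything it should not have.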
This one becomes more relevant again: https://github.com/letscontrolit/ESPEasy/issues/1616
Yes, would be nice.
Hey, sorry for asking but is there any news? Thanks! :)
The "news" is that I did add some tests to check if the pointer is NULL before dereferencing it. But still no success. At this moment, I just recovered from the N^th complete flash-erase to start over.
So it reproduces very well, but no solution yet. At this moment I haven't got a clue what is going on, since it crashes right before any log is output.
I still have not found a solution, but I created a recovery option. See https://github.com/letscontrolit/ESPEasy/pull/1740
This will at least help to get into the node again by disabling one by one all plugins/controllers/notifications.
And it helped me to regain control again of the node, without erasing.
I can confirm that in version 20180914 too, if the MQTT Import plugin is installed, the unit reboots 10 times and then boots correctly, but with the MQTT Import plugin disabled.
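The behaviour described here (a number of failed boots, then one item disabled) can be sketched roughly as below. This is not the code from PR #1740; it is a simplified illustration in which `BootState`, `Task`, `onBootStart`, `onBootSuccess` and the threshold of 10 are all assumptions. On a real node the counter would have to live somewhere that survives a crash reset, such as RTC memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical persistent state; must survive a crash reset on real hardware.
struct BootState {
  unsigned crashCount = 0;  // consecutive boots that never completed
};

// Hypothetical stand-in for a plugin, controller or notification entry.
struct Task {
  bool enabled;
};

const unsigned kCrashThreshold = 10;  // assumed, matches the "10 times" report

// Call early in boot, before any plugin is initialised.
// Returns the index of the task that was disabled, or -1 if none was.
int onBootStart(BootState& state, std::vector<Task>& tasks) {
  ++state.crashCount;  // assume this boot fails until proven otherwise
  if (state.crashCount <= kCrashThreshold) {
    return -1;
  }
  // Too many failed boots: disable the next enabled task, one per boot.
  for (std::size_t i = 0; i < tasks.size(); ++i) {
    if (tasks[i].enabled) {
      tasks[i].enabled = false;
      return static_cast<int>(i);
    }
  }
  return -1;
}

// Call once the node has booted and is running normally.
void onBootSuccess(BootState& state) {
  state.crashCount = 0;
}
```

Disabling only one item per boot keeps as much of the configuration active as possible, at the cost of a few extra reboots when the culprit is far down the list.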
I ran into this problem with my own custom plugin, which I'm currently developing and which is based on the mqtt import plugin. I tracked the issue down to the MQTT subscription function, thanks to the EspStackTraceDecoder.
It appears that trying to access the first enabled mqtt controller in PLUGIN_INIT is the culprit (or looping through all tasks in MQTTSubscribe_037, I'm not really sure, but at least something doesn't seem to be properly initialized in PLUGIN_INIT). I was able to solve the problem by moving the whole MQTT connection setup from PLUGIN_INIT to PLUGIN_ONCE_A_SECOND. If the MQTTclient object has not been created (i.e. is NULL) in PLUGIN_ONCE_A_SECOND, then I create the PubSubClient, connect and subscribe to MQTT. This also makes sense, as in my tests, the wifi connection was never set up in PLUGIN_INIT, so the mqtt connection couldn't be established at that time anyway.
I haven't run into a crash since then.
Here are the changes that I did to my own plugin: https://github.com/kainhofer/ESPEasy/commit/77e1d0cecf90f7ffc2974b175af790d9445e0099#diff-ab0811475cb909ce7e34570752cd908c And this is the full code of the plugin (under development and unfinished, but the crash is gone): https://github.com/kainhofer/ESPEasy/blob/RFLink-MQTT-Gateway/src/_P197_RFLink_MQTT_Bridge.ino
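The pattern described above, reduced to a self-contained sketch: nothing network-related happens in the init event, and the client is created lazily from the once-a-second event as soon as WiFi is up. `FakeMqttClient` is a stand-in written for this example, not the PubSubClient API, and `PluginEvent`, `ImportPlugin`, `handle` and the topic name are likewise assumptions rather than ESPEasy code.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a real MQTT client; the interface is invented for this sketch.
class FakeMqttClient {
 public:
  bool connect(const std::string& clientId) {
    clientId_ = clientId;
    connected_ = true;
    return true;
  }
  void subscribe(const std::string& topic) { topics.push_back(topic); }
  bool connected() const { return connected_; }
  std::vector<std::string> topics;

 private:
  std::string clientId_;
  bool connected_ = false;
};

// Mirrors the two plugin events discussed above.
enum class PluginEvent { Init, OnceASecond };

struct ImportPlugin {
  std::unique_ptr<FakeMqttClient> client;  // stays null until WiFi is up

  bool handle(PluginEvent event, bool wifiConnected) {
    switch (event) {
      case PluginEvent::Init:
        // Deliberately no controller lookup or network access here.
        return true;
      case PluginEvent::OnceASecond:
        if (!client && wifiConnected) {
          std::unique_ptr<FakeMqttClient> fresh(new FakeMqttClient());
          if (fresh->connect("ESP_Easy-Import")) {
            fresh->subscribe("example/topic");  // hypothetical topic
            client = std::move(fresh);
          }
        }
        return true;
    }
    return false;
  }
};
```

Because the null check runs every second, the same code path also covers the case where WiFi only comes up some time after boot.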
@kainhofer thanks! I will look into this when I get home again.
Can someone tell me which version is the last one in which the MQTT import plugin still works? Then I would use that version in the meantime, until a solution is found. Thanks :)
which version is the last version in which the MQTT import plugin still works?
20180904
Should be fixed by PR #1777
I can confirm that there are no more reboots when the MQTT import plugin is enabled and the node is rebooted. I compiled it using PlatformIO with PIO Build (dev_ESP8266_4096).
After CMD reboot :
WIFI : DHCP IP: 192.168.1.7 (ESP-Easy-0) GW: 192.168.1.1 SN: 255.255.255.0 duration: 6040 ms
10644 : Webserver: start
10741 : MQTT : Intentional reconnect
10908 : MQTT : Connected to broker with client ID: ESPClient_5C:CF:7F:4C:56:DD
10910 : Subscribed to: NextionDS/#
11195 : Current Time Zone: DST time start: 2018-03-25 02:00:00 offset: 120 min STD time start: 2018-10-28 03:00:00 offset: 60 min
11322 : IMPT : Connected to MQTT broker with Client ID=ESP_Easy-Import
11334 : IMPT : [mqtt1#Value1] subscribed to NextionDS/ph/PHVal
11338 : IMPT : [mqtt1#Value2] subscribed to NextionDS/RAM/RAM
11559 : IMPT : [mqtt1#Value1] : 11.30
11758 : IMPT : [mqtt1#Value2] : 15632.00
Great! Thanks for testing
Currently running for more than 11h with 2x MQTT import plugin and 2x Nextion plugin enabled.
40793072: NEXTION075 : Cmd Statement Line-1 Sent: main.t2.txt='15064'
40802055: WD : Uptime 680 ConnectFailures 4 FreeMem 15984
40803312: NEXTION075 : Cmd Statement Line-1 Sent: main.t1.txt='17:40'
40803342: NEXTION075 : Cmd Statement Line-2 Sent: ec.t4.txt='1102'
40803376: NEXTION075 : Cmd Statement Line-3 Sent: main.t0.txt='25.1'
40803406: NEXTION075 : Cmd Statement Line-5 Sent: main.t3.txt='-50'
40803442: NEXTION075 : Cmd Statement Line-7 Sent: temp.t2.txt='25.0'
40803477: NEXTION075 : Cmd Statement Line-8 Sent: temp.t3.txt='36.3'
40803512: NEXTION075 : Cmd Statement Line-9 Sent: temp.t0.txt='25.1'
40803548: NEXTION075 : Cmd Statement Line-10 Sent: temp.t1.txt='23.4'
40813073: NEXTION075 : Cmd Statement Line-1 Sent: main.t2.txt='15064'
40818311: NEXTION075 : Cmd Statement Line-1 Sent: main.t1.txt='17:40'
40818342: NEXTION075 : Cmd Statement Line-2 Sent: ec.t4.txt='1102'
40818380: NEXTION075 : Cmd Statement Line-3 Sent: main.t0.txt='25.1'
40818409: NEXTION075 : Cmd Statement Line-5 Sent: main.t3.txt='-50'
40818443: NEXTION075 : Cmd Statement Line-7 Sent: temp.t2.txt='25.0'
40818479: NEXTION075 : Cmd Statement Line-8 Sent: temp.t3.txt='36.3'
40818515: NEXTION075 : Cmd Statement Line-9 Sent: temp.t0.txt='25.1'
40818551: NEXTION075 : Cmd Statement Line-10 Sent: temp.t1.txt='23.4'
40832054: WD : Uptime 681 ConnectFailures 4 FreeMem 16296
40833073: NEXTION075 : Cmd Statement Line-1 Sent: main.t2.txt='15064'
40833310: NEXTION075 : Cmd Statement Line-1 Sent: main.t1.txt='17:40'
40833341: NEXTION075 : Cmd Statement Line-2 Sent: ec.t4.txt='1102'
40833376: NEXTION075 : Cmd Statement Line-3 Sent: main.t0.txt='25.1'
40833406: NEXTION075 : Cmd Statement Line-5 Sent: main.t3.txt='-50'
40833442: NEXTION075 : Cmd Statement Line-7 Sent: temp.t2.txt='25.0'
40833476: NEXTION075 : Cmd Statement Line-8 Sent: temp.t3.txt='36.3'
40833511: NEXTION075 : Cmd Statement Line-9 Sent: temp.t0.txt='25.1'
40833546: NEXTION075 : Cmd Statement Line-10 Sent: temp.t1.txt='23.4'
40841973: IMPT : [mqtt2#Voda1] : 25.06
40842272: IMPT : [mqtt2#Voda2] : 23.50
40842473: IMPT : [mqtt2#LED] : 36.31
40842673: IMPT : [mqtt2#Sump] : 25.00
40842868: IMPT : [mqtt1#PH] : 11.00
40843067: IMPT : [mqtt1#EC] : 1144.00
40843270: IMPT : [mqtt1#RSSI] : -50.00
40843467: IMPT : [mqtt1#RAM] : 15064.00
40848311: NEXTION075 : Cmd Statement Line-1 Sent: main.t1.txt='17:41'
40848340: NEXTION075 : Cmd Statement Line-2 Sent: ec.t4.txt='1144'
40848375: NEXTION075 : Cmd Statement Line-3 Sent: main.t0.txt='25.1'
40848405: NEXTION075 : Cmd Statement Line-5 Sent: main.t3.txt='-50'
40848440: NEXTION075 : Cmd Statement Line-7 Sent: temp.t2.txt='25.0'
40848476: NEXTION075 : Cmd Statement Line-8 Sent: temp.t3.txt='36.3'
40848511: NEXTION075 : Cmd Statement Line-9 Sent: temp.t0.txt='25.1'
40848546: NEXTION075 : Cmd Statement Line-10 Sent: temp.t1.txt='23.5'
40853072: NEXTION075 : Cmd Statement Line-1 Sent: main.t2.txt='15064'
40862059: WD : Uptime 681 ConnectFailures 4 FreeMem 16296
40863311: NEXTION075 : Cmd Statement Line-1 Sent: main.t1.txt='17:41'
40863341: NEXTION075 : Cmd Statement Line-2 Sent: ec.t4.txt='1144'
40863375: NEXTION075 : Cmd Statement Line-3 Sent: main.t0.txt='25.1'
40863406: NEXTION075 : Cmd Statement Line-5 Sent: main.t3.txt='-50'
40863442: NEXTION075 : Cmd Statement Line-7 Sent: temp.t2.txt='25.0'
40863476: NEXTION075 : Cmd Statement Line-8 Sent: temp.t3.txt='36.3'
40863513: NEXTION075 : Cmd Statement Line-9 Sent: temp.t0.txt='25.1'
40863548: NEXTION075 : Cmd Statement Line-10 Sent: temp.t1.txt='23.5'
40873073: NEXTION075 : Cmd Statement Line-1 Sent: main.t2.txt='15064'
So I would say that this is now definitely fixed.....
good job guys!!!!!
If you self compile, please state this and PLEASE try to ONLY REPORT ISSUES WITH OFFICIAL BUILDS!
Summarize of the problem/feature request
I use Arduino IDE 1.8.5, with #define PLUGIN_BUILD_TESTING.
Expected behavior
I expect the WeMos D1 R2 and mini to work fine.
Actual behavior
The ESPEasy reboots continuously with this message:
INIT : Booting version: (custom) (ESP82xx Core 2_4_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3)
88 : INIT : Warm boot #373 - Restart Reason: External System
92 : FS : Mounting...
117 : FS : Mount successful, used 77810 bytes of 957314
129 : CRC : No program memory checksum found. Check output of crc2.py
134 : CRC : SecuritySettings CRC ...OK
240 : INIT : Free RAM:26784
241 : INIT : I2C
241 : INIT : SPI not enabled
250 : NEXTION075 : serial pin config RX:13, TX:15, Plugin Enabled
250 : NEXTION075 : Using software serial
Exception (28): epc1=0x4020b58e epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000003 depc=0x00000000
ctx: cont sp: 3ffff830 end: 3fffffd0 offset: 01a0
ets Jan 8 2013,rst cause:2, boot mode:(3,6)
load 0x4010f000, len 1384, room 16 tail 8 chksum 0x2d csum 0x2d vbb28d4a3 ~ld
⸮U87 : INIT : Booting version: (custom) (ESP82xx Core 2_4_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3)
88 : INIT : Warm boot #374 - Restart Reason: Exception
91 : FS : Mounting...
116 : FS : Mount successful, used 77810 bytes of 957314
128 : CRC : No program memory checksum found. Check output of crc2.py
134 : CRC : SecuritySettings CRC ...OK
240 : INIT : Free RAM:26784
241 : INIT : I2C
241 : INIT : SPI not enabled
249 : NEXTION075 : serial pin config RX:13, TX:15, Plugin Enabled
250 : NEXTION075 : Using software serial
Exception (28): epc1=0x4020b58e epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000003 depc=0x00000000
ctx: cont sp: 3ffff830 end: 3fffffd0 offset: 01a0
ets Jan 8 2013,rst cause:2, boot mode:(3,6)
load 0x4010f000, len 1384, room 16 tail 8 chksum 0x2d csum 0x2d vbb28d4a3 ~ld
⸮U88 : INIT : Booting version: (custom) (ESP82xx Core 2_4_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.0.3)
88 : INIT : Warm boot #375 - Restart Reason: Exception
92 : FS : Mounting...
117 : FS : Mount successful, used 77810 bytes of 957314
129 : CRC : No program memory checksum found. Check output of crc2.py
134 : CRC : SecuritySettings CRC ...OK
241 : INIT : Free RAM:26784
241 : INIT : I2C
242 : INIT : SPI not enabled
250 : NEXTION075 : serial pin config RX:13, TX:15, Plugin Enabled
250 : NEXTION075 : Using software serial
Exception (28): epc1=0x4020b58e epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000003 depc=0x00000000
ctx: cont sp: 3ffff830 end: 3fffffd0 offset: 01a0
Steps to reproduce
I also tried the following.
And the error shows up.
Even if you do this:
So far so good.
And the error shows up again.
Yes
System configuration
Hardware: WeMos D1 mini and NodeMCU
ESP Easy version: ESPEasy_mega-20180908
ESP Easy settings/screenshots:
Rules or log data