esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

ESP.restart() or auto-reboot after firmware update causes boot loop #7306

Closed CRCinAU closed 4 years ago

CRCinAU commented 4 years ago

Platform

Problem Description

When calling ESP.restart(), or when rebooting after uploading firmware, the D1 Mini goes into a reboot loop. The serial console output below shows a successful boot at power-on, then the first reboot after calling ESP.restart(), then the boot loop.

*WM: AutoConnect
*WM: Connecting as wifi client...
*WM: Status:
*WM: 6
*WM: Using last saved values, should be faster
*WM: Connection result: 
*WM: 3
*WM: IP Address:
*WM: 10.1.1.11
Autoupdate enabled at compile time...
*WM: freeing allocated params!
12991: Checking for update...
 - No Update Available.

13939: Finished auto-update check...

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3456, room 16 
tail 0
chksum 0x84
csum 0x84
vc5f60e31
~ld

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3456, room 16 
tail 0
chksum 0x84
csum 0x84
vc5f60e31
~ld

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3456, room 16 
tail 0
chksum 0x84
csum 0x84
vc5f60e31
~ld

This continues until the reset button is hit on the device and a normal boot occurs.
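For anyone capturing similar logs, a minimal sketch of pulling the numbers out of the ROM banner line shown above. The struct and function names are hypothetical, and the reset-cause meanings in `describeResetCause` are an assumption based on my reading of the ESP8266 boot-message documentation, not something stated in this thread:

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Fields of the ROM boot banner, e.g. "rst cause:2, boot mode:(3,7)".
struct BootBanner {
    int rstCause;
    int bootMode;    // first number in the parentheses
    int bootDetail;  // second number in the parentheses
};

// Fills `out` and returns true if `line` contains a boot banner.
bool parseBootBanner(const char* line, BootBanner& out) {
    assert(line != nullptr);
    const char* start = std::strstr(line, "rst cause:");
    if (start == nullptr) return false;
    BootBanner b{};
    const int matched = std::sscanf(start, "rst cause:%d, boot mode:(%d,%d)",
                                    &b.rstCause, &b.bootMode, &b.bootDetail);
    if (matched != 3) return false;
    out = b;
    return true;
}

// ASSUMPTION: meanings as commonly documented for the ESP8266 boot ROM.
const char* describeResetCause(int cause) {
    switch (cause) {
        case 1: return "power-on";
        case 2: return "external reset (reset pin)";
        case 4: return "hardware watchdog reset";
        default: return "other";
    }
}
```

Feeding it the banner line from the loop above yields cause 2 and boot mode (3,7) on every iteration, which at least confirms that each cycle reports the same reset cause.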

EDIT: To add context, I did a pio update which pulled down the latest frameworks, then rebuilt all my existing projects and flashed them OTA. After this, all of the D1 Minis that were updated showed this problem.

Current versions:

pio update
Updating tool-scons                      @ 3.30102.0      [Up-to-date]

Platform Manager
================
Platform Espressif 8266
--------
Updating espressif8266                   @ 2.5.1          [Up-to-date]
Updating toolchain-xtensa                @ 2.40802.200502 [Up-to-date]
Updating framework-arduinoespressif8266  @ 3.20701.0      [Up-to-date]
Updating tool-esptool                    @ 1.413.0        [Up-to-date]
Updating tool-esptoolpy                  @ 1.20800.0      [Up-to-date]

Library Manager
===============
Library Storage: /home/netwiz/Documents/ESP8266/lib
Updating ArduinoJson                     @ 6.15.2         [Up-to-date]
Updating DHTStable                       @ 0.2.4          [Up-to-date]
Updating ESP8266-ping                    @ 2.0.1          [Up-to-date]
Updating FastLED_DMA                     @ 0.0.0          [Detached]
Updating IRremoteESP8266                 @ 2.7.6          [Up-to-date]
Updating PubSubClient                    @ 2.7            [Up-to-date]
Updating SimpleTimer                     @ b30890b8f7     [Up-to-date]
Updating SparkFun BME280                 @ 2.0.8          [Up-to-date]
Updating WifiManager                     @ 0.15.0         [Up-to-date]

However, I use this in platformio.ini to pull in the latest framework from here:

framework = arduino
platform = espressif8266
platform_packages =
    framework-arduinoespressif8266 @ https://github.com/esp8266/Arduino.git

CRCinAU commented 4 years ago

Annoyingly, if I set -DDEBUG_ESP_CORE, the problem goes away...

EDIT: Spoke too soon - I managed to capture this problem with DEBUG_ESP_CORE set...

EDIT2: And I turned logging off by mistake and missed it, now I can't reproduce it :(

I am currently building with the following:

build_flags =
  -DDEBUG_ESP_PORT=Serial
;  -DDEBUG_ESP_SSL
;  -DDEBUG_ESP_TLS_MEM
;  -DDEBUG_ESP_HTTP_CLIENT
;  -DDEBUG_ESP_HTTP_SERVER
  -DDEBUG_ESP_CORE
;  -DDEBUG_ESP_WIFI
;  -DDEBUG_ESP_HTTP_UPDATE
;  -DDEBUG_ESP_UPDATER
;  -DDEBUG_ESP_OTA
;  -D PIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH
  -DPIO_FRAMEWORK_ARDUINO_LWIP2_IPV6_HIGHER_BANDWIDTH
;  -DNDEBUG

Normally, I would build with:

build_flags =
;  -DDEBUG_ESP_PORT=Serial
;  -DDEBUG_ESP_SSL
;  -DDEBUG_ESP_TLS_MEM
;  -DDEBUG_ESP_HTTP_CLIENT
;  -DDEBUG_ESP_HTTP_SERVER
;  -DDEBUG_ESP_CORE
;  -DDEBUG_ESP_WIFI
;  -DDEBUG_ESP_HTTP_UPDATE
;  -DDEBUG_ESP_UPDATER
;  -DDEBUG_ESP_OTA
;  -D PIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH
  -DPIO_FRAMEWORK_ARDUINO_LWIP2_IPV6_HIGHER_BANDWIDTH
  -DNDEBUG

CRCinAU commented 4 years ago

I'm still seeing this on two other D1 Minis that are installed in hard-to-reach places. I can see the onboard LED blinking as each one goes through the reset loop, and the only thing I can easily get to is the power, to turn it off and on again. After I do this, the normal code launches fine, until I reboot or send new firmware to it. At that point, it enters the reboot loop again...

Any suggestions on this?

d-a-v commented 4 years ago

What are the steps to reproduce? Is it simply doing an OTA and then calling ESP.restart() (with and without DEBUG_ESP_CORE)?

CRCinAU commented 4 years ago

I have reproduced this with and without debug enabled. It happens both on ESP.restart() and after uploading firmware via HTTP.

It also happens when the firmware is downloaded via HTTPS and the unit restarts after the firmware has been flashed.

devyte commented 4 years ago

@CRCinAU you forgot an MCVE to reproduce. Please remember that it must not include 3rd party libs.

CRCinAU commented 4 years ago

Well, here's the annoying part - if I upload my 'basic web update' program that I use as a bootloader, I can flash that and reboot it as many times as I like... That code is at: https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/BasicWebUpdate/src/BasicWebUpdate.ino

If I flash the "OutdoorMonitor" code from the same git, then it fails every reboot - even if that's just to upload the new binary via the /update URL on the BasicWebUpdate flash. Source: https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/OutsideMonitor/src/GPIO_MQTT.ino

The "GarageDoor" code from here also fails to reboot: https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/GarageDoor/src/GarageDoor.ino

These were both working fine before a pio update :(

CRCinAU commented 4 years ago

To try and rule out any issue, I wiped the entire flash using esptool erase_flash, then wrote a 4 MB blank file, then sent the code to the device again.

Connected properly to the HostAP for WifiManager, connected it to my wifi network fine, first reboot caused it to go into the boot loop again.

Have attached the confirmed failing binary in case that helps.

To confirm, the full source code for this binary is here: https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/OutsideMonitor/src/GPIO_MQTT.ino

It seems that the call to ESP.restart() works, but after that point the device is stuck in a never-ending boot loop until you power-cycle it or reflash it over USB...

Edit: removed attachment.

devyte commented 4 years ago

Please don't attach binary files. So: the basic web update example works fine, but the sketch with your code and 3rd party libs doesn't?

CRCinAU commented 4 years ago

Correct. If I flashed the basic web updater, it can reboot via the /reboot web address, and via the /update web address after a firmware load happens. I couldn't make this go into the boot loop.

As soon as I sent the other binary to the /update URL, it went into a boot loop.

devyte commented 4 years ago

Then something in your code or in the 3rd party libs is causing the problem. If you're in a reboot loop before setup is even called, I suggest looking at the constructors of globally instanced objects. For example, a global object's constructor should not access other global object instances, because the order of construction of global objects is not deterministic between translation units. It could also be something else in the constructors.

The only other thing that comes to mind is that there's something wrong with your binary and/or eboot, as built by your build system. I suggest rebuilding from the Arduino IDE.

Closing as this is not a core issue. If you do reduce the problem to an MCVE that uses only core code, please open a new issue, follow the template instructions, add your code and details, and reference this issue.
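To illustrate the point about global constructors, here is a minimal sketch of the usual workaround, the construct-on-first-use idiom. The names `Config`, `config()`, `Reporter` and `g_reporter` are hypothetical and not taken from the sketches linked in this thread:

```cpp
#include <cassert>
#include <string>

struct Config {
    std::string hostname;
    Config() : hostname("d1-mini") {}
};

// Risky pattern (not used here): `Config g_config;` defined in one .cpp file
// and read from the constructor of a global defined in another .cpp file.
// The language does not guarantee which of the two is constructed first.

// Construct-on-first-use: the instance is created the first time config()
// is called, so any caller sees a fully constructed object.
Config& config() {
    static Config instance;
    return instance;
}

struct Reporter {
    std::string target;
    Reporter() : target(config().hostname) {
        // Holds only because config() constructs the Config on demand.
        assert(!target.empty());
    }
};

// Safe even at global scope: the constructor goes through config()
// instead of touching another global object directly.
Reporter g_reporter;
```

The design choice is to replace a global object with a function returning a reference to a function-local static, which makes the construction order explicit at each point of use.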

CRCinAU commented 4 years ago

Hmmm - at the moment, the only thing I can see is the upgrades done via pio update which are:

Updating espressif8266                   @ 2.4.0          [2.5.1]
Uninstalling espressif8266 @ 2.4.0:     [OK]
PlatformManager: Installing espressif8266 @ 2.5.1
espressif8266 @ 2.5.1 has been successfully installed!
Updating toolchain-xtensa                @ 2.40802.191122 [2.40802.200502]
Uninstalling toolchain-xtensa @ 2.40802.191122:     [OK]
PackageManager: Installing toolchain-xtensa @ 2.40802.200502

The use of the custom framework path to this git should cause the espressif8266 part to be ignored - leaving only the toolchain-xtensa as a possible problem?

As I have a version number on this, I'll try to downgrade it somehow and see what happens...

CRCinAU commented 4 years ago

Ok - I've hit something that I can reproduce....

If I comment out the following lines in my platformio.ini, then the units reboot correctly (after the first boot with default speeds):

;board_build.f_cpu = 160000000L
;board_build.f_flash = 80000000L

I note that I've been using these lines in my platformio.ini for a long time without issue - and even the BasicWebUpdate shown earlier uses these - but it works correctly...

@devyte - Does this ring any bells as to why this would suddenly start causing issues?

devyte commented 4 years ago

Not really. What happens if you build from the Arduino IDE and choose those params?

CRCinAU commented 4 years ago

I don't have the Arduino IDE installed, so I'll have to look at doing that from scratch...

On a similar topic, if I call system_update_cpu_freq(160); in setup(), shouldn't this set the CPU speed to 160 MHz? Even if I call this, ESP.getCpuFreqMHz() still returns 80. I'm not sure if this is a problem, or if ESP.getCpuFreqMHz() only returns the boot speed?

Right now, I'm trying to either prove or disprove that building for 160 MHz is the culprit in causing the boot loop...

devyte commented 4 years ago

The cpu freq changes whether you build with 80 or 160. It depends on several things. I'm not sure, but I don't think that changing it should cause a crash. However, a wrong flash speed can, as can a wrong flash mode.
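One way to check which flash mode and speed a given binary was built with is to look at the first bytes of the image. The following is a rough sketch, assuming the ESP8266 image header layout as I understand it from the esptool documentation (byte 0 is the magic 0xE9, byte 2 is the flash mode, the low nibble of byte 3 is the flash frequency). The struct and function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FlashHeaderInfo {
    const char* mode;  // "QIO", "QOUT", "DIO" or "DOUT"
    int freqMHz;       // flash clock in MHz
};

// Decodes flash mode and frequency from the first four bytes of an image.
// ASSUMPTION: header layout and value tables follow the esptool docs.
bool decodeFlashHeader(const std::uint8_t* data, std::size_t len,
                       FlashHeaderInfo& out) {
    assert(data != nullptr);
    if (len < 4 || data[0] != 0xE9) return false;  // missing magic byte
    static const char* const kModes[] = {"QIO", "QOUT", "DIO", "DOUT"};
    if (data[2] > 3) return false;  // unknown flash mode
    int freq = 0;
    switch (data[3] & 0x0F) {
        case 0x0: freq = 40; break;
        case 0x1: freq = 26; break;
        case 0x2: freq = 20; break;
        case 0xF: freq = 80; break;
        default: return false;  // unknown frequency code
    }
    out.mode = kModes[data[2]];
    out.freqMHz = freq;
    return true;
}
```

Under these assumptions, a header starting with the bytes E9 01 02 4F would decode as DIO at 80 MHz, so comparing the header of a working build against a failing one could show whether the two differ in flash settings.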

CRCinAU commented 4 years ago

Here's my current data set:

Build at 80 MHz (ESP.getCpuFreqMHz() shows 80):
Flash -> up ok -> reboot -> up ok

Build at 160 MHz (ESP.getCpuFreqMHz() shows 160):
Flash -> up ok -> reboot -> up ok -> reboot -> up ok -> flash 160 MHz build -> no boot
Power cycle -> up ok -> reboot -> no boot
Power cycle -> up ok -> flash 80 MHz build -> no boot
Power cycle -> up ok -> reboot -> up ok -> reboot -> up ok
Flash 80 MHz build -> up ok -> reboot -> up ok

rrelande commented 1 year ago

Very interesting, as I have a similar issue and was not able to reproduce it reliably or find a way to investigate it.

rrelande commented 1 year ago

OTA with 80 MHz CPU and 40 MHz flash is OK. However, I cannot explain why.