homieiot / homie-esp8266

💡 ESP8266 framework for Homie, a lightweight MQTT convention for the IoT
http://homieiot.github.io/homie-esp8266
MIT License
1.36k stars 308 forks source link

OTA update make device reboot with old firmware #261

Closed elbowz closed 5 years ago

elbowz commented 7 years ago

I'm using latest commit (26dafb8251cd575b0cb87e641d8ed33945c40757) of v2.0.0 of homie and latest of homie-ota (e04712109fbd1f5a877e22fe1ab644dbda5ac03e).

When I try to make the OTA update by homie-ota ui, all seems works but after the firmware is uploaded I get an exception and the esp8266 reboot in normal mode with the old firmware.

This is the homie log captured by serial:

✴ OTA available (checksum eeffc1681ce672bf7644e093d987a0f0)
↕ OTA started
Triggering OTA_STARTED event...
Firmware is base64-encoded
Receiving OTA firmware (1054/354465)...
Receiving OTA firmware (2149/354465)...
[..]
Receiving OTA firmware (353169/354465)...
Receiving OTA firmware (354264/354465)...
Receiving OTA firmware (354464/354464)...

Exception (3):
epc1=0x4000bf64 epc2=0x00000000 epc3=0x00000000 excvaddr=0x4024c153 depc=0x00000000

ctx: sys 
sp: 3ffffad0 end: 3fffffb0 offset: 01a0

>>>stack>>>
3ffffc70:  3fff0d20 3fff1cdc 3ffffcac 4020188e  
3ffffc80:  4024c153 3fff0d20 3ffffcb0 40100688  
3ffffc90:  4024c1a2 000002f6 00000007 4020765d  
3ffffca0:  0007362c 3fff0d20 000001e4 3fff3764  
3ffffcb0:  0000000f 00000000 00000000 40201754  
3ffffcc0:  0007362c 3fff1cdc 3fff0d20 40207bb8  
3ffffcd0:  00000000 00000000 00000000 00000000  
3ffffce0:  3fff38cc 0000000f 0000000d 00000000  
3ffffcf0:  40242e16 3ffeac38 40242dc7 00000000  
3ffffd00:  ffffffff 00000000 3ffebca1 00000000  
3ffffd10:  40242e16 3ffeac38 3fff209c 00000000  
3ffffd20:  40000000 00073520 0000010c 3fff285a  
3ffffd30:  000000c8 0007362c 00000161 00000000  
3ffffd40:  3ffec47e 3fff344c 3ffffda0 00000000  
3ffffd50:  3ffec47e 3fff3444 00000000 4021b6b4  
3ffffd60:  0007362c 00000000 00000000 40206fab  
3ffffd70:  40000000 3fff3434 00000000 4021c0d2  
3ffffd80:  3fff245c 00000003 00002200 40214a84  
3ffffd90:  0007362c 00080000 7fffffff 00000002  
3ffffda0:  3fff2abc 00000003 40206f88 4021b68c  
3ffffdb0:  40000000 00000002 53002200 00000005  
3ffffdc0:  3fff3444 3fff2854 00000000 3fff39bc  
3ffffdd0:  3fff39bc 3fff34c0 3fff245c 00000159  
3ffffde0:  3ffffe50 0000010c 3fff3864 4021bf6d  
3ffffdf0:  0000010c 00073520 0007362c 00000000  
3ffffe00:  4022ee2a 3fff209c 3ffed132 40215a5e  
3ffffe10:  0000010c 00073520 0007362c 00000000  
3ffffe20:  3ffec47e 000005b4 3fff19cc 40214879  
3ffffe30:  3fff06e4 00000323 00000323 4010020c  
3ffffe40:  00000000 3ffe8a45 00000001 40100688  
3ffffe50:  00000000 00000000 00000000 40107594  
3ffffe60:  3fff1b25 00000006 3fff39bc 40226281  
3ffffe70:  3fff38cc 3fff0cdc 3fff5654 3fff1a20  
3ffffe80:  00000001 3fff394c 3fff19cc 402148ac  
3ffffe90:  00000001 3fff38cc 3fff19cc 4021ace4  
3ffffea0:  3fff34bc 00000000 40233f13 3fff20b4  
3ffffeb0:  3fff1ec0 3fff1ecc 00000006 3fff0cd0  
3ffffec0:  3fff34bc 3fff0cd8 3fff0cd4 4021ad37  
3ffffed0:  3fff34bc 3fff0cd8 3fff0cd4 4022cb0c  
3ffffee0:  000005b4 dd01a8c0 00000010 000005b4  
3ffffef0:  00000010 00000000 3ffed0e1 00000047  
3fffff00:  40220000 00000000 0000007d 3fff1ec4  
3fffff10:  3ffec456 3fff1ecc 3fff394c 40229ef9  
3fffff20:  3fff0bd8 3fff245c 3fff245c 402283a5  
3fffff30:  3fff0ba8 3fff0ba4 02e9c6bf 3fff245c  
3fffff40:  3ffec448 00000000 3fff394c 402292f1  
3fffff50:  6401a8c0 3ffeb984 02e9d741 00000000  
3fffff60:  4010075a 02e9d741 3ffe9a60 60000600  
3fffff70:  4021e174 3ffe9ecc 3ffe9a60 09533dc2  
3fffff80:  4021e19a 3fffdab0 00000000 3fff24bc  
3fffff90:  3fffdc80 00000000 3fff394c 4022207f  
3fffffa0:  40000f49 3fffdab0 3fffdab0 40000f49  
<<<stack<<<
␑ʯr�\)  #�ZTriggering NORMAL_MODE event...
🔌 Booting into normal mode 🔌
{} Stored configuration
  • Hardware device ID: 600194037e09
  • Device ID: kitchen-light
  • Boot mode: normal
  • Name: Temp Humidity Motion Luminance Sensor
[..]
  • OTA: 
    ◦ Enabled? yes
↕ Attempting to connect to Wi-Fi...
✔ Wi-Fi connected
Triggering WIFI_CONNECTED event...
↕ Attempting to connect to MQTT...
Sending initial information...
✔ MQTT ready
Triggering MQTT_CONNECTED event...
Calling setup function...
〽 Sending statistics...
  • Wi-Fi signal quality: 40%
  • Uptime: 4s

Thanks for all your help!

Before submitting your issue, make sure:

egon0 commented 7 years ago

Same here :|

epc1=0x4000bf64 epc2=0x00000000 epc3=0x00000000 excvaddr=0x4024a879 depc=0x00000000

Edit:

triggered the ota update a second time, this time it seems to have worked. BUT: it has rebooted to config mode again. seems the ota-update erased my config file on the spiffs-partition. :(

Edit 2:

First try seems to fail always. Second try worked and also the config is not deleted after upgrade.

Edit 3:

after some more tests it seems always to work on second try. first try the esp reboots with old firmware, after the second try it flashs the new firmware and also the config wasn't erased anymore. maybe i have flashed the config from inside the arduino ide, now it was uploaded via curl and it seems to keep the config.

marvinroger commented 7 years ago

Weird. This is a blocker before releasing v2.0.0. I'll take a look!

Sorry for the long response time, studies are my priority!

egon0 commented 7 years ago

i just flashed some homie-devices via ota in the last days. it never works on the first try, mostly i need 5 or 6 tries.

edit: funny side-effect: i have to reconfigure all devices that were configured on december via curl or via spiffs-upload. after configuring via webapp/android-app, the config stays over the ota-updates.

marvinroger commented 7 years ago

I cannot reproduce. Everything is working on my side, b64 or not...

marvinroger commented 7 years ago

What is your board? I only have NodeMCU v2s.

egon0 commented 7 years ago

I use 11 NodeMCU v2 boards at the moment. Sorry, i havn't had time to test with 5 or 6 Nodes. Maybe this weekend.

edit: i'm using https://github.com/jpmens/homie-ota for update-triggering.

pezinek commented 7 years ago

Hi,

My WeMos D1 mini is rebooting to old firmware after OTA as well. (checked out develop branch at commit d7739b5d9ca741c7f64fb1cfbe615eb2aac2eab1, my complete sketch is here: https://github.com/pezinek/ir_blaster triggering OTA via this shell script: https://github.com/pezinek/ir_blaster/blob/master/scripts/update_fw_ota.sh )

Edit: I'm not able to reproduce this issue anymore with latest homie-esp8266 (commit 1d5f228c3876f1e45ad0c96633c0ba1fb49c178a)

elbowz commented 7 years ago

I use ESP8266 12E/F ("raw" version) with 4Mb of flash memory...thanks!

pezinek commented 7 years ago

Actually I hit this issue again, and this time I was able to capture and decode the stack trace, and even though it doesn't make much sense to me, but maybe it would to someone else ;-)

Exception 3: LoadStoreError: Processor internal physical address or data error during load or store
Decoding 50 results
0x4024d8e2: chip_v6_unset_chanfreq at ?? line ?
0x4020182d: String::copy(char const*, unsigned int) at ?? line ?
0x402019a9: String::operator=(char const*) at ?? line ?
0x4024d8e2: chip_v6_unset_chanfreq at ?? line ?
0x4021b618: HomieInternals::Logger::write(unsigned char const*, unsigned int) at ?? line ?
0x402072f9: HomieInternals::BootNormal::_endOtaUpdate(bool, unsigned char) at ?? line ?
0x4024d526: chip_v6_unset_chanfreq at ?? line ?
0x402076e8: HomieInternals::BootNormal::_onMqttMessage(char*, char*, AsyncMqttClientMessageProperties, unsigned int, unsigned int, unsigned int) at ?? line ?
0x40105142: pp_post at ?? line ?
0x401049af: lmacRxDone at ?? line ?
0x40105e39: trc_NeedRTS at ?? line ?
0x401063f6: wDev_ProcessFiq at ?? line ?
0x40244593: pp_attach at ?? line ?
0x4021b504: std::_Function_handler    (HomieInternals::BootNormal*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>)> >::_M_invoke(std::_Any_data const&, char*, char*, AsyncMqttClientMessageProperties, unsigned int, unsigned int, unsigned int) at ?? line ?
0x40231d6c: ieee80211_output_pbuf at ?? line ?
0x40206c5b: std::_Function_base::_Base_manager    (HomieInternals::BootNormal*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) at ?? line ?
0x4021bf6a: std::function ::function(std::function  const&) at ?? line ?
0x4022a4ac: etharp_send_ip at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/netif/etharp.c line 434
0x402148c0: AsyncMqttClient::_onMessage(char*, char*, unsigned char, bool, bool, unsigned int, unsigned int, unsigned int, unsigned short) at ?? line ?
0x4010020c: _umm_free at umm_malloc.c line ?
0x40206c38: std::_Function_base::_Base_manager    (HomieInternals::BootNormal*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) at ?? line ?
0x4021b4dc: std::_Function_handler    (HomieInternals::BootNormal*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>)> >::_M_invoke(std::_Any_data const&, char*, char*, AsyncMqttClientMessageProperties, unsigned int, unsigned int, unsigned int) at ?? line ?
0x4022a791: etharp_output_to_arp_index at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/netif/etharp.c line 888
0x40206cba: std::_Function_base::_Base_manager    (HomieInternals::BootNormal*, std::_Placeholder1>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) at ?? line ?
0x4021be0d: std::_Function_handler    (AsyncMqttClient*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>, std::_Placeholder7>, std::_Placeholder8>, std::_Placeholder9>)> >::_M_invoke(std::_Any_data const&, char*, char*, unsigned char, bool, bool, unsigned int, unsigned int, unsigned int, unsigned short) at ?? line ?
0x40215866: AsyncMqttClientInternals::PublishPacket::parsePayload(char*, unsigned int, unsigned int*) at ?? line ?
0x402146b5: AsyncMqttClient::_onData(AsyncClient*, char*, unsigned int) at ?? line ?
0x402138cc: std::_Function_base::_Base_manager    (AsyncMqttClient*, std::_Placeholder1>, std::_Placeholder2>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) at ?? line ?
0x4021bd7c: std::_Function_handler    (AsyncMqttClient*, std::_Placeholder1>, std::_Placeholder2>)> >::_M_invoke(std::_Any_data const&, unsigned short, unsigned char) at ?? line ?
0x40213884: std::_Function_base::_Base_manager    (AsyncMqttClient*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>, std::_Placeholder7>, std::_Placeholder8>, std::_Placeholder9>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) at ?? line ?
0x4021bdd4: std::_Function_handler    (AsyncMqttClient*, std::_Placeholder1>, std::_Placeholder2>, std::_Placeholder3>, std::_Placeholder4>, std::_Placeholder5>, std::_Placeholder6>, std::_Placeholder7>, std::_Placeholder8>, std::_Placeholder9>)> >::_M_invoke(std::_Any_data const&, char*, char*, unsigned char, bool, bool, unsigned int, unsigned int, unsigned int, unsigned short) at ?? line ?
0x40103915: lmacProcessTXStartData at ?? line ?
0x4022b97c: ip_output_if at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/core/ipv4/ip.c line 628
0x40231039: ieee80211_parse_beacon at ?? line ?
0x402146e8: std::_Function_handler ::_M_invoke(std::_Any_data const&, void*, AsyncClient*, void*, unsigned int) at AsyncMqttClient.cpp line ?
0x4021ab3c: AsyncClient::_recv(tcp_pcb*, pbuf*, signed char) at ?? line ?
0x402356e7: ieee80211_parse_wmeparams at ?? line ?
0x4021ab8f: AsyncClient::_s_recv(void*, tcp_pcb*, pbuf*, signed char) at ?? line ?
0x4022e2e0: tcp_input at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/core/tcp_in.c line 394 (discriminator 1)
0x40107648: pvPortMalloc at ?? line ?
0x40100000: _stext at ?? line ?
0x4022b6d1: ip_input at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/core/ipv4/ip.c line 558
0x4022aac9: ethernet_input at /Users/igrokhotkov/espressif/arduino/tools/sdk/lwip/src/netif/etharp.c line 1377
0x40243f23: pp_tx_idle_timeout at ?? line ?
0x40243e96: pp_tx_idle_timeout at ?? line ?
0x40223857: user_uart_wait_tx_fifo_empty at ?? line ?
ThreeSixes commented 7 years ago

I'm having the same issue on a Itead Sonoff devices and Wion devices. With both base64 and binary uploads via MQTT.

euphi commented 7 years ago

I can confirm the same issue for me (ESP-12F).

Unfortunately I had no time for further investigations yet.

466 commented 7 years ago

Same here, ESP -12E As far s i can see, with all libs up-to-date, OTA is received, crashes right afterwards:

Receiving OTA firmware (355840/357600)...
Receiving OTA firmware (356935/357600)...
Receiving OTA firmware (357600/357600)...

Exception (3):
epc1=0x4000bf64 epc2=0x00000000 epc3=0x00000000 excvaddr=0x4024bc65 depc=0x00000000

ctx: sys 
sp: 3ffffad0 end: 3fffffb0 offset: 01a0

I am no coder, though i am happy to help with logs / debugging. Just let me know what to do :)

bertmelis commented 7 years ago

I've got this long time ago from the esp8266.com forum:

ESP8266 Arduino core uses two stacks: one for the SDK, and one for the sketch. ctx: sys means that exception happened while running on the SDK (system) stack, and ctx: cont is for the sketch stack.

So I understand the crash is coming from the SDK itself and not from the Arduino (hence also Homie) code.

As you say your libs are up-to-date, does this also count for the Arduino framework itself?

elescondite commented 7 years ago

[EDIT: I don't want my first comment to be a bug report. My first comment should be to thank you for a great library!]

Same here. ESP-01 this time. Have not had a single successful OTA update on any of 3 firmwares.

Exception (3): epc1=0x4000bf64 epc2=0x00000000 epc3=0x00000000 excvaddr=0x4024cb29 depc=0x00000000

ctx: sys sp: 3ffffad0 end: 3fffffb0 offset: 01a0

Library Dependency Graph
|-- <Homie> v2.0.0
|   |-- <ArduinoJson> v5.6.7
|   |-- <AsyncMqttClient> v0.8.1
|   |   |-- <ESPAsyncTCP> v1.0.1
|   |   |   |-- <ESP8266WiFi> v1.0
|   |-- <Bounce2> v2.1
|   |-- <ESP8266mDNS>
|   |   |-- <ESP8266WiFi> v1.0
|   |-- <Ticker> v1.0
|   |-- <ESP8266WiFi> v1.0
|   |-- <ESP8266HTTPClient> v1.1
|   |   |-- <ESP8266WiFi> v1.0
|   |-- <DNSServer> v1.1.0
|   |   |-- <ESP8266WiFi> v1.0
|   |-- <ESP8266WebServer> v1.0
|   |   |-- <ESP8266WiFi> v1.0

fuego:Dimmer jrs$ pio platform show espressif8266


Version: 1.3.1
Home: http://platformio.org/platforms/espressif8266
Repository: https://github.com/platformio/platform-espressif8266.git
Vendor: https://espressif.com/
License: Apache-2.0
Frameworks: arduino, esp8266-rtos-sdk, simba

Packages
--------

Package toolchain-xtensa
------------------------
Type: toolchain
Requirements: ~1.40802.0
Installed: Yes
Version: 1.40802.0
Original version: 4.8.2
Description: xtensa-gcc

Package tool-esptool
--------------------
Type: uploader
Requirements: ~1.409.0
Installed: Yes
Version: 1.409.0
Original version: 4.9
Description: esptool-ck

Package tool-mkspiffs
---------------------
Type: uploader
Requirements: ~1.102.0
Installed: Yes
Version: 1.102.0
Original version: 1.2
Description: Tool to build and unpack SPIFFS images

Package tool-espotapy
---------------------
Type: uploader
Requirements: ~1.0.0
Installed: Yes
Version: 1.0.0
Original version: None
Description: This script will push an OTA update to the ESP

Package framework-arduinoespressif8266
--------------------------------------
Type: framework
Requirements: ~1.20300.1
Installed: Yes
Version: 1.20300.1
Original version: 2.3.0
Description: Arduino Wiring-based Framework (ESP8266 Core)
stritti commented 5 years ago

Seems to be stale issue. Will be closed. Feel free to reopen if there are more information.