xoseperez / espurna

Home automation firmware for ESP8266-based devices
http://tinkerman.cat
GNU General Public License v3.0

OTA Migrating to Tasmota bricks devices #1663

Open dominikandreas opened 5 years ago

dominikandreas commented 5 years ago

Bug description Using OTA to migrate to a Sonoff Tasmota minimal firmware bricks device

Steps to reproduce Upload the Sonoff Tasmota firmware (tested using the web interface and telnet, with and without a prior firmware reset).

Expected behavior Not bricking the device

Device information

[877379] [MAIN] ESPURNA 1.13.4 (2aa5c59)
[877380] [MAIN] xose.perez@gmail.com
[877381] [MAIN] http://tinkerman.cat

[877382] [MAIN] CPU chip ID: 0x49D786
[877385] [MAIN] CPU frequency: 80 MHz
[877387] [MAIN] SDK version: 1.5.3(aec24ac9)
[877392] [MAIN] Core version: 2.3.0
[877394] [MAIN] Core revision: 159542381
[877399] 
[877400] [MAIN] Flash chip ID: 0x1440A1
[877403] [MAIN] Flash speed: 40000000 Hz
[877408] [MAIN] Flash mode: DOUT
[877410] 
[877411] [MAIN] Flash size (CHIP)   :  1048576 bytes /  256 sectors (   0 to  255)
[877420] [MAIN] Flash size (SDK)    :  1048576 bytes /  256 sectors (   0 to  255)
[877424] [MAIN] Reserved            :     4096 bytes /    1 sectors (   0 to    0)
[877433] [MAIN] Firmware size       :   491520 bytes /  120 sectors (   1 to  120)
[877440] [MAIN] Max OTA size        :   532480 bytes /  130 sectors ( 121 to  250)
[877447] [MAIN] EEPROM size         :     4096 bytes /    1 sectors ( 251 to  251)
[877456] [MAIN] Reserved            :    16384 bytes /    4 sectors ( 252 to  255)
[877462] 
[877463] [MAIN] EEPROM sectors: 251, 250
[877465] [MAIN] EEPROM current: 250
[877468] 
[877470] [MAIN] EEPROM:  4096 bytes initially |  1523 bytes used (37%) |  2573 bytes free (62%)
[877479] [MAIN] Heap  : 35088 bytes initially | 19584 bytes used (55%) | 15504 bytes free (44%)
[877487] [MAIN] Stack :  4096 bytes initially |  1792 bytes used (43%) |  2304 bytes free (56%)
[877495] 
[877497] [MAIN] Boot version: 4
[877498] [MAIN] Boot mode: 1
[877499] [MAIN] Last reset reason: Power on
[877504] [MAIN] Last reset info: flag: 0
[877507] 
[877508] [MAIN] Board: BLITZWOLF_BWSHPX
[877512] [MAIN] Support: ALEXA API BROKER BUTTON DEBUG_SERIAL DEBUG_TELNET DEBUG_WEB DOMOTICZ HOMEASSISTANT LED MDNS_SERVER MQTT NTP SCHEDULER SENSOR TELNET TERMINAL THINGSPEAK WEB 
[877529] [MAIN] Sensors: HLW8012 
[877531] [MAIN] WebUI image: SENSOR
[877532] 
[877533] [MAIN] Firmware MD5: 7e5f63803fb604f9d863cb48a9bece18
[877539] [MAIN] Power: 3116 mV
[877544] 

Link to actual product (different brand): https://www.amazon.de/gp/product/B07HHFKWJJ Another device that I bricked this way was a Jinvoo Curtain switch

Tools used Windows Chrome Browser, Windows telnet via command line

Additional context A similar issue had been raised before and closed without further investigation: https://github.com/xoseperez/espurna/issues/993

If someone can test this on a device that can be easily reflashed, it would probably help trying to find the cause of this. In my case, the wifi smart plug can't be opened without breaking it. I've got 2 more of these that I would like to migrate, but would rather not brick them...

dominikandreas commented 5 years ago

I managed to break the device open, solder some connections and re-flash it. Now I'm reproducing the same behavior. I have so far investigated the following:

[000076] [MAIN] CPU chip ID: 0x10337E
[000080] [MAIN] CPU frequency: 80 MHz
[000083] [MAIN] SDK version: 1.5.3(aec24ac9)
[000087] [MAIN] Core version: 2.3.0
[000090] [MAIN] Core revision: 159542381
[000093]
[000095] [MAIN] Flash chip ID: 0x144068
[000098] [MAIN] Flash speed: 40000000 Hz
[000101] [MAIN] Flash mode: DOUT
[000104]
[000105] [MAIN] Flash size (CHIP) : 1048576 bytes / 256 sectors ( 0 to 255)
[000112] [MAIN] Flash size (SDK) : 1048576 bytes / 256 sectors ( 0 to 255)
[000120] [MAIN] Reserved : 4096 bytes / 1 sectors ( 0 to 0)
[000127] [MAIN] Firmware size : 307312 bytes / 76 sectors ( 1 to 76)
[000134] [MAIN] Max OTA size : 712704 bytes / 174 sectors ( 77 to 250)
[000141] [MAIN] EEPROM size : 4096 bytes / 1 sectors ( 251 to 251)
[000148] [MAIN] Reserved : 16384 bytes / 4 sectors ( 252 to 255)
[000155]
[000157] [MAIN] EEPROM sectors: 251, 250
[000160] [MAIN] EEPROM current: 251
[000163]
[000164] [MAIN] EEPROM: 4096 bytes initially | 49 bytes used ( 1%) | 4047 bytes free (98%)
[000173] [MAIN] Heap : 45368 bytes initially | 5280 bytes used (11%) | 40088 bytes free (88%)
[000181] [MAIN] Stack : 4096 bytes initially | 768 bytes used (18%) | 3328 bytes free (81%)
[000189]
[000190] [MAIN] Boot version: 1
[000193] [MAIN] Boot mode: 1
[000195] [MAIN] Last reset reason: Exception
[000199] [MAIN] Last reset info: Fatal exception:29 flag:2 (EXCEPTION) epc1:0x4000e1c3 epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000018 depc:0x00000000
[000212]
[000213] [MAIN] Board: ESPRESSIF_ESPURNA_CORE
[000217] [MAIN] Support: DEBUG_SERIAL DEBUG_TELNET LED TELNET TERMINAL
[000223] [MAIN] WebUI image: SMALL
[000226]
[000367] [MAIN] Firmware MD5: dbe072c9583a0f1a5346c2d7f0319e1b
[000367] [MAIN] Power: 3189 mV
[000368] [MAIN] Power saving delay value: 1 ms
[000368] [MAIN] WiFi Sleep Mode: MODEM
[000371]

---8<-------

[000378] [TELNET] Listening on port 23
[000379] [RELAY] Retrieving mask: 1
[000380] [RELAY] Number of relays: 0
[000383] [LED] Number of leds: 0
[000463] [WIFI] Creating access point

Exception (29):
epc1=0x4000e1c3 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000018 depc=0x00000000

ctx: cont
sp: 3ffeff50 end: 3fff02c0 offset: 01a0

>>>stack>>>
3fff00f0:  402380a5 00000017 60000200 401075ac
3fff0100:  4022cec9 40234a3a 00000001 40216d1e
3fff0110:  0068e1bc 402283ef 3fff261c 3fff2f34
3fff0120:  4022d10a 3fff261c 3fff2f34 3fffdad0
3fff0130:  3ffef28c 3ffedf40 3ffedebc 3fff2f34
3fff0140:  3fffdad0 3fffdad0 00000001 3fffdad0
3fff0150:  00000018 00000064 10e350ce fffeffff
3fff0160:  3ff20a00 0000ffff 3fff0188 40228a72
3fff0170:  3ffedebc 3fff2f34 3fff2f34 3fffdad0
3fff0180:  10e350ce 3fff7e33 00000000 00000000
3fff0190:  00000000 00000000 00000000 00000000
3fff01a0:  00000000 00000000 00000000 00000000
3fff01b0:  00000000 00000000 00000000 40229bc0
3fff01c0:  3fff2f34 00000001 3ffef28c 40229c08
3fff01d0:  00000000 3ffedf40 3ffeda38 00000000
3fff01e0:  4021b049 00000002 0068e101 000000fd
3fff01f0:  4021b199 00000002 00000001 3fffdad0
3fff0200:  3fff2aac 4021b20e 00000002 00000001
3fff0210:  40208f39 3ffeefa0 00000001 3fffdad0
3fff0220:  40208fcc 00020b54 3ffeefa0 40208fee
3fff0230:  3fffdad0 3ffeefa0 3ffef074 4020dba7
3fff0240:  40104ac0 00020b54 3fff0310 00000000
3fff0250:  3ffeda60 3fff0310 3ffef2a0 3fff0310
3fff0260:  3fffdad0 3ffeef70 3ffef074 4020dd30
3fff0270:  402017e6 00000001 00000004 4020dd64
3fff0280:  3fffdad0 3ffeef70 3ffef1d0 4020494e
3fff0290:  00000000 3ffeef70 00000004 40202c78
3fff02a0:  3fffdad0 00000000 3ffef284 40211204
3fff02b0:  feefeffe feefeffe 3ffef2a0 401006fc
<<<stack<<<

ets Jan 8 2013,rst cause:1, boot mode:(3,7)

load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0x2d
v09826c6d


Is there any reliable way to factory reset and flash a Sonoff Tasmota image?

mcspr commented 5 years ago

Have you tried skipping sonoff-minimal and just uploading the full sonoff.bin? They serve the same purpose and will flash the same way, because both Tasmota and ESPurna minimal images use the same Updater API provided by the Arduino Core. It will write the new .bin after the current firmware (into the free space) and reboot; the bootloader will then overwrite the original firmware space with the new image and load it.
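
To make the "free space" in that description concrete, the sector figures printed in the boot logs earlier in this thread can be reproduced with a little arithmetic. The sketch below is only an illustration: `sectors_for` and `ota_layout` are hypothetical helper names, and the fixed reservations (1 sector at the start, 1 EEPROM sector, 4 sectors at the end of a 1 MB chip) are assumptions read off those log lines, not taken from the ESPurna or Arduino Core sources.

```python
SECTOR_SIZE = 4096  # bytes per flash sector, as implied by the logs (1048576 bytes / 256 sectors)


def sectors_for(nbytes):
    """Number of whole sectors needed to hold nbytes (rounded up)."""
    return -(-nbytes // SECTOR_SIZE)


def ota_layout(flash_bytes, firmware_bytes, head_reserved=1, eeprom=1, tail_reserved=4):
    """Estimate the space left for a staged OTA image, in sectors and bytes.

    The reservation defaults are assumptions taken from the log output above.
    """
    total = flash_bytes // SECTOR_SIZE
    firmware = sectors_for(firmware_bytes)
    free = total - head_reserved - firmware - eeprom - tail_reserved
    return {
        "firmware_sectors": firmware,
        "ota_sectors": free,
        "ota_bytes": free * SECTOR_SIZE,
    }


if __name__ == "__main__":
    # Firmware sizes from the two boot logs in this thread.
    print(ota_layout(1048576, 491520))
    print(ota_layout(1048576, 307312))
```

With the values from the first log (491520 bytes of firmware) this gives 120 firmware sectors and 130 sectors (532480 bytes) for OTA. With the second log (307312 bytes) it gives 76 and 174 sectors (712704 bytes). Both match the "Max OTA size" lines reported by the device.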

While we are using a similar mechanism to store settings (EEPROM / same flash locations), no configuration is shared and Tasmota will overwrite our settings anyway on the first boot (ref: settings.ino; the sonoff-minimal build will skip this part).

Also something that we probably should check for - you can't OTA right after serial flash. You need to reboot the board first.

dominikandreas commented 5 years ago

Just gave the full sonoff.bin a shot, also ended in a bootloop. Specifically, I did this:

Listening at baudrate 74880, I get the following output:

 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0x2d
vbb28d4a3
~ld
Fatal exception 9(LoadStoreAlignmentCause):
epc1=0x4023e974, epc2=0x00000000, epc3=0x00000000, excvaddr=0xffffffff, depc=0x00000000

Exception (9):
epc1=0x4023e974 epc2=0x00000000 epc3=0x00000000 excvaddr=0xffffffff depc=0x00000000

ctx: sys
sp: 3fffe7f0 end: 3fffffb0 offset: 01a0

>>>stack>>>
3fffe990:  00000000 00000000 00000000 00000000
3fffe9a0:  00000000 00000000 00000000 00000000
3fffe9b0:  00000000 00000000 00000000 00000000
3fffe9c0:  00000000 00000000 00000000 00000000
3fffe9d0:  00000000 00000000 00000000 00000000
3fffe9e0:  00000000 00000000 00000000 00000000
3fffe9f0:  00000000 00000000 00000000 4023f931
3fffea00:  00000000 00000000 00000000 3ffef7ae
3fffea10:  3ffef78c 3ffea1e4 3ffea1c4 4023fa5d
3fffea20:  00000000 00000012 3ffeef9a 4023e96a
3fffea30:  00000000 00000000 00000000 4023e60e
3fffea40:  00000000 00000000 00000000 00000000
3fffea50:  00000000 00000000 00000000 00000000
3fffea60:  40260f7a 00000000 00000000 00000000
3fffea70:  3ffef78c 00000012 3ffeef9a 4023e290
3fffea80:  4025d3c5 4025d372 4010463c 4025d3c8
3fffea90:  00000000 400042db 401048fe 000000fd
3fffeaa0:  00000012 00000020 3fffff10 00000001
3fffeab0:  401048f8 4010476f 00000003 00000000
3fffeac0:  ffffffff ffffffff ffffff02 ffffffff
...

Erasing the flash and flashing the sonoff.bin directly works as expected.

dominikandreas commented 5 years ago

Is there anyone who can provide a hint on fixing this? This is a major issue for me, as it means that flashing espurna will lock you in. In my case, the device used can't easily be opened and re-flashed.

mcspr commented 5 years ago

I did just try to update SHP-2 v23 unit with 6.5.0 sonoff-basic.bin (2.4.2 Core) and current ESPurna 1.13.6-dev development version.

Please try to use binaries from https://github.com/mcspr/espurna-nightly-builder/releases if you are unable to build them yourself. The version 1.13.4 from the original post and 1.13.5 might suffer from instability on SHP, in general and especially when upgrading using web interface (ref: #1574, #1587 )

edit: Nightly builds seem to be broken ATM. Need to fix that. An earlier version (0523) should be fine to check.

mcspr commented 5 years ago

Is this still an issue? afaik, we still know that sonoff-basic firmware will not properly handle settings. Is this still a problem with proper sonoff.bin? Or do we need to ask Tasmota guys about settings handling / issues with them?

dominikandreas commented 4 years ago

Yes, it still doesn't work. Just tried and bricked another device. I used the latest release of espurna (1.14.1) and tried to upgrade to tasmota-lite v8.1.0 using telnet ota.

As I was still having issues with espurna and home assistant (discovery doesn't work well - device disappears after restarting home assistant) I just gave this another go and will just buy a new device now. Breaking them open to reflash them is too much of a pain with this specific device.

mcspr commented 4 years ago

When you do open the device, can you download flash contents using esptool.py read_flash and attach them here?

https://github.com/espressif/esptool#read-flash-contents-read_flash

$ esptool.py -p PORT read_flash 0 0x100000 flash_contents.bin

I am really not sure which part of the process breaks, if we accept the .bin file on our side without any issues and allow Core to proceed with OTA. From a quick glance at the -lite flavour, it does not seem to disable settings the way the previously discussed -minimal does.
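
Once such a dump exists, a few quick sanity checks can be run on it before attaching it. The following is a minimal sketch and not part of esptool: `summarize_dump` is a hypothetical helper, the expected length of 0x100000 bytes comes from the command above, and the check for a 0xE9 byte at offset 0 assumes the flash starts with an image in the usual ESP8266 image format.

```python
SECTOR_SIZE = 4096
ESP_IMAGE_MAGIC = 0xE9  # first byte of an ESP8266 firmware image header


def summarize_dump(data, expected_size=0x100000):
    """Return basic facts about a raw flash dump given as bytes."""
    erased = 0
    for offset in range(0, len(data), SECTOR_SIZE):
        chunk = data[offset:offset + SECTOR_SIZE]
        # An erased flash sector reads back as all 0xFF bytes.
        if chunk.count(0xFF) == len(chunk):
            erased += 1
    return {
        "size_ok": len(data) == expected_size,
        "magic_ok": len(data) > 0 and data[0] == ESP_IMAGE_MAGIC,
        "erased_sectors": erased,
    }
```

It could be called as `summarize_dump(open("flash_contents.bin", "rb").read())`. A dump whose size is wrong or that is almost entirely erased would suggest the read itself failed, which is worth ruling out before looking for firmware problems.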

Is this limited to telnet & web OTA? Have you tried espota.py instead?

mcspr commented 4 years ago

I will try some time later today with 1MB board, but everything works as you described in previous examples:

load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0x2d
v09826c6d
@cp:0
ld

00:00:00 CFG: Use defaults
00:00:00 Project tasmota Tasmota Version 8.1.0(lite)-2_6_1
00:00:00 WIF: WifiManager active for 3 minutes
00:00:00 HTP: Web server active on tasmota-0601 with IP address 192.168.4.1

dominikandreas commented 4 years ago

The following now worked for me: factory reset, then flash Tasmota using telnet OTA (without changing any settings in espurna prior to that). I think in my earlier attempts I had configured WiFi first; maybe that was the issue.

thanks for your help!

sfromis commented 4 years ago

FTR, never use tasmota-minimal.bin as the first flash of Tasmota. It depends on config data from previous Tasmota install, and will only work as an intermediary step going from one Tasmota binary to another. This two-step procedure is to enable binaries with a size over half the available flash space within 1M.

dominikandreas commented 4 years ago

Ah okay, that makes sense, thanks for the info!

inverse commented 2 years ago

Is the factory reset required for this migration to work, or is it safe to go from a working espurna to a Tasmota .gz install?

sfromis commented 2 years ago

You cannot use .gz binaries for first install of Tasmota. The expectation is that a successfully flashed Tasmota should detect that it needs to create a new configuration, basically a "factory reset".

inverse commented 2 years ago

> You cannot use .gz binaries for first install of Tasmota. The expectation is that a successfully flashed Tasmota should detect that it needs to create a new configuration, basically a "factory reset".

Thanks - tried doing that but unfortunately the factory reset approach didn't work - thankfully I can flash via FTDI to recover the device.

What I did

bruno-walter commented 2 years ago

I have some H801 devices I'm trying to move to ESPHome. I had a spare one with Espurna on it that flashed over flawlessly (by creating the image using ESPHome, downloading the BIN, and then flashing via the Espurna WebUI). When I tried the same method on a couple that are deployed around the house (and hard to reach), they appear to have been bricked in a similar manner as described here. I think the initial test one that worked may have been running an older version of Espurna. The problem ones are running 1.14.1, so I wonder if there is a bug in that version?

I will be able to recover them by removing them and flashing them directly but I'd rather not have to crawl into the attics and/or get behind soffit to do this for all of them. Has anyone found a reliable way to flash from Espurna to an alternate firmware? Maybe I should try downgrading?

dominikandreas commented 2 years ago

Have you tried doing a factory reset first before migrating to esphome?

sfromis commented 2 years ago

There is no direct support for OTA migration from one firmware project to another. It may work in "many" cases, but do test with a similar setup before assuming that you can config an already deployed device. And always be prepared for falling back on wired flashing.

mcspr commented 2 years ago

Migration could be helped, though. We work around Tasmota's 'magic number check' by injecting the numbers ourselves :) Plus, after the Tasmota -> ESPurna reboot sequence, while we still have some 'known' data in RTC RAM, we wipe the SDK config for WiFi, which turned out to be a common source of issues b/c the SDK is pretty dumb and just crashes instead of doing some kind of recovery / factory reset on itself.

https://github.com/xoseperez/espurna/blob/1169be25a55401d42e3b3a465dbb1dc9f8c2f1df/code/espurna/system.cpp#L708-L730
https://github.com/esphome/esphome/blob/2059283707fb0145dcf920d76e90afb6d80a20fb/esphome/components/esp8266/core.cpp#L38-L50

E.g. if using ESPurna had somehow corrupted the SDK sector, rebooting and trying to use softAP or STA will very likely crash.

Also note that 1.14.1 and any other firmware built using Core 2.3.0, which btw Tasmota also used in older versions, has a bug in OTA size estimation that would allow you to wipe existing fw without actually upgrading. Make sure .bin actually fits, {.bin size} % 4096 == 0 (as noted at https://github.com/xoseperez/espurna/releases and in our docs, since we hit the same issue updating from 1.14.1)
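
A pre-flight check along these lines could be scripted before uploading. This is a hedged sketch only: `check_ota_candidate` is a hypothetical helper, the free-space figure has to be supplied by the user (for example the "Max OTA size" value that ESPurna prints at boot), and the `% 4096` test simply mirrors the condition quoted above without claiming to reproduce the Core 2.3.0 logic.

```python
SECTOR_SIZE = 4096


def check_ota_candidate(bin_size, max_ota_bytes):
    """Compare a firmware image size against the reported OTA space."""
    remainder = bin_size % SECTOR_SIZE
    return {
        "fits": bin_size <= max_ota_bytes,
        "sector_aligned": remainder == 0,
        "bytes_past_last_full_sector": remainder,
    }


if __name__ == "__main__":
    # 532480 is the "Max OTA size" from the first boot log in this thread;
    # the image sizes below are made-up examples.
    for size in (532480, 500000, 600000):
        print(size, check_ota_candidate(size, 532480))
```

In practice `bin_size` would come from `os.path.getsize()` on the image about to be uploaded. A result with `fits` set to false is the case to stop on, since that is where the existing firmware could be wiped without a usable replacement.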

bruno-walter commented 2 years ago

Thanks for the input. I found the OTA upgrade worked for me when I created just the basic device in ESPHome (i.e. a minimal device without adding any lights/switches etc.) and chose to download their modern rather than legacy BIN format. Maybe it was the BIN format that was the issue, or perhaps it was size related (but the Espurna Web UI didn't warn about the size and it sounds like that approach should.) Once I had the basic ESPHome flashed I was able to do further OTAs using ESPHome and get my H801s fully configured.