raburton / esp8266

Various bits of code for ESP8266
http://richard.burtons.org/

OTA leads to corrupt ROM #25

Closed morganrallen closed 9 years ago

morganrallen commented 9 years ago

This is most likely due to my lack of understanding of the linker scripts. These are the steps I've performed; it all seems to make sense, but in practice the update fails while reporting success.

I'm using an ESP8266-03 from Electrodragon with 512k of flash. I'm using the rboot-sampleproject as my starting point.

Modify rom1.ld

$ head rom1.ld 
/* This linker script generated from xt-genldscripts.tpp for LSP . */
/* Linker Script for ld -N */
MEMORY
{
  dport0_0_seg :                        org = 0x3FF00000, len = 0x10
  dram0_0_seg :                         org = 0x3FFE8000, len = 0x14000
  iram1_0_seg :                         org = 0x40100000, len = 0x8000
  irom0_0_seg :                         org = 0x40242010, len = 0x3C000
}

Changed irom0_0_seg in rom1.ld from 0x40282010 to 0x40242010
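
The new origin can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming the irom0_0_seg origin is the flash-mapped base address (0x40200000) plus the rom's flash offset plus a 0x10-byte header; the function name `irom_origin` is hypothetical and not part of rBoot.

```python
# Sketch: derive the irom0_0_seg origin for a rom from its flash offset.
# Assumption: origin = flash mapping base + rom flash offset + 0x10 header bytes.
FLASH_MAP_BASE = 0x40200000  # where SPI flash is mapped into the address space
HEADER_SIZE = 0x10           # assumed header preceding the irom data in the image


def irom_origin(rom_flash_offset: int) -> int:
    """Return the linker origin for a rom stored at rom_flash_offset."""
    return FLASH_MAP_BASE + rom_flash_offset + HEADER_SIZE


# Original value in rom1.ld corresponds to a rom at 0x82000.
print(hex(irom_origin(0x82000)))  # 0x40282010
# The modified value corresponds to a rom at 0x42000.
print(hex(irom_origin(0x42000)))  # 0x40242010
```

Under that assumption, 0x40242010 agrees with the 0x42000 flash offset used in the flashing step.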

Build

$ make
Building esptool2 firmware tool
make[1]: Entering directory `/home/morgan/devel/ESP8266/raburton-esp8266/esptool2'
CC esptool2.c
gcc -O2 -Wall -c esptool2.c -o esptool2.o
CC esptool2_elf.c
gcc -O2 -Wall -c esptool2_elf.c -o esptool2_elf.o
LD esptool2
gcc -o esptool2 esptool2.o esptool2_elf.o
make[1]: Leaving directory `/home/morgan/devel/ESP8266/raburton-esp8266/esptool2'
Building rBoot boot loader
make[1]: Entering directory `/home/morgan/devel/ESP8266/raburton-esp8266/rboot'
mkdir -p build
mkdir -p firmware
CC rboot-stage2a.c
LD build/rboot-stage2a.elf
FW build/rboot-hex2a.h
CC rboot.c
LD build/rboot.elf
FW firmware/rboot.bin
make[1]: Leaving directory `/home/morgan/devel/ESP8266/raburton-esp8266/rboot'
Building rBoot sample project
make[1]: Entering directory `/home/morgan/devel/ESP8266/raburton-esp8266/rboot-sampleproject'
CC main.c
CC rboot-api.c
CC rboot-ota.c
CC uart.c
LD rom0.elf
FW rom0.bin
LD rom1.elf
FW rom1.bin
make[1]: Leaving directory `/home/morgan/devel/ESP8266/raburton-esp8266/rboot-sampleproject'

Nothing surprising there.

Flashing

$ esptool.py --port /dev/XBeeX write_flash -fs 4m 0x00000 rboot/firmware/rboot.bin 0x2000 rboot-sampleproject/firmware/rom0.bin 0x42000 rboot-sampleproject/firmware/rom1.bin

Connecting...
Erasing flash...
Wrote 3072 bytes at 0x00000000 in 0.3 seconds (82.9 kbit/s)...
Erasing flash...
Wrote 190464 bytes at 0x00002000 in 18.5 seconds (82.2 kbit/s)...
Erasing flash...
Wrote 190464 bytes at 0x00042000 in 18.5 seconds (82.3 kbit/s)...

Leaving...

I adjusted -fs to 4m and the second rom location to 0x42000; it flashes without incident.
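
As a cross-check on the layout, the offsets and sizes reported by esptool.py can be tested for overlap and for fitting within the flash. This is only a sketch: the helper name `layout_problems` is hypothetical, and it treats `-fs 4m` as 4 Mbit (512 KB).

```python
# Sketch: check that the flashed images neither overlap nor run past the end of flash.
FLASH_SIZE = 4 * 1024 * 1024 // 8  # "-fs 4m" is 4 Mbit, i.e. 512 KB

# (name, flash offset, size in bytes), taken from the esptool.py output above
IMAGES = [
    ("rboot.bin", 0x00000, 3072),
    ("rom0.bin", 0x02000, 190464),
    ("rom1.bin", 0x42000, 190464),
]


def layout_problems(images, flash_size):
    """Return a list of human-readable problems; an empty list means the layout fits."""
    problems = []
    ordered = sorted(images, key=lambda image: image[1])
    for index, (name, start, size) in enumerate(ordered):
        end = start + size
        # The limit is the start of the next image, or the end of flash for the last one.
        limit = ordered[index + 1][1] if index + 1 < len(ordered) else flash_size
        if end > limit:
            problems.append(f"{name} ends at 0x{end:x}, past 0x{limit:x}")
    return problems


print(layout_problems(IMAGES, FLASH_SIZE))  # []
```

This only checks the images against each other; it does not account for sectors that rBoot or the SDK reserve for their own configuration.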

Connect, perform switch (twice), perform OTA

$ miniterm.py --rts=0 --dtr=0 /dev/XBeeX 115200
--- Miniterm on /dev/XBeeX: 115200,8,N,1 ---
--- Quit: Ctrl+]  |  Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
--- forcing DTR inactive
--- forcing RTS inactive
****CUT JUNK****                                                                                                                                                                �nn�n����|~�n�l`�rBC�C��Boot Sample Project

Currently running rom 0.
type "help" and press <enter> for help...
switch
Swapping from rom 0 to rom 1.
Restarting...

 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x40100000, len 1476, room 16 
tail 4
chksum 0xfb
load 0x3ffe8000, len 672, room 4 
tail 12
chksum 0xf5
csum 0xf5

rBoot v1.2.1 - richardaburton@gmail.com
Flash Size:   4 Mbit
Flash Mode:   QIO
Flash Speed:  40 MHz

Booting rom 1.
rBC�C��Boot Sample Project

Currently running rom 1.
type "help" and press <enter> for help...
switch
Swapping from rom 1 to rom 0.
Restarting...

 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x40100000, len 1476, room 16 
tail 4
chksum 0xfb
load 0x3ffe8000, len 672, room 4 
tail 12
chksum 0xf5
csum 0xf5

rBoot v1.2.1 - richardaburton@gmail.com
Flash Size:   4 Mbit
Flash Mode:   QIO
Flash Speed:  40 MHz

Booting rom 0.
rBC�C��Boot Sample Project

Currently running rom 0.
type "help" and press <enter> for help...
ota
Updating...
Firmware updated, rebooting to rom 1...

 ets Jan  8 2013,rst cause:4, boot mode:(3,7)

wdt reset
load 0x40100000, len 1476, room 16 
tail 4
chksum 0xfb
load 0x3ffe8000, len 672, room 4 
tail 12
chksum 0xf5
csum 0xf5

rBoot v1.2.1 - richardaburton@gmail.com
Flash Size:   4 Mbit
Flash Mode:   QIO
Flash Speed:  40 MHz

Rom 1 is bad.
Booting rom 0.
rBC�C��Boot Sample Project

Currently running rom 0.
type "help" and press <enter> for help...

And that's where I end up, Rom 1 is bad.

Any insight as to what I've missed would be appreciated.

raburton commented 9 years ago

I'm looking at this on my phone, but it all looks spot on. So couple of suggestions to try: 1) try booting rom1 and doing an ota flash of rom0, does that work? 2) read back the bad rom using esptool.py and see if it matches the file. It could be that the webserver has sent an error page that hasn't been detected and that's been flashed or recently I saw an extra http header written at the start of the rom (sent by a defective web server).
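
The read-back comparison suggested here could look something like the following sketch. It assumes the rom has already been dumped with something like `esptool.py --port /dev/XBeeX read_flash 0x42000 190464 rom1_dump.bin` (the dump filename is hypothetical), and the helper names are illustrative only.

```python
# Sketch: compare a flash dump against the rom file that should have been written.
from pathlib import Path


def first_mismatch(expected: bytes, actual: bytes):
    """Return the offset of the first differing byte, or None if actual starts with expected."""
    for offset, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return offset
    if len(actual) < len(expected):
        return len(actual)  # dump is shorter than the rom file
    return None


def looks_like_http(actual: bytes) -> bool:
    """True if the dump begins with text, e.g. a stray HTTP header or an HTML error page."""
    head = actual[:16].lstrip().lower()
    return head.startswith((b"http/", b"<!doctype", b"<html"))


def check_dump(rom_path: str, dump_path: str) -> str:
    """Summarise how a dump file compares with the rom file."""
    expected = Path(rom_path).read_bytes()
    actual = Path(dump_path).read_bytes()
    if looks_like_http(actual):
        return "dump starts with HTTP/HTML text"
    offset = first_mismatch(expected, actual)
    return "match" if offset is None else f"first mismatch at 0x{offset:x}"


# Example call (paths are placeholders):
# print(check_dump("rboot-sampleproject/firmware/rom1.bin", "rom1_dump.bin"))
```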

morganrallen commented 9 years ago

Forgot to mention that I had tried switching and flashing both roms, with the same result.

raburton commented 9 years ago

Cool, see what's actually been written to the flash then. I suspect it's not what it should be, and what's actually there is likely to be the clue as to what's going on. A bit of debug to print out the number of bytes written is handy too, to see if it matches the file size.

morganrallen commented 9 years ago

Very curious results, not what I would have expected. Lots of null chunks throughout the dump, as if it's not being written properly.

https://gist.github.com/morganrallen/eb8f445963dc5ebd8cb1

morganrallen commented 9 years ago

And another OTA leads to a different dump

morganrallen commented 9 years ago

I have also now reproduced this on two different ESP8266-03 devices.

morganrallen commented 9 years ago

Built against various SDKs (v1.0.1, v1.2.0, v1.3.0, v1.4.0) with the same result.

morganrallen commented 9 years ago

Going to make a breakout to test a -12 later.

raburton commented 9 years ago

That is really very odd! Looking at your diff I notice a few interesting things. There doesn't seem to be any real pattern to the blank areas: I've found some as short as 64 bytes, which is likely to be much smaller than a packet arriving in the network recv. The breaks do not appear to align with sector boundaries. Probably the most interesting thing is that the blank areas contain either 0xff or 0x00. If they were all 0xff it would suggest that the sector had been erased (erased bytes read as 0xff) and the new data then not written successfully (although, as I say, it doesn't look like whole chunks failed at a time). The 0x00s don't fit that though; these look like the sector has been erased and then actually rewritten with 0x00! This makes me suspicious of the data being received. What web server are you using to serve the file? Hopefully a proper one like Apache or IIS (I've seen no end of people using dodgy ones that don't seem to work quite right, or in one case even a homemade one when they clearly didn't know anything about HTTP!). Can you test with the files on a different server to rule that out?
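
The kind of inspection described above (length of each blank area, whether it is 0xff or 0x00, and whether it lines up with sector boundaries) can be automated. This is a rough sketch with hypothetical helper names, assuming 4 KB erase sectors; a run will be split wherever the expected byte happens to equal the fill byte, so reported lengths are a lower bound.

```python
# Sketch: list the runs where a flash dump differs from the expected rom image.
SECTOR_SIZE = 0x1000  # assumed 4 KB flash erase sector


def mismatch_runs(expected: bytes, actual: bytes):
    """Yield (start, length, fill) for each run of differing bytes.

    fill is 0xff or 0x00 when the run in `actual` consists only of that byte,
    otherwise None.
    """
    limit = min(len(expected), len(actual))
    i = 0
    while i < limit:
        if expected[i] == actual[i]:
            i += 1
            continue
        start = i
        while i < limit and expected[i] != actual[i]:
            i += 1
        chunk = actual[start:i]
        uniform = chunk.count(chunk[0]) == len(chunk)
        fill = chunk[0] if uniform and chunk[0] in (0x00, 0xFF) else None
        yield start, i - start, fill


def sector_aligned(start: int, length: int) -> bool:
    """True if a run starts on a sector boundary and covers whole sectors."""
    return start % SECTOR_SIZE == 0 and length % SECTOR_SIZE == 0


# Example with synthetic data: three bytes read back as 0xff, two as 0x00.
expected = bytes(range(1, 11))
actual = expected[:2] + b"\xff\xff\xff" + expected[5:7] + b"\x00\x00" + expected[9:]
for start, length, fill in mismatch_runs(expected, actual):
    print(hex(start), length, fill, sector_aligned(start, length))
```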

raburton commented 9 years ago

Ok, I've tested with your rom here (took me a while to work out you'd changed the default port) and it's working fine:

 ets Jan  8 2013,rst cause:4, boot mode:(3,6)

wdt reset
load 0x40100000, len 1544, room 16 
tail 8
chksum 0xc1
load 0x3ffe8000, len 672, room 0 
tail 0
chksum 0xcf
csum 0xcf

rBoot v1.2.1 - richardaburton@gmail.com
Flash Size:   4 Mbit
Flash Mode:   QIO
Flash Speed:  40 MHz

Booting rom 0.
{$S

rBoot Sample Project

Currently running rom 0.
type "help" and press <enter> for help...
help
available commands
  help - display this message
  ip - show current ip address
  connect - connect to wifi
  restart - restart the esp8266
  switch - switch to the other rom and reboot
  ota - perform ota update, switch rom and reboot
  info - show esp8266 info

switch
Swapping from rom 0 to rom 1.
Restarting...

 ets Jan  8 2013,rst cause:4, boot mode:(3,6)

wdt reset
load 0x40100000, len 1544, room 16 
tail 8
chksum 0xc1
load 0x3ffe8000, len 672, room 0 
tail 0
chksum 0xcf
csum 0xcf

rBoot v1.2.1 - richardaburton@gmail.com
Flash Size:   4 Mbit
Flash Mode:   QIO
Flash Speed:  40 MHz

Booting rom 1.
{,

rBoot Sample Project

Currently running rom 1.
type "help" and press <enter> for help...
help
available commands
  help - display this message
  ip - show current ip address
  connect - connect to wifi
  restart - restart the esp8266
  switch - switch to the other rom and reboot
  ota - perform ota update, switch rom and reboot
  info - show esp8266 info
  derp, NOT IMPLEMENTED

connect
wifi connecting...
network retry, status: 1
ip: 192.168.4.165
ota
Updating...
Firmware updated, rebooting to rom 0...

 ets Jan  8 2013,rst cause:4, boot mode:(3,6)

wdt reset
load 0x40100000, len 1544, room 16 
tail 8
chksum 0xc1
load 0x3ffe8000, len 672, room 0 
tail 0
chksum 0xcf
csum 0xcf

rBoot v1.2.1 - richardaburton@gmail.com
Flash Size:   4 Mbit
Flash Mode:   QIO
Flash Speed:  40 MHz

Booting rom 0.
{$S

rBoot Sample Project

Currently running rom 0.
type "help" and press <enter> for help...

As you only provided rom1, I've had to use a combination of my own copy of the sample project rom0 and your rom1. As you can see, I start rom0 (mine), switch to rom1 (yours), use rom1 to OTA update rom0 (mine again), and it then reboots into rom0 just fine. So there doesn't appear to be a problem with the compiled version of your code. This leaves us with bad data from the webserver (which seems most likely) or some odd hardware problem causing the writes to fail.

spants commented 9 years ago

As an aside, I had a really weird problem (when using wifi) that turned out to be power related. Keep the VCC and GND leads short and make sure that the power supply is more than enough. A 10uF cap also helps.

raburton commented 9 years ago

@spants Actually that's a very good thought. I was thinking it was unlikely to be a hardware issue because it can be flashed over serial just fine, but low power (probably caused by wifi) while flashing sounds like a very plausible explanation for what @morganrallen is seeing.

morganrallen commented 9 years ago

More of the same. It should be noted that I can curl/wget the image and it's identical. I've now tried 3-4 different HTTP servers with basically the same result (one failed to flash at all). Adding a cap to the power rails doesn't seem to have any effect.

@raburton what device type are you testing on? My next step I guess is to try a -12 as I suggested yesterday.

raburton commented 9 years ago

ESP12, but it shouldn't matter - the esp8266 is the same and the type of flash chip could vary within the same board number depending on which random Chinese factory made it. Plenty of other people have tested with other devices and this is the first time I've seen the issue you are having. I think @spants is probably onto the solution here, these devices can use quite a bit of power when pushed. What are you using to power the board and does it supply enough current? I think most people recommend a 1A supply.

morganrallen commented 9 years ago

Yeah, that could be it, looks like the old XBee Explorer is limited to a meager 150mA (ouch). I'll put a 3.3v LDO off the 5v line and see how that goes.

morganrallen commented 9 years ago

And there we have it. Very interesting result: in the past, when I've encountered power related issues, the device has generally rebooted. It appears that if the power is just high enough, it can stay running but fail to write to flash. Thanks for the support and this awesome bootloader.

spants commented 9 years ago

Glad you sorted it. I thought that it sounded similar to the problems that I had....

raburton commented 9 years ago

Thanks @spants for spotting the problem, I'd have been scratching my head looking for a software cause for ages!

morganrallen commented 9 years ago

Yes, appreciate the reminder @spants, these parts behave very inconsistently when underpowered.

fvpalha commented 9 years ago

Hi Richard.

After my first Flash, I could not switch rom.

Rom 1 is bad.
Booting rom 0.

So, I cleared the flash memory.

I added this to Makefile:

flashinit:
    $(vecho) "Flash init data:"
    $(vecho) "Clear old settings (EEP area):"
    $(vecho) "clear_eep.bin-------->0x79000"
    $(vecho) "Default config (Clear SDK settings):"
    $(vecho) "blank.bin-------->0x7E000"
    $(vecho) "esp_init_data_default.bin-------->0x7C000"
    $(ESPTOOL) -p $(ESPPORT) -b $(ESPBAUD) write_flash $(flashimageoptions) 0x79000 $(SDK_BASE)/bin/clear_eep.bin 0x7c000 $(SDK_BASE)/bin/esp_init_data_default.bin 0x7e000 $(SDK_BASE)/bin/blank.bin

Now I can switch the rom.
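
For reference, the 0x7C000 and 0x7E000 addresses in that target match the usual placement of the SDK parameter area at the very end of a 512 KB flash. A minimal sketch, assuming the convention that esp_init_data_default.bin sits 0x4000 below the end of flash and blank.bin 0x2000 below it; the function name is hypothetical, and the 0x79000 address for clear_eep.bin is project-specific and not derived here.

```python
# Sketch: derive the SDK parameter addresses from the flash size.
# Assumption: init data at (size - 0x4000), blank system params at (size - 0x2000).
def sdk_param_addresses(flash_size_bytes: int) -> dict:
    """Return the assumed flash addresses for the SDK default-config files."""
    return {
        "esp_init_data_default.bin": flash_size_bytes - 0x4000,
        "blank.bin": flash_size_bytes - 0x2000,
    }


for name, address in sdk_param_addresses(512 * 1024).items():
    print(f"{name}-------->0x{address:05X}")
```

For a larger flash the same files would move accordingly, so a target like flashinit would need its addresses adjusted if the board changes.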