Enhance FLASH size detection (was HTTP update fails on custom hardware)

bhamon commented 7 years ago

Hardware

Hardware: Custom (ESP8266 + AT25SF321) Core Version: 2.3.0

Description

I work on a custom hardware (ESP8266 + AT25SF321). Everything works fine excepting the OTA HTTP update.

The problem comes from the core function EspClass::getFlashChipRealSize. What is the exact purpose of this piece of code? It heavily relies on the flash chip id but, of course, our flash chip doesn't comply to the implicit rule used to determine the chip size without asking to it directly.

I corrected the error simply by replacing the method content by a call to the reliable EspClass::getFlashChipSize.

Why is there two different implementations for the same purpose?

By searching for the EspClass::getFlashChipRealSize usage, I found out that it seems to be used only by the updater.

davisonja commented 7 years ago

Looking at the two routines (getFlashChipSize and getFlashChipRealSize) the former pulls the info from the binary image (some config info is embedded in start of the image by the build process) and the latter gets the info from the flash chip itself (so should reflect actual hardware).

From the Updater.cpp file it's comparing the configured value (in the incoming update) with the real value to ensure they match.

So two implementations for two purposes and the use case for each in one tidy package :)

J,

On Mon, Feb 6, 2017 at 8:46 AM, Benjamin Hamon notifications@github.com wrote:

Hardware

Hardware: Custom (ESP8266 + AT25SF321) Core Version: 2.3.0 Description

I work on a custom hardware (ESP8266 + AT25SF321). Everything works fine excepting the OTA HTTP update.

The problem comes from the core function EspClass::getFlashChipRealSize. What is the exact purpose of this piece of code? It heavily relies on the flash chip id but, of course, our flash chip doesn't comply to the implicit rule used to determine the chip size without asking to it directly.

I corrected the error simply by replacing the method content by a call to the reliable EspClass::getFlashChipSize.

Why is there two different implementations for the same purpose?

By searching for the EspClass::getFlashChipRealSize usage, I found out that it seems to be used only by the updater.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/esp8266/Arduino/issues/2933, or mute the thread https://github.com/notifications/unsubscribe-auth/AAN_A8rt2VAMq6ml2mBWmVh7yQoFdnawks5rZicRgaJpZM4L3mdc .

bhamon commented 7 years ago

My problem comes from the getFlashChipRealSize method. The chip size is not retrieved from the flash chip itself, it is deduced from the flash chip id. In my case, the value deduced from my flash chip id is 2 bytes, which doesn't mean anything.

The chip I use is JEDEC compliant (http://www.jedec.org/sites/default/files/docs/jesd68-01.pdf), but the standard doesn't seem to help much regarding the chip id itself (see http://www.microchip.com/forums/m866050.aspx).

The rule deduced to extract the chip size from the chip id in the Arduino ESP core seems to come from a coincidence between chip ids used in some ESP breakout cards. Or is there a formal rule about the chip id construction that I'm not aware of?

Wouldn't it be more accurate to compare the received bin size with a call to getFlashChipSize which returns the info from the in-memory binary image?

davisonja commented 7 years ago

I have

uint32_t EspClass::getFlashChipRealSize(void)
{
    return (1 << ((spi_flash_get_id() >> 16) & 0xFF));
}

where spi_flash_get_id() is from the ESP SDK. From a bit of searching I see suggestion that the calculation is specific to (at least) Winbond IDs so could easily fail if different flash is in use. Presumably it's either coincidence or design that this holds true for the boards people are using this Arduino ESP on. (Or there are previous issues that've been lodged - I didn't look :)

Assuming the existing binary is correct, comparing the incoming one to it would provide some safety checking, though I don't know that such an assumption is more reliable than assuming getFlashChipRealSize yields the correct result.

It feels like you really want to adjust the calculation to be correct for your chip. Asking google about "AT25SF321" it suggests that the ID includes a 'density code' which might be similar, but you're just as likely to find a lookup table for your own hardware easier.

Further investigation online suggests we're ultimately re-covering old ground; from a 2016 on esptool there's no reliable standard; you'll likely get away with relying on the configured value (in firmware image); people like having the option.

If you've got custom hardware you probably know the liklihood of the components changing and can best determine whether to hardcode something, rely on the build-time configured value, or add a lookup for your device IDs. Bigger picture, it might be worth having the Updater-real-flash-size-check optional, or rely on the existing firmware, which would let people who needed to, check.

bhamon commented 7 years ago

Presumably it's either coincidence or design that this holds true for the boards people are using this Arduino ESP on.

It may originally come from the assertion made in the comment of the EspClass::getFlashChipSizeByChipId method which inventories some flash chips. This method doesn't seem to be used anymore though.

though I don't know that such an assumption is more reliable than assuming getFlashChipRealSize yields the correct result.

At least it won't fail on some hardware (unless the build process varies between the two binary images).

It feels like you really want to adjust the calculation to be correct for your chip.

No, as I said, we already found a workaround (even if we now have to patch this file in the middle of our build system). I only wanted to point this out as it may break on other boards (for example, that's the case of the Sparkfun Thing for which the method returns 2 bytes).

Bigger picture, it might be worth having the Updater-real-flash-size-check optional, or rely on the existing firmware, which would let people who needed to, check.

Yes, that's a valid option. But to be complete I would suggest to add a warning over the EspClass::getFlashChipRealSize method as it may break on some platform. Also, a tiny explanation about the actual asumption made in the method could be wise.

I can make a pull request for this if that can help. But what do you propose to make the check optional? An optional boolean at the end of each update method seems a bit harsh. Also, as the method is used in the Updater.h/.cpp, we must propagate the correction there too.

davisonja commented 7 years ago

Presumably it's either coincidence or design that this holds true for the boards people are using this Arduino ESP on.

It may originally come from the assertion made in the comment of the EspClass::getFlashChipSizeByChipId method which inventories some flash chips. This method doesn't seem to be used anymore though.

I've not been involved long enough to know the reasoning behind that. The autodetect is certainly more future proofed, if only it worked all the time :)

though I don't know that such an assumption is more reliable than assuming getFlashChipRealSize yields the correct result.

At least it won't fail on some hardware (unless the build process varies between the two binary images).

Which it easily could, if you're in a more hobby environment; ultimately I think this is why the check should be optional. Sitting here with our respective hardware we can't be sure what will fit the end users needs.

It feels like you really want to adjust the calculation to be correct for your chip.

No, as I said, we already found a workaround (even if we now have to patch this file in the middle of our build system). I only wanted to point this out as it may break on other boards (for example, that's the case of the Sparkfun Thing for which the method returns 2 bytes).

I was thinking of amending the code in a manner which worked for both for your chip and existing ones so the overall number of supported scenarios increased.

From the Sparkfun Thing Schematic PDF it looks like they're using a similar flash chip. From the corresponding datasheet the 'density code' is different and thus probably lends itself to a calculated value. If I follow the scheme correctly it should be possible to have an idea of what to do based on the manufacturer ID part of the chip-id. Which should lead to a routine that can hardcoded it if needed (as in getFlashChipSizeByChipId) but can use the additional information where it's valid. I'd be keen to improve the chances random hardware works, and definitely hardware that we've actually got people using (as in, might actually get it tested on)

Bigger picture, it might be worth having the Updater-real-flash-size-check optional, or rely on the existing firmware, which would let people who needed to, check.

Yes, that's a valid option. But to be complete I would suggest to add a warning over the EspClass::getFlashChipRealSize method as it may break on some platform. Also, a tiny explanation about the actual asumption made in the method could be wise.

Definitely.

I can make a pull request for this if that can help. But what do you propose to make the check optional? An optional boolean at the end of each update method seems a bit harsh. Also, as the method is used in the Updater.h/.cpp, we must propagate the correction there too.

I was thinking of a change to Updater.cpp to allow the user to turn off the flash-size check; or perhaps that is, select between

Current behaviour (compare firmware to 'real')
check existing firmware details (to see if new firmware matches)
no check (so update regardless)

Controlled by a property of the Updater class...

@igrr tends to get tagged in these things :)

devyte commented 7 years ago

@bhamon is this issue still valid with latest git? Is it still valid with the SDK 2.1.0 branch which supposedly has support for bigger flash sizes (meaning that the relevant SDK code has been upgraded).

bhamon commented 7 years ago

@devyte Yes, the issue is still present. The culprit is the getFlashChipRealSize function which still returns the flash size based on a magic formula where the size is expected to be in the flash id. The density code present in the device ID seems to be manufacturer dependent. The SDK isn't concerned here.

As discussed with @davisonja the least we should do for now is to add a warning over this method.

For a larger support, an option in the updater could be added to check the flash size against one of the two implementations.

Ultimately, we could switch on the manufacturer id. But the manufacturer mapping list could be hard to maintain.

d-a-v commented 6 years ago

@bhamon

I was thinking of a change to Updater.cpp to allow the user to turn off the flash-size check; or perhaps that is, select between

Current behaviour (compare firmware to 'real')

check existing firmware details (to see if new firmware matches)

no check (so update regardless)

Controlled by a property of the Updater class...

If this issue is still relevant, does @davisonja 's proposal above would be acceptable ?

Also, can you check your flash chip with esptool.py ? If its autodetection code can be believed as valid, what do you think of relying on it in this core ?

devyte commented 5 years ago

This won't fit into 2.5.0. Pushing milestone back.

earlephilhower commented 5 years ago

From a recent conversation on gitter:

I do remember that there are 3 sizes: what you set in the IDE, what the sketch thinks, and what the flash chip says

At one point I proposed to screw the whole magic size mess and do a flash check one time on bootup to probe the flash by attempting to either read or write a byte at the highest possible address for each size, starting from biggest (16MB) and testing down to smallest (512KB)

you try at 16MB-1, if it fails you know the flash is not 16MB. Next you try at 8MB-1. If that fails you know it's not 8. Then 4MB-1. Then 2MB-1. Then 1MB-1. Finally 512KB-1.

You do the probe once, then store the result somewhere, and use it on subsequent bootups. Until next firmware update where the stored value could get erased.

igrr commented 5 years ago

you try at 16MB-1, if it fails you know the flash is not 16MB. Next you try at 8MB-1. If that fails you know it's not 8. Then 4MB-1. Then 2MB-1. Then 1MB-1. Finally 512KB-1.

I think it doesn't fail if you try to read outside of the chip size; the address wraps around the chip size. So the detection becomes a bit more involved: you need to have some "magic" value at address (chip size - 4) or address 0.

devyte commented 4 years ago

Notes:

Read value at 16MB-1, keep it in mem
Write some different random value (not 0xFF or 0x00)
Read the new value
Compare new value read vs. what was written.
If same, write succeeded, write original value from 1. back.
If not same, write failed => not 16MB, continue step 1 with 8MB-1.
Once a flash size is determined, save the info in flash.

Note, if there is a power failure during this, it could corrupt the flash contents.

devyte commented 4 years ago

While there is an idea here to pursue, it requires some extensive work and testing. Removing milestone.

esp8266 / Arduino

Enhance FLASH size detection (was HTTP update fails on custom hardware) #2933

Hardware

Description