apache / nuttx

Apache NuttX is a mature, real-time embedded operating system (RTOS)
https://nuttx.apache.org/
Apache License 2.0

SAMv7 Progmem driver byte write leads to ECC errors and flash data corruption #9446

Open pkarashchenko opened 1 year ago

pkarashchenko commented 1 year ago

The SAMv7 progmem (internal flash) driver does not perform byte writes correctly. The driver assumes that it is possible to read the page content, change the value of erased (0xFF) bytes, and write the page back to flash. That is usually the case for most standard flashes, but the SAMv7 internal flash has built-in ECC support, so applying this procedure destroys the flash content and raises the "Multiple ECC Error" bit in the EEFC_FSR register. Partial page programming of the internal flash is of course possible, but with some restrictions specified in the datasheet.

[Image: Screenshot 2023-05-31 at 21 58 42]

So only the first partial programming after an erase works fine. I found this and debugged it for a while when trying to place MTD config data on top of an internal flash page. I am just logging it here as I do not plan to do anything about it, but this information may save investigation time for someone else.
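The failure mode described in this comment can be illustrated with a small host-side toy model. This is only a sketch under stated assumptions: the 8-byte word size, the inverted-XOR check code, and every `toy_*` name are hypothetical and are not taken from the SAMv7 datasheet or the NuttX driver. The only property it relies on is that programming can clear bits (1 to 0) in both the data cells and the stored check bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_WORD_BYTES 8 /* Assumed ECC word size, for illustration only */

struct toy_ecc_word
{
  uint8_t data[TOY_WORD_BYTES]; /* Data cells */
  uint8_t ecc;                  /* Check bits stored next to the data */
};

/* Stand-in check code: inverted XOR of the data bytes, chosen so that an
 * erased word (all 0xff) is self-consistent.  NOT the real ECC algorithm.
 */

static uint8_t toy_ecc(const uint8_t *data)
{
  uint8_t x = 0;
  int i;

  for (i = 0; i < TOY_WORD_BYTES; i++)
    {
      x ^= data[i];
    }

  return (uint8_t)~x;
}

static void toy_erase(struct toy_ecc_word *w)
{
  memset(w->data, 0xff, TOY_WORD_BYTES);
  w->ecc = 0xff;
}

/* Programming can only clear bits, in data and check cells alike */

static void toy_program(struct toy_ecc_word *w, const uint8_t *buf)
{
  uint8_t ecc = toy_ecc(buf);
  int i;

  for (i = 0; i < TOY_WORD_BYTES; i++)
    {
      w->data[i] &= buf[i];
    }

  w->ecc &= ecc;
}

/* A read reports whether the stored check bits still match the data */

static bool toy_read_ok(const struct toy_ecc_word *w)
{
  return toy_ecc(w->data) == w->ecc;
}

/* Byte write as the driver is described: read, patch one erased byte,
 * write the whole buffer back.
 */

static void toy_byte_write(struct toy_ecc_word *w, int offset, uint8_t value)
{
  uint8_t buf[TOY_WORD_BYTES];

  assert(offset >= 0 && offset < TOY_WORD_BYTES);
  memcpy(buf, w->data, TOY_WORD_BYTES);
  buf[offset] = value;
  toy_program(w, buf);
}
```

In this model, calling `toy_erase()` followed by one `toy_byte_write()` leaves `toy_read_ok()` true, while a second `toy_byte_write()` to another erased byte of the same word makes it false: the new check bits are ANDed onto the old ones and no longer match the data. The write itself reports nothing; the mismatch only becomes visible on the next read.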

acassis commented 1 year ago

@pkarashchenko is this related to the similar flash ECC error you were fighting some time ago?

pkarashchenko commented 1 year ago

Yes, but at that time I didn't know they were ECC errors. It showed up as a sporadic bit flip that I initially thought was related to the flash wait cycles configuration. I tried to fix it in https://github.com/apache/nuttx/pull/6295, but finally found out that these are ECC errors. The current driver can't even detect them, since the flash write operation completes without issues and the ECC error bits are set only after reading data from flash :) Also, the ECC error bits are not checked anywhere. Anyway, the flash wait cycles configuration is a bit messed up in the current code base, so I think I will reopen https://github.com/apache/nuttx/pull/6295, but remove any "bit flip" references from it.

acassis commented 1 year ago

Thank you for the explanation! Maybe it could be useful for @michallenc, @adamfeuer and @TimJTi since they are using SAMv7 chips.

TimJTi commented 1 year ago

No internal flash on SAMA5D2, but thanks for the heads-up!

PetervdPerk-NXP commented 1 year ago

The S32K3XX also has ECC flash, and doing back-to-back writes would invalidate the ECC value. On the S32K3XX, however, we can disable ECC for the special data region, so we can do back-to-back writes on that specific region and implement a filesystem like LittleFS. https://github.com/apache/nuttx/blob/23d42632079cc8e5671b7cdceba874091172bebc/arch/arm/src/s32k3xx/s32k3xx_progmem.c#L430-L432

pkarashchenko commented 1 year ago

I think turning off the ECC does not help here, as it seems the flash has a "NAND nature": bits get corrupted on write, so the data read back does not match the data just written. Only a full sector read-erase-write sequence works reliably, but the progmem interface assumes a "NOR nature" of the underlying flash.
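The read-erase-write sequence mentioned above can be sketched against a simulated sector. This is a host-side illustration only: the 64-byte sector size, the RAM-backed `g_toy_sector`, and all `toy_sector_*` names are assumptions made for the example and do not correspond to the NuttX progmem API or to SAMv7 registers. The model simply refuses a second program operation without an intervening erase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTOR_SIZE 64 /* Assumed sector size, for illustration only */

static uint8_t  g_toy_sector[TOY_SECTOR_SIZE]; /* Simulated flash sector */
static unsigned g_toy_programs_since_erase;    /* Program ops since erase */

static void toy_sector_erase(void)
{
  memset(g_toy_sector, 0xff, TOY_SECTOR_SIZE);
  g_toy_programs_since_erase = 0;
}

/* Program the whole sector exactly once after an erase.  Returns 0 on
 * success, -1 if the sector was already programmed since the last erase.
 */

static int toy_sector_program(const uint8_t *image)
{
  if (g_toy_programs_since_erase != 0)
    {
      return -1;
    }

  memcpy(g_toy_sector, image, TOY_SECTOR_SIZE);
  g_toy_programs_since_erase++;
  return 0;
}

/* Update part of a sector with a full read-erase-write cycle */

static int toy_sector_update(size_t offset, const uint8_t *src, size_t len)
{
  uint8_t shadow[TOY_SECTOR_SIZE];

  if (offset > TOY_SECTOR_SIZE || len > TOY_SECTOR_SIZE - offset)
    {
      return -1; /* Range does not fit in the sector */
    }

  memcpy(shadow, g_toy_sector, TOY_SECTOR_SIZE); /* 1. Read into RAM */
  memcpy(shadow + offset, src, len);             /* 2. Modify in RAM */
  toy_sector_erase();                            /* 3. Erase sector */
  return toy_sector_program(shadow);             /* 4. Write once */
}
```

Each call to `toy_sector_update()` costs one erase cycle for the whole sector, even when a single byte changes, which is the price of never programming the same cells twice between erases.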

mu578 commented 1 year ago

Yes, there are no execute-in-place capabilities. You need two kinds of management strategies, one for NAND (volatile) and one for the soon-to-be-gone NOR flash memories; trying to generalize both in the same unit doesn't work well.