littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers

Clarify underlying flash device behavior model #354


jimmo commented 4 years ago

I'm involved with the MicroPython project, which has just added support for littlefs as the on-device filesystem. We'd like to write some integration tests, which will include simulating power-loss events. Before I go down a long rabbit hole, it would be really good to get a clarification of the model of flash device behavior that littlefs assumes, so that we can make assertions about the resulting state.

MicroPython also allows arbitrary block devices, so we'd also like to understand what guarantees they need to provide in order to behave the way littlefs expects a flash device to work (beyond the basic stuff like erase vs. write).

From my reading of the documentation in the repo, it would seem that the model is something along the lines of "any write operation will either succeed completely or at worst only modify the bytes currently being written, even if only a subset of a page is being written".

Conversely, it would also help to understand what circumstances littlefs can handle, and what is definitely out of the question. i.e. I don't expect that arbitrary block corruption should lead to a mountable filesystem (but should I expect that littlefs can always detect that corruption occurred?). Or, more subtly, what if the entire block is corrupted during a power-loss event while a small (sub-page) write was in progress?

Thanks, and sorry if I've missed something in the docs. I looked at the tests too, but the "power cycle" testing mostly seemed to focus on clean unmount + remount cycles.

geky commented 4 years ago

Hi @jimmo, excited to see littlefs adopted in MicroPython! This is interesting timing, since I am also working on revamping the testing in littlefs to more aggressively find power-loss bugs.

There are some power-loss tests in Mbed OS, which is where littlefs originally came from. I think relying only on these tests was a mistake, as they're pretty intertwined with Mbed's Greentea testing framework, which is a bit difficult to get into and requires hardware. As such those tests have kinda fallen by the wayside and only get run occasionally. https://github.com/ARMmbed/mbed-littlefs/tree/master/TESTS/filesystem_recovery

The test strategy I like, and will be porting to the littlefs core tests, is to have a reentrant test case and simulate a program reset at every prog/erase operation in the test. Though if you have any other suggestions I'm all ears.
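
As a sketch of what that simulation could look like for MicroPython's integration tests: the block device hook signatures below come from lfs.h, but the RAM backing, the `ops_until_powerloss` counter, and the longjmp "reset" are just one hypothetical way to wire it up, not how the littlefs core tests do it.

```c
// Sketch of a power-loss-injecting block device for reentrant tests.
// The hook signatures are littlefs's real API (lfs.h); the RAM backing,
// ops_until_powerloss, and the longjmp "reset" are hypothetical test plumbing.
#include <setjmp.h>
#include <stdint.h>
#include <string.h>
#include "lfs.h"

#define BLOCK_SIZE  4096
#define BLOCK_COUNT 64

static uint8_t disk[BLOCK_COUNT][BLOCK_SIZE];
static int ops_until_powerloss = -1;  // -1 means "never lose power"
static jmp_buf powerloss_jmp;         // set up by the test harness

static void maybe_powerloss(void) {
    if (ops_until_powerloss > 0 && --ops_until_powerloss == 0) {
        longjmp(powerloss_jmp, 1);  // abandon the operation mid-flight, like a reset
    }
}

static int bd_read(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, void *buffer, lfs_size_t size) {
    memcpy(buffer, &disk[block][off], size);
    return 0;
}

static int bd_prog(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, const void *buffer, lfs_size_t size) {
    maybe_powerloss();  // a fancier harness could also write a partial buffer here
    memcpy(&disk[block][off], buffer, size);
    return 0;
}

static int bd_erase(const struct lfs_config *c, lfs_block_t block) {
    maybe_powerloss();
    memset(disk[block], 0xff, BLOCK_SIZE);
    return 0;
}

static int bd_sync(const struct lfs_config *c) {
    return 0;  // nothing is buffered in this RAM-backed sketch
}
```

A harness would then run the same scenario with `ops_until_powerloss` set to 1, 2, 3, ..., `setjmp()` around the scenario, and after every simulated reset mount a fresh `lfs_t` on the same `disk` array and assert that it sees either the state from before the interrupted operation or the state from after, never something in between.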


MicroPython also allows arbitrary block devices, so we'd also like to understand what guarantees they need to provide in order to behave the way littlefs expects a flash device to work.

The key thing here is that littlefs has high expectations for the block device:

- A prog interrupted by power-loss may leave the prog_size region being programmed in an undefined state, but bytes outside that region must survive untouched.
- Likewise, an erase interrupted by power-loss may only affect the block being erased.
- Once sync returns, everything written so far must actually be on disk.

On the flip side, littlefs can work with a large range of geometries, so you should make sure the geometry you configure matches the power-loss guarantees your device can actually provide.

It's also fine for block device operations to resolve out-of-order, for example if you have a cache or a background thread for disk operations. This is where the sync callback becomes important, as this is how littlefs tells the block device when it needs all operations to be realized on disk. Note that SD cards may have a cache in hardware that should be flushed on sync.
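
For reference, the geometry and the sync hook all live in the same `struct lfs_config`. A minimal sketch, with the field names taken from lfs.h and everything else (the sizes, `flush_hw_cache()`) standing in for whatever a real device needs:

```c
// Sketch of the geometry and sync hook littlefs is configured with.
// The struct lfs_config fields are littlefs's real API; the sizes and
// flush_hw_cache() are placeholders for a particular device.
#include "lfs.h"

extern int flush_hw_cache(void);  // hypothetical: flush any write-behind/hardware cache

static int bd_sync(const struct lfs_config *c) {
    // called when littlefs needs everything written so far to be on disk
    return (flush_hw_cache() == 0) ? 0 : LFS_ERR_IO;
}

const struct lfs_config cfg = {
    // block device callbacks (.read, .prog, .erase omitted in this sketch)
    .sync = bd_sync,

    // geometry: must match what the device guarantees across power-loss
    .read_size   = 16,    // smallest readable unit
    .prog_size   = 256,   // must cover everything an interrupted prog can disturb
    .block_size  = 4096,  // smallest erasable unit
    .block_count = 256,

    // other required configuration
    .cache_size     = 256,  // multiple of read_size/prog_size, factor of block_size
    .lookahead_size = 16,
    .block_cycles   = 500,  // erase cycles before littlefs moves metadata for wear-leveling
};
```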

The approach littlefs takes to provide power-resilience is that whenever the user requests a littlefs operation, it either happens or it doesn't. If littlefs returns successfully, the state on disk reflects the new state. If a power-loss occurs, the state on disk reflects either the new state or the old state.
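
In API terms, that guarantee looks like this from the caller's side (this mirrors the boot_count example in the littlefs README, using a `cfg` like the one sketched above):

```c
// The boot_count example from the littlefs README, annotated for power-loss.
#include <stdint.h>
#include "lfs.h"

extern const struct lfs_config cfg;  // the configuration sketched above

int main(void) {
    lfs_t lfs;
    lfs_file_t file;

    // mount, formatting on first boot (or if nothing recognizable is on disk)
    int err = lfs_mount(&lfs, &cfg);
    if (err) {
        lfs_format(&lfs, &cfg);
        lfs_mount(&lfs, &cfg);
    }

    // read the current count, bump it, write it back
    uint32_t boot_count = 0;
    lfs_file_open(&lfs, &file, "boot_count", LFS_O_RDWR | LFS_O_CREAT);
    lfs_file_read(&lfs, &file, &boot_count, sizeof(boot_count));

    boot_count += 1;
    lfs_file_rewind(&lfs, &file);
    lfs_file_write(&lfs, &file, &boot_count, sizeof(boot_count));

    // nothing is committed to disk until close (or lfs_file_sync) returns
    lfs_file_close(&lfs, &file);

    lfs_unmount(&lfs);
    return 0;
}
```

If power is lost anywhere before `lfs_file_close` returns, the next mount sees the previous boot_count; if it is lost after, it sees the new one. It never sees a torn or half-written value.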


Conversely, it would also help to understand what circumstances littlefs can handle, and what is definitely out of the question.

Oh boy, let's see how much of this I can answer. Exactly what littlefs can handle gets a bit subtle.

what if the entire block is corrupted during a power-loss event while a small (sub-page) write was in progress.

littlefs assumes this can't happen. If it does, prog_size should be increased to include the full area a program operation can impact, even temporarily.

I've yet to see a storage device where this strategy doesn't work, but would be interested to know if there is one.

I don't expect that arbitrary block corruption should lead to a mountable filesystem (but should I expect that littlefs can always detect that corruption occurred?).

There are a number of different types of corruption that can occur. It can happen because of flash wear, bugs in the filesystem, or external users touching the storage. Here are some forms of corruption littlefs can/can't/should handle:

  1. Write corruption: This can happen if a flash block is bad or if there are issues with the storage bus.

    littlefs double checks all programs to disk with a sync followed by a read. So if anything is corrupted at write-time, littlefs will consider the erase block as "bad" and move the data residing on the block to a different block. If all blocks are bad, littlefs will report LFS_ERR_NOSPC.

    littlefs does not currently track bad blocks persistently, though this is a requested feature https://github.com/ARMmbed/littlefs/issues/316.

    A note on wear: Surprisingly, wear corruption doesn't only occur at write time. The physics of flash involves shoving a bunch of electrons behind a sort of insulated wall that breaks down over time. Even new parts can leak a few electrons here and there.

    Wear manifests as a decrease in data retention as blocks are erased. Keep in mind it's a slow process, but it exists: [graph: data retention falling off with erase cycles]

    Graph from Toshiba's TC58 parts. I couldn't find any documents with actual units, but most NOR parts list ~20 years of data retention and ~100,000 erase cycles, so I think those are the bounds in that chart.

    littlefs does NOT handle wear-corruption by itself. It relies on the block device reporting concerning wear as bad blocks. This can be implemented on a block device by adding ECC and reporting bad blocks when the number of bit errors exceeds a concerning amount (see the sketch after this list).

    I did look into providing this sort of ECC monitoring directly in the filesystem. But it's not a good architectural match. It's much easier and more flexible to add ECC at the block device layer, and most NAND devices provide ECC in hardware.

    It's also worth considering whether you really need protection from wear-corruption. The wear-leveling in littlefs will delay wear-corruption issues for as long as blocks still have usable erase cycles, so the device will likely be at its end-of-life by the time wear-corruption starts being a problem.

  2. External corruption: I would say this is any accidental corruption littlefs is not aware of until mount time, such as external users touching the block device, bugs in littlefs, loss of data retention, or anything else that changes the filesystem to be unrecognizable.

    The goal is to at least detect this sort of corruption, report LFS_ERR_CORRUPT, and halt. But littlefs is not there yet. It may hit an internal assert, or in the worst case go careening into erroneous behavior.

    This is also part of the motivation behind revamping the test framework: introduce some sort of fuzzing so we can build up better coverage for detecting corrupted filesystems.

  3. Malicious corruption: The concept of protecting littlefs against a "malicious user" has come up several times before (especially when talking about when littlefs should reformat a disk). But I'm of the opinion that this is out-of-scope for littlefs.

    At some point a malicious user can craft a near perfect littlefs image with tweaks that break either littlefs or some system at a higher level.

    If you need this sort of protection, the best solution is to encrypt/auth each block in your block device with some sort of CMAC or HMAC. Because littlefs doesn't rely on the intermediary state of erases, this can be done quite easily with a bit of block device addressing math and setting prog_size = the size of your cipher block.
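
To make the ECC point from item 1 concrete, here is a rough sketch of a read hook that reports a block as "bad" once the corrected bit-error count crosses a threshold. The hook signature and the LFS_ERR_* codes come from lfs.h; `nand_read_with_ecc()` and `ECC_REFRESH_THRESHOLD` are hypothetical stand-ins for whatever the real driver provides:

```c
// Sketch: surface "too many corrected bit errors" to littlefs as a bad block.
// The read hook signature and LFS_ERR_CORRUPT come from lfs.h;
// nand_read_with_ecc() and ECC_REFRESH_THRESHOLD are hypothetical driver details.
#include "lfs.h"

#define ECC_REFRESH_THRESHOLD 4  // corrected bits before we call the block concerning

// hypothetical driver call: reads and ECC-corrects data, returns how many bits
// it had to correct, or a negative value if the data was uncorrectable
extern int nand_read_with_ecc(lfs_block_t block, lfs_off_t off,
        void *buffer, lfs_size_t size);

static int bd_read(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, void *buffer, lfs_size_t size) {
    int corrected = nand_read_with_ecc(block, off, buffer, size);
    if (corrected < 0 || corrected >= ECC_REFRESH_THRESHOLD) {
        // report the block rather than silently returning decaying data;
        // during littlefs's write-time read-back this makes the block look
        // bad, so the data gets moved to a different block (as described above)
        return LFS_ERR_CORRUPT;
    }
    return 0;
}
```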


Hopefully that helps answer some questions.