littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers
BSD 3-Clause "New" or "Revised" License

corrupted dir pair error after many power losses when underlying mass memory is a SD card #936

Open alex31 opened 5 months ago

alex31 commented 5 months ago

I use littlefs over an SD card, on an STM32 MCU with an RTOS (ChibiOS). It works well when the board is powered off while the FS is idle, but after repeatedly powering off while writing to the FS, mount fails on the MCU, and when the SD card is plugged into a Linux PC, the FUSE lfs mount also fails with this error:

mount.lfs /dev/sde /tmp/lfs
littlefs/lfs.c:1346:error: Corrupted dir pair at {0x10d68ac, 0x10d68ad}
lfs_fuse.c:673:error: Invalid or incomplete multibyte or wide character

I must confess that I have built a test bench that powers the board on and off at random times, and it takes some time (between 100 and 200 power-offs) before the FS becomes unmountable, but it inevitably ends up corrupted.

Is it a known problem with littlefs, or are the internal moves done by the SD card firmware responsible for the corruption?

geky commented 5 months ago

Hi @alex31, thanks for creating an issue. This shouldn't happen. As far as I'm aware there are no known power-loss issues, but of course there's always the possibility of an unknown bug.

> Is it a known problem with littlefs, or are the internal moves done by the SD card firmware responsible for the corruption?

I would be a bit suspicious of the SD card. SD cards are quite a bit more complex than raw NOR/NAND flash, so they probably take a bit more encouragement to actually persist writes to disk. littlefs's bd sync function needs to send whatever is equivalent to sync to the SD card and wait for it to complete, or else the filesystem state can become corrupted.

I would also not put it past cheaper SD cards to ignore sync commands/ordering, given 99% of them end up formatted with FAT anyways. It may be worth trying SD cards from different manufacturers to see if the problem is consistent.
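
To illustrate the point about bd sync, here is a minimal sketch of why a write that was never synced can vanish on power loss. It is not littlefs code and not a real SD driver: `sim_card_t` and the `sim_*` functions are hypothetical names for a toy model of a card with a volatile write cache. In a real port, the flush would live in the `sync` callback of `struct lfs_config`, which has the shape `int (*sync)(const struct lfs_config *c)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_BLOCK_SIZE  16
#define SIM_BLOCK_COUNT 4

/* Hypothetical model of a card: writes land in a volatile cache first
 * and only reach persistent storage when a sync is issued. */
typedef struct {
    uint8_t flash[SIM_BLOCK_COUNT][SIM_BLOCK_SIZE]; /* survives power loss */
    uint8_t cache[SIM_BLOCK_COUNT][SIM_BLOCK_SIZE]; /* lost on power loss  */
    bool    dirty[SIM_BLOCK_COUNT];                 /* cached, not flushed */
} sim_card_t;

/* Start with every block in the erased state (0xFF) and nothing pending. */
void sim_init(sim_card_t *card) {
    assert(card != NULL);
    memset(card->flash, 0xFF, sizeof card->flash);
    memset(card->cache, 0xFF, sizeof card->cache);
    memset(card->dirty, 0, sizeof card->dirty);
}

/* Program data into a block. The data only reaches the volatile cache. */
int sim_prog(sim_card_t *card, uint32_t block, const void *buf, size_t size) {
    assert(card != NULL && buf != NULL);
    if (block >= SIM_BLOCK_COUNT || size > SIM_BLOCK_SIZE) {
        return -1;
    }
    memcpy(card->cache[block], buf, size);
    card->dirty[block] = true;
    return 0;
}

/* Read a block as the host sees it (the cache mirrors flash when clean). */
int sim_read(const sim_card_t *card, uint32_t block, void *buf, size_t size) {
    assert(card != NULL && buf != NULL);
    if (block >= SIM_BLOCK_COUNT || size > SIM_BLOCK_SIZE) {
        return -1;
    }
    memcpy(buf, card->cache[block], size);
    return 0;
}

/* Flush every pending write to persistent storage. A real driver would
 * also wait here until the card reports that it is no longer busy. */
int sim_sync(sim_card_t *card) {
    assert(card != NULL);
    for (uint32_t b = 0; b < SIM_BLOCK_COUNT; b++) {
        if (card->dirty[b]) {
            memcpy(card->flash[b], card->cache[b], SIM_BLOCK_SIZE);
            card->dirty[b] = false;
        }
    }
    return 0;
}

/* Simulate a power cut: anything not yet flushed is discarded. */
void sim_power_loss(sim_card_t *card) {
    assert(card != NULL);
    for (uint32_t b = 0; b < SIM_BLOCK_COUNT; b++) {
        if (card->dirty[b]) {
            memcpy(card->cache[b], card->flash[b], SIM_BLOCK_SIZE);
            card->dirty[b] = false;
        }
    }
}
```

In this model, calling `sim_prog` and then `sim_power_loss` leaves the block erased, while calling `sim_prog`, `sim_sync`, and then `sim_power_loss` keeps the data. A card that acknowledges sync without actually flushing would behave like the first case even though the driver did everything right.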

alex31 commented 5 months ago

Thanks for the answer. I am refining my test bench to gather more statistics with different brands of SD card. I also ordered an "integral security sd card", from a brand that has a line of micro SD cards whose firmware is written to be fail-safe on power loss, at the expense of performance. I have implemented __lfs_sd_sync to send the sync CMD to the SD card, so the problem should not be on that side, but rather in the internal SD card firmware, which is not guaranteed to be fail-safe, IMHO.

geky commented 4 months ago

Actually I think this may be littlefs's fault, see https://github.com/littlefs-project/littlefs/pull/948. I think this may be the source of the failures you're seeing.

Sorry for posting an incorrect answer. I didn't realize there was an ordering issue in littlefs w.r.t. bd sync.

I guess this flew under the radar because most users are on NOR/NAND/devices with simple write orders. It's a bit surprising this wasn't causing more bug reports.