littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers
BSD 3-Clause "New" or "Revised" License

Suboptimal behavior when updating file without changing size #852

Open rojer opened 1 year ago

rojer commented 1 year ago

Currently, if a file consists of multiple blocks and one of them is modified, LFS will relocate that block as well as all of the following blocks, even if they have not been modified (this easily happens when the file consists of fixed-size records). As a result, the number of writes scales with the file size rather than with the size of the updated data. This should be optimized to avoid relocating unmodified blocks.
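For reference, a minimal sketch of the access pattern being described, assuming an `lfs_t` that is already mounted and a pre-existing multi-block file; the file name and `RECORD_SIZE` are made up for illustration and most error handling is omitted:

```c
#include "lfs.h"

#define RECORD_SIZE 64

// Rewrite a single fixed-size record in place. Even though only one
// record changes, the block holding it is relocated on close/sync, and
// so is every block that follows it in the file.
int update_record(lfs_t *lfs, lfs_off_t index, const void *record) {
    lfs_file_t file;
    int err = lfs_file_open(lfs, &file, "records.bin", LFS_O_RDWR);
    if (err) {
        return err;
    }
    lfs_file_seek(lfs, &file, index * RECORD_SIZE, LFS_SEEK_SET);
    lfs_file_write(lfs, &file, record, RECORD_SIZE);
    return lfs_file_close(lfs, &file); // copy-on-write rewrite happens here
}
```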

muguou commented 1 year ago

I have the same problem: when appending data to an existing 20 kB file, even just 1 byte, LFS erases many blocks. This consumes too many sectors, and it may not even be as efficient as simply erasing the original sector to update the data.

geky commented 1 year ago

Hi @rojer, @muguou, thanks for creating an issue.

Yeah, this is a known issue, and something I'm currently working on. The CTZ skip-list is an inherently append-only data-structure, and random writes aren't performant.
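For readers unfamiliar with the structure: in the CTZ skip-list (as described in littlefs's DESIGN.md), block $n$ of a file stores $\text{ctz}(n)+1$ back-pointers to blocks $n-2^0, n-2^1, \dots, n-2^{\text{ctz}(n)}$. A tiny sketch of the consequence:

```c
#include <stdio.h>

// Back-pointer count for block n in a CTZ skip-list, per littlefs's
// DESIGN.md: block n points back to blocks n-1, n-2, n-4, ..., n-2^ctz(n).
// Block 0 is the start of the file and holds no pointers.
// (__builtin_ctz is a GCC/Clang builtin.)
static int ctz_pointers(unsigned n) {
    return n == 0 ? 0 : __builtin_ctz(n) + 1;
}

int main(void) {
    for (unsigned n = 0; n < 9; n++) {
        printf("block %u holds %d back-pointer(s)\n", n, ctz_pointers(n));
    }
    // All pointers point backwards, so appending block n+1 never touches
    // blocks 0..n. But a copy-on-write update of block k moves it to a new
    // address; block k+1 points at block k, so it must move too, and the
    // relocation cascades through every following block -- which is the
    // behavior reported above.
    return 0;
}
```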

The solution I'm working towards is to replace the CTZ skip-list with a more traditional B-tree data structure. It's a big change, so it will take some time to be ready.

The B-tree code itself is actually working at the moment, but the exact mapping to files still needs to be worked out, along with testing and other considerations.

rojer commented 1 year ago

@geky thanks for confirming, and good to know that a solution is in the works. meanwhile, is it true that the most efficient write operation on existing files would be an append?

geky commented 1 year ago

Yes, append is very efficient, up until you hit bottlenecks with the block allocator.
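A minimal append sketch for reference, assuming an already-mounted `lfs_t` (the file name is illustrative):

```c
#include "lfs.h"

// Append one record to the end of the log file.
// LFS_O_APPEND moves the file position to the end before every write.
int append_record(lfs_t *lfs, const void *record, lfs_size_t size) {
    lfs_file_t file;
    int err = lfs_file_open(lfs, &file, "records.bin",
            LFS_O_WRONLY | LFS_O_CREAT | LFS_O_APPEND);
    if (err) {
        return err;
    }
    lfs_ssize_t res = lfs_file_write(lfs, &file, record, size);
    if (res < 0) {
        lfs_file_close(lfs, &file);
        return res;
    }
    return lfs_file_close(lfs, &file);
}
```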

Actually, at an average of $O(1)$ operations per append, CTZ skip-lists in theory outperform most filesystems' file data structures in terms of pure append. There may be some other niche use case for them in other filesystems, but they aren't the right solution for littlefs.
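As a quick back-of-the-envelope check of that $O(1)$ claim, using the pointer layout described in littlefs's DESIGN.md (block $i$ stores $\text{ctz}(i)+1$ pointers):

$$\sum_{i=1}^{n}\bigl(\text{ctz}(i)+1\bigr) = 2n - \text{popcount}(n)$$

so appending $n$ blocks writes roughly $2$ pointers per block on average, independent of the file size.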

kyrreaa commented 11 months ago

I would love an option to turn off the cloning and instead make a temporary copy of the affected block(s), then on commit/sync copy those blocks back over the originals, keeping the CTZ list of the unchanged blocks intact. This does mean a power reset during that operation could leave a block in a potentially erased state, cutting off the list, though... but then one only loses one file, not the filesystem.

kyrreaa commented 11 months ago

Here's a thought: if overwriting parts of a file is done in new block(s), then when closing or syncing the file one could copy the parts that were not overwritten into the last of the new blocks, commit a "plan" to the metadata/log describing the file and the new block(s), and only then start replacing the original blocks. If interrupted, the contents of the planned overwrites in the new blocks can be compared against the existing file blocks to detect how far the operation got and to continue from there. When finished, the file is valid again, the plan in the metadata can be updated/invalidated, and the temporary new blocks can be released. The filesystem thus stays valid across an interruption, although some "work" remains before the affected file can be read, since the CTZ list would pass through blocks that are in the middle of being erased or updated. It would, however, speed up edits on large files by a lot. In fact, it may be prudent to only trigger this behavior on very large files.
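A toy in-RAM sketch of the resume-by-comparison step in that plan (nothing here is littlefs API; the buffers just stand in for flash blocks):

```c
#include <stdio.h>
#include <string.h>

// Toy model: the "plan" records replacement contents for a run of blocks,
// the originals are overwritten in order, and after an interruption the
// progress is recovered by finding the first block that does not yet
// match its replacement.
#define BLOCK_SIZE 4
#define NBLOCKS    4

static char file_blocks[NBLOCKS][BLOCK_SIZE]; // blocks reachable from the CTZ list
static char plan_blocks[NBLOCKS][BLOCK_SIZE]; // journaled replacement blocks

static void apply_plan(int start) {
    for (int i = start; i < NBLOCKS; i++) {
        memcpy(file_blocks[i], plan_blocks[i], BLOCK_SIZE);
    }
}

// After a reset, resume at the first block that differs from the plan.
static int find_resume_point(void) {
    for (int i = 0; i < NBLOCKS; i++) {
        if (memcmp(file_blocks[i], plan_blocks[i], BLOCK_SIZE) != 0) {
            return i;
        }
    }
    return NBLOCKS; // plan already fully applied
}

int main(void) {
    memset(file_blocks, 'o', sizeof(file_blocks)); // old contents
    memset(plan_blocks, 'n', sizeof(plan_blocks)); // new contents

    memcpy(file_blocks[0], plan_blocks[0], BLOCK_SIZE); // "power loss" after one block
    printf("resume at block %d\n", find_resume_point()); // prints 1
    apply_plan(find_resume_point());
    printf("resume at block %d\n", find_resume_point()); // prints 4: done
    return 0;
}
```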

geky commented 10 months ago

Turning off copy-on-write is certainly an interesting idea. Though you would be trading quite a bit (no powerloss-resilience, no wear-leveling). I'm not sure what the benefit of using littlefs over say FAT would be in that case...

I think what you're describing in your second comment is usually called "journaling" and is a completely valid solution used in many so-called journaling filesystems (ext3/4, NTFS, etc). littlefs can avoid centralized-wear problems by keeping journals per-file, so in theory it would work (though by sacrificing wear-leveling over files).

Losing wear-leveling is an issue, but there's also the issue of increased code cost for maintaining a journal. It sounds like all the logic of maintaining the journal and fixing things after powerloss would add quite a bit.

My current thinking is that relying on copy-on-write for powerloss-resilience is not the issue, just that the data structure chosen is poor. A tree does not have the dependency problem a skip-list does, so copy-on-write rewriting should only need to write to a few blocks.
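To put rough (illustrative) numbers on that: for a file of $n$ blocks with one block modified, the CTZ skip-list rewrites on the order of $n$ blocks in the worst case (the modified block plus everything after it), while a copy-on-write B-tree with branching factor $b$ rewrites only about $1 + \lceil\log_b n\rceil$ blocks (the leaf plus its path to the root). The exact constants depend on the final design.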

kyrreaa commented 10 months ago

I am using a battery supply and monitoring battery level and remaining charge with a fuel-gauge chip, which gives me some forewarning of power loss. I shut down file access and close files before any power is cut. This does not save me from power loss during product disassembly, but that is acceptable in my use case, since opening the device to disconnect the battery requires taking the system out of its production/used state, which also closes all files. For other use cases I agree that it may not be very useful.