littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers
BSD 3-Clause "New" or "Revised" License

Metadata can't be compacted below block_size/2 ? #940

Open kyrreaa opened 5 months ago

kyrreaa commented 5 months ago

Is there a reason why metadata can't be compacted below block_size/2? Shouldn't it depend on the erase size of the filesystem's underlying storage?

The reason I ask is that I've run into an interesting issue when using the new (lfs 2.9) lfs_fs_gc() call with compact_thresh set as low as allowed (block_size/2): sometimes the metadata is just not "big enough" to compact, but during the next use cycle I cross the compaction threshold anyway (or the block fills up). It all depends on the number of files created in each cleanup cycle.

This may also be exacerbated by the way I create files: I open+create the file, then close it and re-open it before starting to write. I do this to make sure the created file exists in case the system is killed before the file is closed later. That lets the lost-data situation be detected, and gives me something I can use to find the chain of data blocks belonging to the file later and recover it, by overriding the file metadata live and reading the data off the device.

Timing the creation test and graphing the progress shows that compaction sometimes happens at very unfortunate times, and I was hoping to avoid this altogether with lfs_fs_gc() calls after bulk deletion of files that have been transferred.

[graphs: file-creation timing with and without lfs_fs_gc()]

As you can see, using lfs_fs_gc() eliminates some of the compaction events but not all. It all comes down to the number of files in a cycle and in the previous cycles.
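
For context, a minimal sketch of the cleanup cycle described above (not taken verbatim from my setup; the config values, storage callbacks user_read/user_prog/user_erase/user_sync, and paths are placeholders):

```c
// Sketch of the compact_thresh + lfs_fs_gc() usage described above.
#include "lfs.h"

extern int user_read(const struct lfs_config *c, lfs_block_t block,
                     lfs_off_t off, void *buffer, lfs_size_t size);
extern int user_prog(const struct lfs_config *c, lfs_block_t block,
                     lfs_off_t off, const void *buffer, lfs_size_t size);
extern int user_erase(const struct lfs_config *c, lfs_block_t block);
extern int user_sync(const struct lfs_config *c);

// passed to lfs_mount() at startup (not shown)
static const struct lfs_config cfg = {
    .read  = user_read,
    .prog  = user_prog,
    .erase = user_erase,
    .sync  = user_sync,

    .read_size      = 16,
    .prog_size      = 16,
    .block_size     = 4096,
    .block_count    = 1024,
    .block_cycles   = 500,
    .cache_size     = 256,
    .lookahead_size = 32,

    // lfs 2.9: lfs_fs_gc compacts any mdir that has grown to at least
    // this size; block_size/2 is the lowest value currently allowed
    .compact_thresh = 4096/2,
};

// after a batch of files has been transferred off the device, delete them
// and try to pay the compaction cost now rather than mid-recording
int cleanup_cycle(lfs_t *lfs, const char *const *paths, size_t npaths) {
    for (size_t i = 0; i < npaths; i++) {
        int err = lfs_remove(lfs, paths[i]);
        if (err) {
            return err;
        }
    }

    // opportunistically compact metadata while the system is idle
    return lfs_fs_gc(lfs);
}
```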

geky commented 4 months ago

Hmmm, this gets tricky.

The thing we really don't want to do is compact a metadata block that doesn't benefit from compaction (one that contains only a single commit).

The block_size/2 limit is the threshold where we split a metadata block, so after compaction all metadata blocks should fit in block_size/2. Using this also as the lower-bound for lfs_fs_gc means that we are always guaranteed to make progress if we attempt a compaction.
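
To illustrate the relationship (a rough sketch with a hypothetical helper, not the actual littlefs code):

```c
// Rough illustration of the invariant above; worth_compacting is a
// hypothetical helper, not littlefs source. A freshly compacted mdir
// always fits in block_size/2, so a threshold below that could select
// mdirs that a compaction cannot actually shrink.
#include "lfs.h"
#include <stdbool.h>

static bool worth_compacting(lfs_size_t mdir_used,
                             lfs_size_t block_size,
                             lfs_size_t compact_thresh) {
    // clamp the user threshold to the split limit so that any attempted
    // compaction is guaranteed to make progress
    lfs_size_t thresh = lfs_max(compact_thresh, block_size/2);
    return mdir_used >= thresh;
}
```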

Some options:

  1. Make the block_size/2 limit configurable (split_thresh?).

    This would let you tweak the worst-case compaction size. Though a lower split_thresh would result in more mdirs and lower mdir utilization.

  2. Explicitly count the number of commits (we currently don't track this), and allow compact_thresh < block_size/2 iff commits > 1.

    This would allow opportunistic compactions (roughly sketched after this list), but I'm not sure opportunistic compactions are that common when the metadata/block_size ratio isn't so high.
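
A rough sketch of what option 2 could look like (the commit count is not something littlefs tracks today, and these names are made up for illustration):

```c
// Hypothetical sketch of option 2, not an existing littlefs field or API.
#include "lfs.h"
#include <stdbool.h>
#include <stdint.h>

struct mdir_stats {
    lfs_size_t used;        // bytes currently committed in the mdir log
    uint32_t commit_count;  // commits appended since the last compaction
};

static bool gc_should_compact(const struct mdir_stats *m,
                              lfs_size_t block_size,
                              lfs_size_t compact_thresh) {
    if (compact_thresh >= block_size/2) {
        // current behavior: compaction is guaranteed to make progress
        return m->used >= compact_thresh;
    }
    // opportunistic mode: honor a lower threshold only when there is more
    // than one commit, so compaction actually has something to merge
    return m->commit_count > 1 && m->used >= compact_thresh;
}
```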

As a workaround, you could also try increasing block_size. The larger the block_size, the more space there is for the log to append to. Though I realize this may make other performance issues significantly worse...

This may also be exacerbated by the way I create files as I open+create the file, then close it again and re-open before starting to write.

You can call lfs_file_sync in this case. It won't save any progs/erases, but it will avoid an extra path lookup.
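
A minimal sketch of that approach, assuming a mounted lfs_t (create_durable is a placeholder name and error handling is kept minimal):

```c
// Create the file and sync it once so it survives power loss, instead of
// close + re-open.
#include "lfs.h"

int create_durable(lfs_t *lfs, lfs_file_t *file, const char *path) {
    int err = lfs_file_open(lfs, file, path, LFS_O_WRONLY | LFS_O_CREAT);
    if (err) {
        return err;
    }

    // commit the file's existence to metadata without closing it; same
    // progs/erases as close + re-open, but no second path lookup
    err = lfs_file_sync(lfs, file);
    if (err) {
        lfs_file_close(lfs, file);
        return err;
    }

    // the handle stays open; continue writing data as usual
    return 0;
}
```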

kyrreaa commented 4 months ago

I've removed the open/close/open pattern; it turns out someone before me on the project added it to work around a bug they had elsewhere... (Did you know that some OSes let you free a memory chunk into a memory slab it was never allocated from? This leads to hilarious results, especially if a different thread later receives that chunk from its own allocation out of that slab!) Now I just sync() after writing the header, so some ID ends up in the file (inline) in case I need to search for it in the raw blocks later.
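
A sketch of that header-then-sync pattern (the header layout, magic value, and names here are made up for illustration):

```c
// Write a small ID header and sync it so it lands in the file's metadata
// and can be found during raw-block recovery later.
#include "lfs.h"
#include <stdint.h>

struct file_header {
    uint32_t magic;  // marker to scan for during raw-block recovery
    uint32_t id;     // unique id for this recording
};

int start_recording(lfs_t *lfs, lfs_file_t *file,
                    const char *path, uint32_t id) {
    int err = lfs_file_open(lfs, file, path, LFS_O_WRONLY | LFS_O_CREAT);
    if (err) {
        return err;
    }

    struct file_header hdr = { .magic = 0x4C464944, .id = id };
    lfs_ssize_t written = lfs_file_write(lfs, file, &hdr, sizeof(hdr));
    if (written < 0) {
        lfs_file_close(lfs, file);
        return (int)written;
    }

    // commit the header now; a file this small is typically stored inline
    // in the metadata, so the id is recoverable even if power is lost
    // before the file is closed
    err = lfs_file_sync(lfs, file);
    if (err) {
        lfs_file_close(lfs, file);
        return err;
    }

    return 0; // keep writing data to *file as usual
}
```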