littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers

Will littlefs support compression in the future? #672

Open · xukaihub opened this issue 2 years ago

geky commented 2 years ago

Interesting you ask this because I was looking into this quite recently.

In the general case, I don't think so. Compression requires significant (for a microcontroller) RAM, and this is sort of a dead end for littlefs. Though if you happen to come across any constant-RAM compression algorithms, let me know.

Compression lets you trade some RAM and processing power for lower disk usage, but for littlefs we usually want the tradeoff to go the other way. Flash storage is cheap, so if you can spend more storage to reduce RAM consumption or improve performance, you can get away with a smaller/cheaper MCU.


That being said, I think it would be really interesting to look at pre-compressed littlefs images. That is, allowing you to compress a littlefs image offline, such as at compile-time, and then mount/decompress it at run-time.

I think this would be very useful for read-only images such as those used in bootloaders, and focusing on decompression would be a much easier problem.
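
To make the idea concrete, here's a minimal sketch of the run-time half, assuming the image was compressed offline one block at a time with an offset table stored alongside it. Everything here except the lfs_config read callback signature is made up for illustration (the image layout, the offset table, the decompress() routine):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include "lfs.h"

// hypothetical offline-image layout: block i lives compressed at
// image[offsets[i]], with offsets[] holding block_count+1 entries
extern const uint8_t image[];
extern const uint32_t offsets[];

// hypothetical constant-RAM decompressor (LZ4, heatshrink, ...),
// returns 0 on success
int decompress(const uint8_t *src, size_t src_len,
        uint8_t *dst, size_t dst_len);

#define BLOCK_SIZE 4096 // must match lfs_config.block_size

static uint8_t cache[BLOCK_SIZE]; // one decompressed block
static lfs_block_t cached = (lfs_block_t)-1;

static int img_read(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, void *buffer, lfs_size_t size) {
    (void)c;
    if (block != cached) {
        // decompress the whole block once, serve partial reads from cache
        if (decompress(&image[offsets[block]],
                offsets[block+1] - offsets[block],
                cache, BLOCK_SIZE)) {
            return LFS_ERR_CORRUPT;
        }
        cached = block;
    }
    memcpy(buffer, &cache[off], size);
    return 0;
}

// read-only image: littlefs only calls prog/erase on writes, so
// erroring here is enough to keep the mount read-only in practice
static int img_prog(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, const void *buffer, lfs_size_t size) {
    (void)c; (void)block; (void)off; (void)buffer; (void)size;
    return LFS_ERR_NOSPC;
}
```

RAM cost is one block of cache plus the decompressor's window, which fits the focus-on-decompression framing above.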

felipeLeast commented 2 years ago

@geky Take a look at LZ4 compression: it offers constant-RAM compression and high performance on microcontrollers. Mbed has an example in their Mbed Cloud client project:

https://github.com/PelionIoT/mbed-cloud-client/blob/master/update-client-hub/delta-tool-internal/source/lz4.c
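
For reference, the one-shot API is tiny (a minimal sketch, not taken from the linked file; the compression working state is a fixed ~16 KiB, see LZ4_sizeofState(), and decompression needs no state beyond the output buffer):

```c
#include <stdio.h>
#include "lz4.h"

int main(void) {
    const char src[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
    char dst[64];  // worst case is LZ4_compressBound(srcSize)
    char back[64];

    int clen = LZ4_compress_default(src, dst,
            (int)sizeof(src), (int)sizeof(dst));
    if (clen <= 0) {
        return 1; // dst was too small
    }

    int dlen = LZ4_decompress_safe(dst, back, clen, (int)sizeof(back));
    printf("%d -> %d -> %d bytes\n", (int)sizeof(src), clen, dlen);
    return 0;
}
```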

BenBE commented 2 years ago

Another possible implementation for compression could be along the lines of heatshrink, which was developed with low/constant memory consumption in mind, in particular for low-end MCUs. Compression ratio depends somewhat on how much RAM you allocate for it, but even with low settings its LZSS implementation works quite well.
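
To illustrate why the memory use stays bounded, here's a sketch of heatshrink's sink/poll encoder loop. The (8, 4) window/lookahead parameters are illustrative and the sketch assumes `cap` is large enough; RAM is dominated by the 2^window_sz2-byte window:

```c
#include <stdint.h>
#include <stddef.h>
#include "heatshrink_encoder.h"

// compress len bytes from in into out, returning bytes written;
// a 2^8 = 256-byte window and 2^4 lookahead keep RAM use tiny
size_t compress(const uint8_t *in, size_t len, uint8_t *out, size_t cap) {
    heatshrink_encoder *hse = heatshrink_encoder_alloc(8, 4);
    size_t sunk = 0, polled = 0;

    while (sunk < len) {
        size_t count;
        // sink takes a non-const pointer, hence the cast
        heatshrink_encoder_sink(hse, (uint8_t *)&in[sunk],
                len - sunk, &count);
        sunk += count;

        // drain as much output as is ready
        HSE_poll_res pres;
        do {
            pres = heatshrink_encoder_poll(hse, &out[polled],
                    cap - polled, &count);
            polled += count;
        } while (pres == HSER_POLL_MORE);
    }

    // flush whatever is still buffered in the window
    while (heatshrink_encoder_finish(hse) == HSER_FINISH_MORE) {
        size_t count;
        heatshrink_encoder_poll(hse, &out[polled], cap - polled, &count);
        polled += count;
    }

    heatshrink_encoder_free(hse);
    return polled;
}
```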

The question remains: which parts of LFS are best compressed? While there's the possibility of compressing whole images, as squashfs does, there's also the possibility of compressing attributes/metadata or inline file contents. With both metadata and inline file contents, things should remain accessible through the usual API.

Also, given you mentioned low RAM consumption: you should likely avoid compressing the actual directory structure, especially as this could require quite a bit of additional processing when (compressed) directories have to be scanned repeatedly for information.

geky commented 1 month ago

Circling back on this a bit, there are some upcoming changes to the file structure that I think make compression quite a bit easier to implement, so it would be interesting to revisit this in the future.

> @geky Take a look at LZ4 compression: it offers constant-RAM compression

> Another possible implementation for compression could be along the lines of heatshrink

Thanks @felipeLeast, @BenBE. I made a mistake in thinking there were no constant-RAM compression algorithms. Though my understanding is there's a performance cliff where too little RAM ends up compressing poorly. That, combined with needing to match encoder/decoder dictionaries, creates some annoying questions around how much RAM you need for portability.

I think the most complicated part will be how the compression configuration (which algo, dict size, etc) is stored. Simply storing the configuration per-file and failing to read when an algo/RAM is not available is one option.
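
As a sketch of that option layered on today's API, littlefs's custom attributes (lfs_getattr/lfs_setattr) could carry the per-file configuration. The attribute type 'c', the struct layout, and MAX_WINDOW here are all made up for illustration:

```c
#include "lfs.h"

#define MAX_WINDOW 1024u // our RAM budget for a decoder window (illustrative)

struct compress_attr {
    uint8_t algo;       // e.g. 0 = none, 1 = lz4, 2 = heatshrink
    uint8_t window_sz2; // dictionary/window size as a power of two
};

// returns 0 if we can read this file, or a negative error if it was
// compressed with an algo/window we can't handle
int check_readable(lfs_t *lfs, const char *path) {
    struct compress_attr attr;
    lfs_ssize_t res = lfs_getattr(lfs, path, 'c', &attr, sizeof(attr));
    if (res == LFS_ERR_NOATTR) {
        return 0; // no attribute means uncompressed, always readable
    } else if (res < 0) {
        return res;
    }

    // fail early if we lack the algorithm or the RAM for its window
    if (attr.algo > 2 || (1u << attr.window_sz2) > MAX_WINDOW) {
        return LFS_ERR_INVAL;
    }
    return 0;
}
```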


> Also, given you mentioned low RAM consumption: you should likely avoid compressing the actual directory structure, especially as this could require quite a bit of additional processing when (compressed) directories have to be scanned repeatedly for information.

Hmmm, I've only been thinking of compressing file data, which is a much simpler problem. Compressing metadata would be quite a bit more complicated, though not impossible... Especially since metadata is all append-only logs... 🤔

This paragraph could probably be an entire research project.

> The question remains: which parts of LFS are best compressed? While there's the possibility of compressing whole images, as squashfs does, there's also the possibility of compressing attributes/metadata or inline file contents. With both metadata and inline file contents, things should remain accessible through the usual API.

Inline data is an interesting question and will probably take some experimentation. But if you revert to uncompressed on failed compression, I don't see a reason not to try compressing inline data. We usually think of inlined data as "small", but if you have a >=1 MiB block size, your inlined data is probably not "small".
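
The fallback itself is cheap; something along these lines would bound the worst case at one byte of overhead. The compress() stand-in and the one-byte tag are made up for the sketch:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

// stand-in for whatever algorithm is chosen; returns compressed size,
// or 0 if the output didn't fit in cap
size_t compress(const uint8_t *in, size_t len, uint8_t *out, size_t cap);

// store data with a one-byte tag so reads know whether to decompress;
// returns bytes written to out, or 0 if out is too small
size_t store(const uint8_t *in, size_t len, uint8_t *out, size_t cap) {
    // try to compress into out[1..], leaving room for the tag
    size_t clen = compress(in, len, &out[1], cap - 1);
    if (clen > 0 && clen < len) {
        out[0] = 1; // compressed
        return 1 + clen;
    }

    // incompressible (or grew): store raw, never worse than len+1 bytes
    if (len + 1 > cap) {
        return 0;
    }
    out[0] = 0; // raw
    memcpy(&out[1], in, len);
    return 1 + len;
}
```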

For attributes it's a bit more questionable. My gut says the portability issues would not be worth the compression gains, especially since attributes are usually quite small, but my gut has been wrong before...


Interesting things to think about.