jkent / frogfs

Fast Read-Only General-purpose File System (ESP-IDF and ESP8266_RTOS_SDK compatible)
https://frogfs.readthedocs.io
Mozilla Public License 2.0

Add support for gzip transformation and destination file renaming #58

Closed X-Ryl669 closed 2 months ago

X-Ryl669 commented 3 months ago

This PR is a feature request implementation.

Currently, frogfs offers several compression options, but all of them imply decompressing the file when it is opened on the ESP32.

This PR adds a way to compress a file (as a transform filter, not a compress filter), rename the compressed file (for example by appending .z to its name), and store it in the partition as a plain file whose content happens to be compressed.

The use case is an HTTP server running on the ESP32: you don't want to decompress a text file only to compress it again when sending (that's what happens when you use the compress filter). With this PR, since the compressed file is stored as a plain file, you can access / mmap it (no copy, no decompression, no heap) and send it directly, provided you set the Content-Encoding: gzip header in your response. Since virtually all HTTP clients support gzip encoding, it's a perfectly fine and standard operation.

In order to test it, you'll have to add a filter like this in your configuration file:

filter:
  '*.js':
    - gzip
    - rename:
        ext: z

Typically, in your HTTP / web server code on the ESP32, when asked for file.js, you can simply search for file.js.z and, if it is found, skip compression and send that file as-is. Otherwise, you open file.js and compress/send it as usual.
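The lookup order described above can be sketched as follows. This is a minimal illustration, not code from the PR: `pick_variant`, `exists_fn` and `demo_exists` are hypothetical names, and the callback stands in for whatever FrogFS lookup the server actually uses. The caller is expected to add the gzip encoding header whenever `precompressed` comes back true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup callback: true if `path` exists in the filesystem.
 * In real firmware this would wrap the FrogFS lookup. */
typedef bool (*exists_fn)(const char *path);

/* Choose the file to serve for `request`. Writes the chosen path to `out`
 * and sets *precompressed when the ".z" variant was found. Returns false
 * if neither variant exists or `out` is too small. */
static bool pick_variant(const char *request, exists_fn exists,
                         char *out, size_t out_len, bool *precompressed)
{
    assert(request != NULL && exists != NULL && out != NULL);
    assert(precompressed != NULL);

    int n = snprintf(out, out_len, "%s.z", request);
    if (n < 0 || (size_t)n >= out_len) {
        return false;
    }
    if (exists(out)) {
        *precompressed = true;   /* send as-is, with the gzip encoding header */
        return true;
    }
    out[n - 2] = '\0';           /* strip ".z" and fall back to the plain file */
    *precompressed = false;
    return exists(out);
}

/* Stand-in for the demo: pretend only these two files are in the image. */
static bool demo_exists(const char *path)
{
    return strcmp(path, "app.js.z") == 0 || strcmp(path, "index.html") == 0;
}
```

With the stand-in above, a request for app.js resolves to app.js.z (precompressed), while index.html resolves to itself and still goes through the usual compress/send path.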

A few remarks about the rename filter:

  1. The rename filter is a meta transform. It doesn't operate on the content of the stream but on the destination name, so it can't access the stream's content.
  2. This filter supports two kinds of arguments: ext, which appends the given extension to the original destination name, and prepend, which prefixes the destination name with the argument's value. The former is useful to make a.js.z from a.js and the latter to make _a.js from a.js.
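To make the naming rule concrete, here is a small sketch of the mapping the two arguments produce. The real filter runs at build time in the frogfs tooling; this C function (`rename_dest`, a hypothetical name) only mirrors the resulting names. One assumption that the description above does not confirm: for a path with directories, prepend is applied to the last path component.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the destination name for `name`. Pass NULL for `prepend` or `ext`
 * to skip that argument. Returns 0 on success, -1 if `out` is too small.
 * Assumption: `prepend` is inserted in front of the last path component. */
static int rename_dest(const char *name, const char *prepend, const char *ext,
                       char *out, size_t out_len)
{
    assert(name != NULL && out != NULL);

    const char *slash = strrchr(name, '/');
    size_t dir_len = slash ? (size_t)(slash - name) + 1 : 0;

    int n = snprintf(out, out_len, "%.*s%s%s%s%s",
                     (int)dir_len, name,        /* directory part, if any */
                     prepend ? prepend : "",    /* prefix for the base name */
                     name + dir_len,            /* original base name */
                     ext ? "." : "",            /* ext is given without a dot */
                     ext ? ext : "");
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Calling it with ext set to "z" turns a.js into a.js.z, and calling it with prepend set to "_" turns a.js into _a.js, matching the two examples in the list.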

Pros and cons

It might not seem like much, but removing the FrogFS component's dependency on heatshrink and deflate/zlib reduces the firmware size by ~17kB (two or three times that amount with factory + 2 APP partitions). Since compressing is very unlikely to be useful on an ESP32, and decompressing on a ROM filesystem is pointless for files that can be decompressed by the final client, this kills two birds with one stone: the decompression code is gone, and the text files no longer consume costly flash space by being stored uncompressed.

However, if the file has to be read by the ESP32 itself, this is useless, and one should weigh whether the compression savings outweigh the zlib overhead.
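That trade-off is simple arithmetic, sketched below under two assumptions: the decompressor costs roughly the ~17kB quoted above per copy of the application, and the file data itself is stored only once. `net_flash_saving` is a hypothetical helper, and the sizes in the example are made up for illustration.

```c
#include <assert.h>

/* Net flash saved, in bytes, by using the compress filter for files the
 * ESP32 must read itself. A negative result means compression costs more
 * flash than it saves.
 *   raw_total         total size of those files uncompressed
 *   compressed_total  total size of the same files compressed
 *   decompressor_size code size of zlib/heatshrink linked into the app
 *   app_copies        number of app images in flash (factory + OTA slots) */
static long net_flash_saving(long raw_total, long compressed_total,
                             long decompressor_size, int app_copies)
{
    assert(app_copies >= 1);
    return (raw_total - compressed_total) - decompressor_size * app_copies;
}
```

For example, 100000 bytes of files that compress to 40000 bytes save 60000 bytes, but with a 17000-byte decompressor in three app images the net gain is only 9000 bytes. With 50000 bytes compressing to 20000, the result is negative and the compress filter is a net loss.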

jkent commented 3 months ago

Hi X-Ryl669,

I like the idea of the meta transform filter! However, I'm not sure if you noticed, but it is possible to identify compressed files and open them in raw form (without decompressing them) with the current implementation. This was actually part of the design.

An example of how you would do this with the raw API:

#include <assert.h>

#include "frogfs/frogfs.h"

frogfs_entry_t *entry;
frogfs_stat_t st;
frogfs_fh_t *fh;

entry = frogfs_get_entry(fs, "path/to/file");
assert(entry != NULL);
frogfs_stat(fs, entry, &st);
assert(st.compression == FROGFS_COMP_ALGO_DEFLATE);
fh = frogfs_open(fs, entry, FROGFS_OPEN_RAW);

/* fh is ready to be used in raw mode */

And here is an example of how you would do this with the VFS shim:

#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>

#include "frogfs/vfs.h"

int fd;
struct stat st;

fd = open("/frogfs/path/to/file", O_RDONLY);
assert(fd >= 0);
fstat(fd, &st);
assert(st.spare4[0] == FROGFS_MAGIC);
assert(st.spare4[1] == FROGFS_COMP_ALGO_DEFLATE);
fcntl(fd, F_REOPEN_RAW);

/* fd is rewound in raw mode and ready for reading */

Does this meet your needs?

X-Ryl669 commented 3 months ago

Well, not exactly.

Currently, to limit heap usage, I'm using the access (memory mapping) function. For me, it's the most important function of the FS. Using open has numerous drawbacks, IMHO: you need to write a loop that reads into a buffer, sends the buffer, checks the number of bytes sent, refills the buffer and so on (and this is done at multiple layers: fs, then lwip packets, then wifi packets). With access, you don't need to do that; in the end, the code is smaller, faster and simpler.
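The difference between the two styles can be sketched like this. Nothing here is FrogFS or lwIP API: `read_fn` and `send_fn` are hypothetical callbacks, the send callback is assumed to deliver the whole buffer before returning, and the in-memory source and sink exist only so the sketch runs on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callbacks, not FrogFS or lwIP APIs. */
typedef long (*read_fn)(void *buf, size_t len, void *ctx);      /* bytes read, 0 at EOF, <0 on error */
typedef int (*send_fn)(const void *buf, size_t len, void *ctx); /* 0 once all bytes are sent */

/* open()/read() style: every byte is copied through a RAM buffer. */
static int send_streamed(read_fn rd, void *rctx, send_fn tx, void *tctx)
{
    unsigned char chunk[16];
    long n;
    while ((n = rd(chunk, sizeof chunk, rctx)) > 0) {
        if (tx(chunk, (size_t)n, tctx) != 0) {
            return -1;
        }
    }
    return n < 0 ? -1 : 0;
}

/* access() style: the data is already addressable, so one call is enough. */
static int send_mapped(const void *data, size_t len, send_fn tx, void *tctx)
{
    assert(data != NULL);
    return tx(data, len, tctx);
}

/* In-memory stand-ins for the file and the network. */
struct mem_src { const unsigned char *data; size_t len, pos; };
struct mem_sink { unsigned char data[128]; size_t len; };

static long mem_read(void *buf, size_t len, void *ctx)
{
    struct mem_src *s = ctx;
    size_t left = s->len - s->pos;
    size_t n = left < len ? left : len;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (long)n;
}

static int mem_send(const void *buf, size_t len, void *ctx)
{
    struct mem_sink *k = ctx;
    if (len > sizeof k->data - k->len) {
        return -1;
    }
    memcpy(k->data + k->len, buf, len);
    k->len += len;
    return 0;
}
```

Both paths deliver the same bytes; the mapped one needs no intermediate buffer and no read loop, which is where the heap and code-size savings come from.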

We could also remove the check for RAW format in the access function (and let it accept COMPRESSED format too). But I'm not sure it's the right way.

Also, not using the compress filter allows building frogfs without zlib (for deflate) or heatshrink. This saves 51kB of flash space, since I don't need to compress anything on the ESP32. I think it would also have been possible to use the miniz implementation in ROM for decompressing (but this would have meant modifying the frogfs source code more deeply to work with the limited miniz implementation in the ESP32's ROM). It might not seem like much, but the smaller the binary, the faster it boots, and this is really important for my projects, which run from a supercapacitor.

As I understand it, transform filters are compile-time operations (so they are free from the ESP32's point of view) while compress filters are run-time operations (binary size cost + runtime cost).

jkent commented 2 months ago

Sorry I hadn't responded to this until now.

I see where you are coming from. I didn't define compression as strictly a runtime requirement; the original libespfs implementation actually allowed raw access to compressed data as well, but it relied strictly on matching filenames based on their extension (not too far off from what you're doing!).

You don't need zlib (or gzip) as a dependency to use a compression filter since that is all handled with mkfrogfs.py, but I see the vfs' open breaking without zlib being linked in since it opens the file right away. I overlooked this. What I don't like is that multiple filename lookups are performed -- the frogfs_entry_t is not reused.

I'm going to accept your solution, because access's behavior should be well-defined, and I like your way of thinking about transform vs compression filters. Thank you for your contribution!

X-Ryl669 commented 2 months ago

Thanks!