ariel-miculas closed 6 months ago
It would be worth implementing zstd seekable compression. That way we wouldn't have to decompress the entire blob to serve one file from it; we could decompress only the blocks needed for that file.
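To make the idea concrete, here is a minimal sketch of frame-based seekable compression. It is not puzzlefs code and it is not the zstd seekable format: it uses Python's `zlib` as a stand-in for zstd, and `FRAME_SIZE`, `compress_seekable` and `read_range` are hypothetical names picked for illustration. The only point is that, with independently compressed frames and a table of their offsets, a read decompresses just the frames that overlap the requested range.

```python
import zlib

FRAME_SIZE = 4096  # assumed: bytes of uncompressed data per frame


def compress_seekable(data: bytes, frame_size: int = FRAME_SIZE):
    """Compress each frame independently and return (blob, seek_table).

    seek_table[i] is the (offset, length) of compressed frame i inside blob.
    """
    blob = bytearray()
    seek_table = []
    for start in range(0, len(data), frame_size):
        frame = zlib.compress(data[start:start + frame_size])
        seek_table.append((len(blob), len(frame)))
        blob += frame
    return bytes(blob), seek_table


def read_range(blob, seek_table, offset, length, frame_size=FRAME_SIZE):
    """Return (data[offset:offset+length], number of frames decompressed)."""
    if length <= 0:
        return b"", 0
    first = offset // frame_size
    last = min((offset + length - 1) // frame_size, len(seek_table) - 1)
    touched = range(first, last + 1)  # only frames overlapping the request
    out = bytearray()
    for i in touched:
        pos, size = seek_table[i]
        out += zlib.decompress(blob[pos:pos + size])
    skip = offset - first * frame_size  # bytes to drop from the first frame
    return bytes(out[skip:skip + length]), len(touched)


data = bytes(range(256)) * 400  # 102400 bytes of sample data
blob, table = compress_seekable(data)
chunk, frames_touched = read_range(blob, table, offset=10_000, length=5_000)
```

In this sample the 5000-byte read overlaps two of the 25 frames, so only those two are decompressed instead of the whole blob.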
Results with seekable zstd:
$ time fd -tf . /tmp/squash -x cat > /dev/null
fd -tf . /tmp/squash -x cat > /dev/null 8.04s user 2.72s system 449% cpu 2.393 total
$ time fd -tf . /tmp/erofs -x cat > /dev/null
fd -tf . /tmp/erofs -x cat > /dev/null 8.16s user 2.62s system 465% cpu 2.316 total
$ time fd -tf . /tmp/puzzle-uncompressed -x cat > /dev/null
fd -tf . /tmp/puzzle-uncompressed -x cat > /dev/null 7.88s user 2.43s system 398% cpu 2.590 total
$ time fd -tf . /tmp/puzzle-compressed -x cat > /dev/null
fd -tf . /tmp/puzzle-compressed -x cat > /dev/null 7.77s user 2.37s system 222% cpu 4.560 total
Comparison between squashfs, erofs, uncompressed puzzlefs, compressed puzzlefs and compressed puzzlefs with zstd seekable support with different compression frame sizes
I'm using an image called barehost, which is an Ubuntu distribution:
$ du -sh ~/work/cisco/test-puzzlefs/real_rootfs/barehost/rootfs
658M /home/amiculas/work/cisco/test-puzzlefs/real_rootfs/barehost/rootfs
# squashfs
mksquashfs real_rootfs/barehost/rootfs barehost.sqhs
# erofs
~/work/erofs-utils/mkfs/mkfs.erofs ~/work/cisco/test-puzzlefs/barehost.erofs ~/work/cisco/test-puzzlefs/real_rootfs/barehost/rootfs
# uncompressed puzzlefs
target/release/puzzlefs build ../test-puzzlefs/real_rootfs/barehost/rootfs/ /tmp/puzzlefs-image-uncompressed barehost
# unseekable compressed puzzlefs
./master-puzzlefs build -c ../test-puzzlefs/real_rootfs/barehost/rootfs /tmp/puzzlefs-unseekable-image barehost
# seekable compressed puzzlefs
target/release/puzzlefs build -c ../test-puzzlefs/real_rootfs/barehost/rootfs /tmp/puzzlefs-seekable-image barehost
# squashfs
squashfuse_ll ~/work/cisco/test-puzzlefs/barehost.sqhs /tmp/squash
# erofs
~/work/erofs-utils/fuse/erofsfuse ~/work/cisco/test-puzzlefs/barehost.erofs /tmp/erofs
# uncompressed puzzlefs
target/release/puzzlefs mount /tmp/puzzlefs-image-uncompressed barehost /tmp/puzzle-uncompressed
# unseekable compressed puzzlefs
./master-puzzlefs mount /tmp/puzzlefs-unseekable-image barehost /tmp/puzzle-unseekable
# seekable compressed puzzlefs
target/release/puzzlefs mount /tmp/puzzlefs-seekable-image barehost /tmp/puzzle-seekable
erofsfuse on /tmp/erofs type fuse.erofsfuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
squashfuse_ll on /tmp/squash type fuse.squashfuse_ll (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
/dev/fuse on /tmp/puzzle-uncompressed type fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
/dev/fuse on /tmp/puzzle-unseekable type fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
/dev/fuse on /tmp/puzzle-seekable type fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/squash -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/squash -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 11.105 s ± 0.223 s [User: 6.798 s, System: 1.737 s]
Range (min … max): 10.607 s … 11.410 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/erofs -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/erofs -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 10.133 s ± 0.065 s [User: 6.612 s, System: 1.572 s]
Range (min … max): 9.971 s … 10.231 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-uncompressed -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-uncompressed -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 9.934 s ± 0.071 s [User: 6.581 s, System: 1.613 s]
Range (min … max): 9.850 s … 10.038 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-unseekable -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-unseekable -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 21.396 s ± 0.414 s [User: 6.771 s, System: 1.715 s]
Range (min … max): 20.615 s … 21.639 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 12.475 s ± 0.067 s [User: 6.733 s, System: 1.700 s]
Range (min … max): 12.410 s … 12.589 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 12.056 s ± 0.083 s [User: 6.700 s, System: 1.671 s]
Range (min … max): 11.941 s … 12.169 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 11.784 s ± 0.046 s [User: 6.692 s, System: 1.681 s]
Range (min … max): 11.678 s … 11.825 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 11.657 s ± 0.038 s [User: 6.676 s, System: 1.664 s]
Range (min … max): 11.616 s … 11.722 s 10 runs
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' "find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null"
Benchmark 1: find /tmp/puzzle-seekable -type f -exec cat {} \; > /dev/null
Time (mean ± σ): 11.662 s ± 0.076 s [User: 6.691 s, System: 1.668 s]
Range (min … max): 11.533 s … 11.818 s 10 runs
A 4KB zstd frame size seems to be a good choice, considering the above results and keeping in mind that the average chunk size produced by FastCDC with our current parameters is 80KB. Seekable compression reduces the mean time to read the entire image from ~21.4s to ~11.8s, which is similar to squashfuse (11.1s). This disregards any parallel operations on the filesystem. For a 658MB root filesystem, the image grows from 259MB without seekable support to 289MB with it.
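For a quick sanity check of the trade-off, the figures quoted in this comment can be turned into ratios. The numbers below are copied from the text above; the variable names are mine, and nothing here is measured again.

```python
# Mean read times quoted above, in seconds
unseekable_s, seekable_s, squashfuse_s = 21.4, 11.8, 11.1
# Sizes quoted above, in MB
unseekable_mb, seekable_mb, rootfs_mb = 259, 289, 658
# Average FastCDC chunk size and zstd frame size quoted above, in KB
chunk_kb, frame_kb = 80, 4

speedup = unseekable_s / seekable_s                      # read-time improvement
size_overhead = (seekable_mb - unseekable_mb) / unseekable_mb
ratio_unseekable = unseekable_mb / rootfs_mb             # compressed / original
ratio_seekable = seekable_mb / rootfs_mb
frames_per_chunk = chunk_kb // frame_kb                  # frames in an average chunk

print(f"speedup: {speedup:.2f}x, size overhead: {size_overhead:.1%}")
print(f"compressed size vs rootfs: {ratio_unseekable:.1%} -> {ratio_seekable:.1%}")
print(f"frames per average chunk: {frames_per_chunk}")
```

So the seekable image reads roughly 1.8x faster for roughly 12% more space, and an average chunk spans about 20 frames.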
$ du -sh /tmp/puzzlefs-unseekable-image
259M /tmp/puzzlefs-unseekable-image
$ du -sh /tmp/puzzlefs-seekable-image
289M /tmp/puzzlefs-seekable-image
Besides the overhead of the seekable frames, the compression ratio probably goes down because each frame is compressed individually.
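The effect is easy to reproduce on synthetic data. This is only a sketch: it uses `zlib` instead of zstd and a made-up input whose redundancy spans frame boundaries, so the exact numbers say nothing about the barehost image. It just shows that a compressor restarted every 4KB cannot reuse matches from earlier frames.

```python
import random
import zlib

FRAME_SIZE = 4096  # same 4KB frame size as discussed above

rng = random.Random(0)  # fixed seed so the sample data is reproducible
block = bytes(rng.randrange(256) for _ in range(2000))
data = block * 100  # 200000 bytes; the same 2000-byte block repeats throughout

# One stream: later repeats are encoded as references to earlier data.
whole = len(zlib.compress(data))

# Independent frames: every frame has to encode the block from scratch.
framed = sum(
    len(zlib.compress(data[i:i + FRAME_SIZE]))
    for i in range(0, len(data), FRAME_SIZE)
)

print(f"single stream: {whole} bytes, independent frames: {framed} bytes")
```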
I took a root filesystem of 658M and built a squashfs image and two puzzlefs images, one compressed and one uncompressed:
I then mounted all three images (two puzzlefs images and a squashfs image):
Reading every file with fd:
The slower reads from compressed puzzlefs could be due to decompressing the same blob multiple times instead of caching the decompressed data (squashfuse does readahead).
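As a sketch of what caching would buy, assuming the hypothesis above is right: the snippet below is not how puzzlefs or squashfuse are implemented. `BLOBS`, `decompressed` and `read` are hypothetical, and `zlib` stands in for zstd. It counts how often a blob is actually decompressed when a reader issues many small sequential reads against it.

```python
import zlib
from functools import lru_cache

# Hypothetical blob store: blob id -> compressed bytes
BLOBS = {"blob-a": zlib.compress(b"x" * 100_000)}
decompress_calls = 0


@lru_cache(maxsize=8)  # keep the 8 most recently used decompressed blobs
def decompressed(blob_id: str) -> bytes:
    """Decompress a whole blob; repeated calls are served from the cache."""
    global decompress_calls
    decompress_calls += 1
    return zlib.decompress(BLOBS[blob_id])


def read(blob_id: str, offset: int, length: int) -> bytes:
    """Serve a byte range from a blob, reusing a cached decompression."""
    return decompressed(blob_id)[offset:offset + length]


# Sequential 4 KiB reads over the same blob, similar to what cat would issue
reads = [read("blob-a", off, 4096) for off in range(0, 100_000, 4096)]
print(f"{len(reads)} reads, {decompress_calls} decompression(s)")
```

Without the cache, every one of those reads would decompress the full blob again.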