project-machine / puzzlefs

Apache License 2.0
381 stars 18 forks source link

implement compression #14

Closed tych0 closed 1 year ago

tych0 commented 3 years ago

Right now all the metadata and file contents for puzzlefs are uncompressed. We should compress them.

The design format of puzzlefs constrains the use of compression a bit though: we need to be able to read data from an arbitrary offset (the compression community calls this a "seekable compression format").

People like zstd as a compression format, and it has out-of-tree implementations of this with rust bindings:

https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md https://crates.io/crates/zstd-seekable

which may be interesting.

tych0 commented 3 years ago

We have some basic support for this now with d52efd2b805885e35e089ae4b33177dff13b0b93; let's see how it goes.

ariel-miculas commented 1 year ago

from squashfs

For fast random access, compressed files are split up in fixed size blocks that are compressed separately. The block size can be set between 4k and 1M (default for squashfs-tools and squashfs-tools-ng is 128K).

from zstd_seekable

This is done by splitting up the input data into frames, each of which are compressed independently, and so can be decompressed independently. Decompression then takes advantage of a provided 'seek table', which allows the decompressor to immediately jump to the desired data.

Depending on the chunking parameters we choose, the seekable compression may not be worth it. For example, let's say we pick 16KB for the average chunk and 64KB for the maximum chunk. Then the maximum chunk size is less than the squashfs default block size of 128KB. Also, zstd seekable compression includes extra metadata for storing information about the offsets, resulting in a size increase for the compressed chunks. Plus we'll be stuck with zstd.

hallyn commented 1 year ago

Could you post your deduplication performance results, or maybe just a concise summary?

IIUC, tuning the parameters made it so that indeed compression may not be needed.

ariel-miculas commented 1 year ago

I still think we need compression, I'm only arguing against seekable compression.

hallyn commented 1 year ago

Sounds good.

ariel-miculas commented 1 year ago

An interesting point from squashfs is that they're not compressing blocks if this would result in a larger size, in this case they're storing the block uncompressed, perhaps we should also implement this. Or even this:

By checking compression ratios and storing compressed only those chunks whose compression ratios meet a certain threshold, the amount of decompression involved during data access can be reduced ... This allows for capturing most of the savings of chunk compression while eliminating the cost of decompression on the majority of chunks.

from the microsoft paper

ariel-miculas commented 1 year ago

Related: https://github.com/containers/image/pull/1084

ariel-miculas commented 1 year ago

We can have optional support for compression and we need a new field in BlobRef to indicate whether the blob is compressed or not. The blob still needs to be content-addressed, so the sha256sum of the compressed blob needs to be computed.

ariel-miculas commented 1 year ago

We should also process the files in parallel during puzzlefs build in order to speed up the compression.

ariel-miculas commented 1 year ago

next steps:

ariel-miculas commented 1 year ago

Setup:

Build and mount the filesystem:

target/release/puzzlefs build ../test-puzzlefs/real_rootfs/barehost/rootfs /tmp/oci-barehost test
target/release/puzzlefs mount -f /tmp/oci-barehost test /tmp/puzzle

Reading an entire filesystem:

Compression disabled

$ time fd -tf -x cat > /dev/null
fd -tf -x cat > /dev/null  9.64s user 2.54s system 280% cpu 4.338 total
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' 'fd -tf -x cat > /dev/null'

Benchmark 1: fd -tf -x cat > /dev/null
  Time (mean ± σ):      6.233 s ±  0.455 s    [User: 12.785 s, System: 4.290 s]
  Range (min … max):    5.503 s …  7.020 s    10 runs

Compression enabled

$ time fd -tf -x cat > /dev/null
fd -tf -x cat > /dev/null  8.59s user 2.13s system 57% cpu 18.549 total
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' 'fd -tf -x cat > /dev/null'

Benchmark 1: fd -tf -x cat > /dev/null
  Time (mean ± σ):     20.681 s ±  1.989 s    [User: 10.174 s, System: 2.866 s]
  Range (min … max):   18.057 s … 24.472 s    10 runs

Comparison with squashfuse

squashfuse -f barehost.sqhs /tmp/squash
$ time fd -tf -x cat > /dev/null
fd -tf -x cat > /dev/null  11.50s user 3.83s system 96% cpu 15.816 total
$ hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' 'fd -tf -x cat > /dev/null'

Benchmark 1: fd -tf -x cat > /dev/null
  Time (mean ± σ):     13.134 s ±  1.194 s    [User: 10.361 s, System: 2.903 s]
  Range (min … max):   11.355 s … 16.096 s    10 runs