project-machine / puzzlefs

pick good chunking parameters #13

Closed tych0 closed 1 year ago

tych0 commented 3 years ago

Right now, our parameters for content-defined chunking are:

// 'ubuntu' base image is ~40M, as are other base images. If we have any hope of wanting to share
// these, we should allow small chunks.
const MIN_CHUNK_SIZE: usize = 10 * 1024 * 1024;
const AVG_CHUNK_SIZE: usize = 40 * 1024 * 1024;
const MAX_CHUNK_SIZE: usize = 256 * 1024 * 1024;

and that comment is the sum total of the analysis I did (aka basically nothing).

It would be good to determine experimentally which parameters are best by testing a range of values on known images and measuring how much they share. We have a large backlog of images to test on (e.g. multiple versions of the same image, or two totally unrelated images), both public ones on Docker Hub and, perhaps more interestingly, private images of enterprise code. Some more analysis of this is warranted.

ariel-miculas commented 1 year ago

"Here all the CDC approaches are configured with the maximum and minimum chunk sizes of 8x and 1/4x of the expected chunk size, the same as configured in LBFS" - from the fastcdc paper

hallyn commented 1 year ago

I did some experiments with 6 versions of a distro-sized layer (one we call 'barehost' here). I essentially did:

# Fetch each tagged 'barehost' image into its own OCI layout.
for tag in 10.25 10.26 10.27 10.28 10.29 10.30; do
    mkdir -p "$tag/oci"
    skopeo copy "docker://$REPO//$BASEDIR/barehost:$tag" \
        "oci:$tag/oci:barehost"
done
# Unpack each image and build a puzzlefs image from its rootfs.
for tag in 10.25 10.26 10.27 10.28 10.29 10.30; do
    pushd "$tag"
    umoci unpack --rootless --keep-dirlinks --image oci:barehost rfs
    puzzlefs build rfs/rootfs oci2 squashfs
    popd
done

Then I collected information about the common blobs using:

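#!/bin/bash

# Count how many layouts each blob digest appears in, and record its size.
declare -A list
declare -A sizes
for d in 10.25 10.26 10.27 10.28 10.29 10.30; do
    # Silence pushd/popd so only the summary lines are printed.
    pushd "$d/oci2/blobs/sha256" > /dev/null
    for f in *; do
        # ${list[$f]:-0} treats a digest not seen before as a count of 0.
        list[$f]=$(( ${list[$f]:-0} + 1 ))
        sizes[$f]="$(stat -c %s "$f")"
    done
    popd > /dev/null
done
for key in "${!list[@]}"; do
    echo "$key size ${sizes[$key]} showed up ${list[$key]} times"
done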

In the end, my summary was that we have:

which suggests we’ve saved (240 * 5 + 96 * 3 + 264 * 2 + 299 = 2.3G) in the total representation.

However, we have to offset that against the savings of almost half for each OCI layout, since OCIv1 compresses its layers while puzzlefs does not. So we end up with about 500M per OCIv1 layout and about 900M per puzzlefs layout.

If we simply mashed all 6 images together into one OCI directory (which takes advantage of the deduplication), OCIv1 would take 2.8G for all 6 layouts, while puzzlefs would take 3.7G.

We absolutely should do some more tests with various chunking parameters.

ariel-miculas commented 1 year ago

A paper by Microsoft referenced in https://github.com/ronomon/deduplication

ariel-miculas commented 1 year ago

Using the same 'barehost' layers that @hallyn used, I've repeated the experiment with various values for the minimum, average and maximum chunk sizes. I've used min=avg/4 and max=avg*4, but this is by no means mandated by FastCDC.
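As a rough sketch of how such a sweep can be generated: the code below derives min and max from the average using the avg/4 and avg*4 ratios mentioned above, and steps the average by a factor of 4. The names `ChunkParams`, `params_for_avg`, `sweep` and `flags` are made up for this illustration and are not part of puzzlefs or the fastcdc crate; only the `--min`, `--avg` and `--max` flags come from the commands shown in this thread.

```rust
/// One (min, avg, max) triple of chunk sizes, in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChunkParams {
    pub min: u64,
    pub avg: u64,
    pub max: u64,
}

/// Derive min and max from the average with the ratios used in this comment
/// (min = avg / 4, max = avg * 4). The ratios are a choice, not a FastCDC rule.
pub fn params_for_avg(avg: u64) -> ChunkParams {
    ChunkParams {
        min: avg / 4,
        avg,
        max: avg.saturating_mul(4),
    }
}

/// Averages from `first_avg` up to and including `last_avg`, multiplying by 4
/// at each step.
pub fn sweep(first_avg: u64, last_avg: u64) -> Vec<ChunkParams> {
    let mut out = Vec::new();
    let mut avg = first_avg;
    // An average of 0 would never grow, so it yields an empty sweep.
    while avg > 0 && avg <= last_avg {
        out.push(params_for_avg(avg));
        avg = match avg.checked_mul(4) {
            Some(next) => next,
            None => break,
        };
    }
    out
}

/// Flags in the form passed to deduplication_stats.pl in this thread.
pub fn flags(p: ChunkParams) -> String {
    format!("--min={} --avg={} --max={}", p.min, p.avg, p.max)
}
```

Calling `sweep(4096, 1048576)` yields five triples, from (1024, 4096, 16384) to (262144, 1048576, 4194304). Note that some of the runs below passed 65535 rather than 65536 for the 64KB bound, which this sketch does not reproduce.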

Setup

Exporting FastCDC parameters: https://github.com/ariel-miculas/puzzlefs/tree/export_cdc_params

deduplication_stats.pl

Results

Ronomon

ronomon is the old FastCDC implementation used in puzzlefs, see https://docs.rs/fastcdc/latest/fastcdc/ronomon/index.html

Repeating Serge's experiment (with current default puzzlefs parameters for min/avg/max)

❯ ./deduplication_stats.pl -p -s
[2023-02-09T20:00:34Z INFO  puzzlefs] fastcdc will use default parameters:
[2023-02-09T20:01:15Z INFO  puzzlefs] fastcdc will use default parameters:
[2023-02-09T20:01:46Z INFO  puzzlefs] fastcdc will use default parameters:
[2023-02-09T20:02:18Z INFO  puzzlefs] fastcdc will use default parameters:
[2023-02-09T20:02:49Z INFO  puzzlefs] fastcdc will use default parameters:
[2023-02-09T20:03:21Z INFO  puzzlefs] fastcdc will use default parameters:
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5904.57980632782MB
average layer size: 984.09663438797MB
mashed together: 3699.0971326828MB
saved: 2205.48267364502MB

metadata size:
10.25: 3.93935012817383MB
10.26: 3.9374942779541MB
10.27: 3.93743896484375MB
10.28: 3.93803596496582MB
10.29: 3.99929809570312MB
10.30: 3.99527835845947MB
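The three size fields in this output are related in a simple way: in both blocks above, "saved" equals "total size" minus "mashed together" (3112.15 - 2812.81 = 299.34 and 5904.58 - 3699.10 = 2205.48). The sketch below shows one way to compute such figures from per-tag blob lists. It assumes that "mashed together" counts each distinct blob digest once; the script itself is not reproduced here, and `DedupStats` and `dedup_stats` are hypothetical names.

```rust
use std::collections::HashMap;

/// Sizes in bytes, mirroring the fields printed in the results.
#[derive(Debug, PartialEq, Eq)]
pub struct DedupStats {
    /// Sum of every blob in every tag's layout.
    pub total: u64,
    /// Sum of blob sizes with each distinct digest counted once (assumption).
    pub mashed: u64,
    /// total - mashed
    pub saved: u64,
}

/// `layouts` holds one list of (digest, size in bytes) per tag.
pub fn dedup_stats(layouts: &[Vec<(&str, u64)>]) -> DedupStats {
    let mut unique: HashMap<&str, u64> = HashMap::new();
    let mut total = 0u64;
    for blobs in layouts {
        for &(digest, size) in blobs {
            total += size;
            // Identical digests mean identical content, so keep one entry.
            unique.insert(digest, size);
        }
    }
    let mashed: u64 = unique.values().sum();
    DedupStats {
        total,
        mashed,
        saved: total - mashed,
    }
}
```

For two tags that share one 10-byte blob, e.g. `[("a", 10), ("b", 5)]` and `[("a", 10), ("c", 7)]`, this gives a total of 32, a mashed size of 22 and 10 bytes saved.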

4KB, 16KB, 64KB (4096, 16384, 65535)

❯ ./deduplication_stats.pl -p -s --min=4096 --avg=16384 --max=65535
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5388.45161342621MB
average layer size: 898.075268904368MB
mashed together: 1304.28133773804MB
saved: 4084.17027568817MB

metadata size:
10.25: 7.12127876281738MB
10.26: 7.12182998657227MB
10.27: 7.12417984008789MB
10.28: 7.12332725524902MB
10.29: 7.2124547958374MB
10.30: 7.20955562591553MB

16KB, 64KB, 256KB (16384, 65536, 262144)

❯ ./deduplication_stats.pl -p -s --min=16384 --avg=65536 --max=262144
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5525.756316185MB
average layer size: 920.959386030833MB
mashed together: 1390.05020141602MB
saved: 4135.70611476898MB

metadata size:
10.25: 4.74918556213379MB
10.26: 4.74875640869141MB
10.27: 4.75040817260742MB
10.28: 4.75207328796387MB
10.29: 4.81815147399902MB
10.30: 4.81455707550049MB

64KB, 256KB, 1MB (65536, 262144, 1048576)

❯ ./deduplication_stats.pl -p -s --min=65536 --avg=262144 --max=1048576
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5604.28221321106MB
average layer size: 934.047035535177MB
mashed together: 1513.35793495178MB
saved: 4090.92427825928MB

metadata size:
10.25: 4.14659595489502MB
10.26: 4.14396190643311MB
10.27: 4.14296817779541MB
10.28: 4.14382553100586MB
10.29: 4.20649814605713MB
10.30: 4.20253562927246MB

256KB, 1MB, 4MB (262144, 1048576, 4194304)

❯ ./deduplication_stats.pl -p -s --min=262144 --avg=1048576 --max=4194304
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5663.17061233521MB
average layer size: 943.861768722534MB
mashed together: 1754.05341720581MB
saved: 3909.11719512939MB

metadata size:
10.25: 3.98666572570801MB
10.26: 3.98486328125MB
10.27: 3.98458862304688MB
10.28: 3.98523902893066MB
10.29: 4.04600238800049MB
10.30: 4.04209327697754MB

fastcdc::v2020

This is the current version used in puzzlefs, with defaults: min -> 1MB, avg -> 4MB, max -> 16MB

see https://docs.rs/fastcdc/latest/fastcdc/v2020/index.html

1KB, 4KB, 16KB (1024, 4096, 16384)

❯ ./run_test_suite.sh
./deduplication_stats.pl -p -s --min=1024 --avg=4096 --max=16384
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5242.92396450043MB
average layer size: 873.820660750071MB
mashed together: 1266.22285556793MB
saved: 3976.7011089325MB

metadata size:
10.25: 14.642484664917MB
10.26: 14.6544637680054MB
10.27: 14.6474161148071MB
10.28: 14.6400632858276MB
10.29: 14.7984800338745MB
10.30: 14.7947244644165MB

2KB, 8KB, 32KB (2048, 8192, 32768)

./deduplication_stats.pl -p -s --min=2048 --avg=8192 --max=32768
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5320.87863826752MB
average layer size: 886.81310637792MB
mashed together: 1279.09951400757MB
saved: 4041.77912425995MB

metadata size:
10.25: 9.26667785644531MB
10.26: 9.27443695068359MB
10.27: 9.27176666259766MB
10.28: 9.27629661560059MB
10.29: 9.38424777984619MB
10.30: 9.37905025482178MB

4KB, 16KB, 64KB (4096, 16384, 65535)

./deduplication_stats.pl -p -s --min=4096 --avg=16384 --max=65535
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5405.19478702545MB
average layer size: 900.865797837575MB
mashed together: 1308.83755683899MB
saved: 4096.35723018646MB

metadata size:
10.25: 6.5960111618042MB
10.26: 6.60056686401367MB
10.27: 6.60382556915283MB
10.28: 6.59816741943359MB
10.29: 6.68643760681152MB
10.30: 6.68038749694824MB

16KB, 64KB, 256KB (16384, 65536, 262144)

./deduplication_stats.pl -p -s --min=16384 --avg=65536 --max=262144
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5550.14240837097MB
average layer size: 925.023734728495MB
mashed together: 1401.77648925781MB
saved: 4148.36591911316MB

metadata size:
10.25: 4.62216377258301MB
10.26: 4.62014198303223MB
10.27: 4.62216758728027MB
10.28: 4.62115097045898MB
10.29: 4.68749237060547MB
10.30: 4.68351745605469MB

64KB, 256KB, 1MB (65536, 262144, 1048576)

./deduplication_stats.pl -p -s --min=65536 --avg=262144 --max=1048576
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5610.78800678253MB
average layer size: 935.131334463755MB
mashed together: 1534.92768096924MB
saved: 4075.86032581329MB

metadata size:
10.25: 4.11460208892822MB
10.26: 4.11324787139893MB
10.27: 4.11435031890869MB
10.28: 4.11577129364014MB
10.29: 4.17591190338135MB
10.30: 4.17206001281738MB

256KB, 1MB, 4MB (262144, 1048576, 4194304)

./deduplication_stats.pl -p -s --min=262144 --avg=1048576 --max=4194304
oci, 6 tags
total size: 3112.15474033356MB
average layer size: 518.69245672226MB
mashed together: 2812.81092166901MB
saved: 299.343818664551MB

oci2, 6 tags
total size: 5665.81055545807MB
average layer size: 944.301759243011MB
mashed together: 1778.74466991425MB
saved: 3887.06588554382MB

metadata size:
10.25: 3.98222732543945MB
10.26: 3.9800968170166MB
10.27: 3.98026466369629MB
10.28: 3.98101806640625MB
10.29: 4.04299354553223MB
10.30: 4.03903102874756MB

ariel-miculas commented 1 year ago

I've added results for (1KB, 4KB, 16KB) and (2KB, 8KB, 32KB), for the current v2020 FastCDC implementation. Based on the results, I'm inclined to pick either (4KB, 16KB, 64KB) or (16KB, 64KB, 256KB):

(4KB, 16KB, 64KB)

oci2, 6 tags
total size: 5405 MB
average layer size: 900 MB
mashed together: 1308 MB
saved: 4096 MB

max metadata size:
6.68 MB

(16KB, 64KB, 256KB)

oci2, 6 tags
total size: 5550 MB
average layer size: 925 MB
mashed together: 1401 MB
saved: 4148 MB

max metadata size:
4.68 MB

And based on this comment from ronomon/deduplication:

"An average chunk size of 64 KB is recommended for optimal end-to-end deduplication and compression efficiency"

I would go with (16KB, 64KB, 256KB) as the chunking parameters. We should also repeat this experiment after compression is implemented.

ariel-miculas commented 1 year ago

Code: https://github.com/ariel-miculas/puzzlefs/tree/export_cdc_params_on_compression

<compression with zstd, level 3>

Note: the metadata sizes do not account for the additional digest needed for the uncompressed chunk.

1KB, 4KB, 16KB (1024, 4096, 16384)

+ ./deduplication_stats.pl -p -s --min=1024 --avg=4096 --max=16384
oci2, 6 tags
total size: 2433.18603610992MB
average layer size: 405.531006018321MB
mashed together: 703.590449333191MB
saved: 1729.59558677673MB

metadata size:
10.25: 14.642484664917MB
10.26: 14.6544637680054MB
10.27: 14.6474161148071MB
10.28: 14.6400632858276MB
10.29: 14.7984800338745MB
10.30: 14.7947244644165MB

2KB, 8KB, 32KB (2048, 8192, 32768)

+ ./deduplication_stats.pl -p -s --min=2048 --avg=8192 --max=32768
oci2, 6 tags
total size: 2356.78320503235MB
average layer size: 392.797200838725MB
mashed together: 675.082730293274MB
saved: 1681.70047473907MB

metadata size:
10.25: 9.26667785644531MB
10.26: 9.27443695068359MB
10.27: 9.27176666259766MB
10.28: 9.27629661560059MB
10.29: 9.38424777984619MB
10.30: 9.37905025482178MB

4KB, 16KB, 64KB (4096, 16384, 65535)

+ ./deduplication_stats.pl -p -s --min=4096 --avg=16384 --max=65535
oci2, 6 tags
total size: 2320.51965332031MB
average layer size: 386.753275553385MB
mashed together: 666.441588401794MB
saved: 1654.07806491852MB

metadata size:
10.25: 6.5960111618042MB
10.26: 6.60056686401367MB
10.27: 6.60382556915283MB
10.28: 6.59816741943359MB
10.29: 6.68643760681152MB
10.30: 6.68038749694824MB

16KB, 64KB, 256KB (16384, 65536, 262144)

+ ./deduplication_stats.pl -p -s --min=16384 --avg=65536 --max=262144
oci2, 6 tags
total size: 2282.67422962189MB
average layer size: 380.445704936981MB
mashed together: 673.426959991455MB
saved: 1609.24726963043MB

metadata size:
10.25: 4.62216377258301MB
10.26: 4.62014198303223MB
10.27: 4.62216758728027MB
10.28: 4.62115097045898MB
10.29: 4.68749237060547MB
10.30: 4.68351745605469MB

64KB, 256KB, 1MB (65536, 262144, 1048576)

+ ./deduplication_stats.pl -p -s --min=65536 --avg=262144 --max=1048576
oci2, 6 tags
total size: 2226.37264347076MB
average layer size: 371.062107245127MB
mashed together: 700.592399597168MB
saved: 1525.7802438736MB

metadata size:
10.25: 4.11460208892822MB
10.26: 4.11324787139893MB
10.27: 4.11435031890869MB
10.28: 4.11577129364014MB
10.29: 4.17591190338135MB
10.30: 4.17206001281738MB

256KB, 1MB, 4MB (262144, 1048576, 4194304)

+ ./deduplication_stats.pl -p -s --min=262144 --avg=1048576 --max=4194304
oci2, 6 tags
total size: 2200.82840824127MB
average layer size: 366.804734706879MB
mashed together: 781.297449111938MB
saved: 1419.53095912933MB

metadata size:
10.25: 3.98222732543945MB
10.26: 3.9800968170166MB
10.27: 3.98026466369629MB
10.28: 3.98101806640625MB
10.29: 4.04299354553223MB
10.30: 4.03903102874756MB

Time it took to run the testsuite:

real 10m51.561s
user 8m20.463s
sys 2m27.804s

<compression with zstd, level 9>

1KB, 4KB, 16KB (1024, 4096, 16384)

+ ./deduplication_stats.pl -p -s --min=1024 --avg=4096 --max=16384
oci2, 6 tags
total size: 2386.79584789276MB
average layer size: 397.799307982127MB
mashed together: 694.309480667114MB
saved: 1692.48636722565MB

metadata size:
10.25: 14.642484664917MB
10.26: 14.6544637680054MB
10.27: 14.6474161148071MB
10.28: 14.6400632858276MB
10.29: 14.7984800338745MB
10.30: 14.7947244644165MB

2KB, 8KB, 32KB (2048, 8192, 32768)

+ ./deduplication_stats.pl -p -s --min=2048 --avg=8192 --max=32768
oci2, 6 tags
total size: 2301.7490606308MB
average layer size: 383.624843438466MB
mashed together: 663.855288505554MB
saved: 1637.89377212524MB

metadata size:
10.25: 9.26667785644531MB
10.26: 9.27443695068359MB
10.27: 9.27176666259766MB
10.28: 9.27629661560059MB
10.29: 9.38424777984619MB
10.30: 9.37905025482178MB

4KB, 16KB, 64KB (4096, 16384, 65535)

+ ./deduplication_stats.pl -p -s --min=4096 --avg=16384 --max=65535
oci2, 6 tags
total size: 2256.23154067993MB
average layer size: 376.038590113322MB
mashed together: 653.036324501038MB
saved: 1603.19521617889MB

metadata size:
10.25: 6.5960111618042MB
10.26: 6.60056686401367MB
10.27: 6.60382556915283MB
10.28: 6.59816741943359MB
10.29: 6.68643760681152MB
10.30: 6.68038749694824MB

16KB, 64KB, 256KB (16384, 65536, 262144)

+ ./deduplication_stats.pl -p -s --min=16384 --avg=65536 --max=262144
oci2, 6 tags
total size: 2199.2318944931MB
average layer size: 366.538649082184MB
mashed together: 655.083187103271MB
saved: 1544.14870738983MB

metadata size:
10.25: 4.62216377258301MB
10.26: 4.62014198303223MB
10.27: 4.62216758728027MB
10.28: 4.62115097045898MB
10.29: 4.68749237060547MB
10.30: 4.68351745605469MB

64KB, 256KB, 1MB (65536, 262144, 1048576)

+ ./deduplication_stats.pl -p -s --min=65536 --avg=262144 --max=1048576
oci2, 6 tags
total size: 2127.70233249664MB
average layer size: 354.617055416107MB
mashed together: 677.052481651306MB
saved: 1450.64985084534MB

metadata size:
10.25: 4.11460208892822MB
10.26: 4.11324787139893MB
10.27: 4.11435031890869MB
10.28: 4.11577129364014MB
10.29: 4.17591190338135MB
10.30: 4.17206001281738MB

256KB, 1MB, 4MB (262144, 1048576, 4194304)

+ ./deduplication_stats.pl -p -s --min=262144 --avg=1048576 --max=4194304
oci2, 6 tags
total size: 2085.48339176178MB
average layer size: 347.58056529363MB
mashed together: 750.028922080994MB
saved: 1335.45446968079MB

metadata size:
10.25: 3.98222732543945MB
10.26: 3.9800968170166MB
10.27: 3.98026466369629MB
10.28: 3.98101806640625MB
10.29: 4.04299354553223MB
10.30: 4.03903102874756MB

Time it took to run the testsuite:

real 33m15.188s
user 30m35.945s
sys 2m36.394s

ariel-miculas commented 1 year ago

Summary

Size of rootfs tarballs:

| Tag | Size of rootfs tarball (MB) |
|---|---|
| 10.25 | 1004 |
| 10.26 | 1004 |
| 10.27 | 1004 |
| 10.28 | 1006 |
| 10.29 | 1018 |
| 10.30 | 1018 |
| total | 6057 |
| average | 1009 |

Saved space is computed like this:

Legend:
zstd.3 = zstd run with compression level 3
zstd.9 = zstd run with compression level 9

| Parameters | total size (MB) | average layer size (MB) | mashed together size (MB) | saved (MB) | max metadata size (MB) |
|---|---|---|---|---|---|
| oci | 3112 | 518 | 2812 | 2945 (49%) | - |
| 1KB, 4KB, 16KB | 5242 | 873 | 1266 | 4791 (79%) | 14.7 |
| 2KB, 8KB, 32KB | 5320 | 886 | 1279 | 4778 (79%) | 9.3 |
| 4KB, 16KB, 64KB | 5405 | 900 | 1308 | 4749 (78%) | 6.6 |
| 16KB, 64KB, 256KB | 5550 | 925 | 1401 | 4656 (77%) | 4.6 |
| 64KB, 256KB, 1MB | 5610 | 935 | 1534 | 4523 (75%) | 4.1 |
| 256KB, 1MB, 4MB | 5665 | 944 | 1778 | 4279 (71%) | 4 |
| zstd.3 1KB, 4KB, 16KB | 2433 | 405 | 703 | 5354 (88%) | 14.7 |
| zstd.3 2KB, 8KB, 32KB | 2356 | 392 | 675 | 5382 (89%) | 9.3 |
| zstd.3 4KB, 16KB, 64KB | 2320 | 386 | 666 | 5391 (89%) | 6.6 |
| zstd.3 16KB, 64KB, 256KB | 2282 | 380 | 673 | 5384 (89%) | 4.6 |
| zstd.3 64KB, 256KB, 1MB | 2226 | 371 | 700 | 5357 (88%) | 4.1 |
| zstd.3 256KB, 1MB, 4MB | 2200 | 366 | 781 | 5276 (87%) | 4 |
| zstd.9 1KB, 4KB, 16KB | 2386 | 397 | 694 | 5363 (89%) | 14.7 |
| zstd.9 2KB, 8KB, 32KB | 2301 | 383 | 663 | 5394 (89%) | 9.3 |
| zstd.9 4KB, 16KB, 64KB | 2256 | 376 | 653 | 5404 (89%) | 6.6 |
| zstd.9 16KB, 64KB, 256KB | 2199 | 366 | 655 | 5402 (89%) | 4.6 |
| zstd.9 64KB, 256KB, 1MB | 2127 | 354 | 677 | 5380 (89%) | 4.1 |
| zstd.9 256KB, 1MB, 4MB | 2085 | 347 | 750 | 5307 (88%) | 4 |
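The "saved" column can be reproduced from the other figures. As far as the numbers show, every puzzlefs row matches the total tarball size (6057 MB) minus the mashed together size, with the percentage taken relative to 6057 MB; the oci row matches 6057 MB minus its total size (3112 MB) instead. The sketch below encodes that reading, which is inferred from the numbers rather than taken from the script; `saved_vs_tarballs` is a hypothetical name.

```rust
/// Space saved relative to keeping every rootfs as a plain tarball.
/// Returns (saved in MB, saved as a rounded percentage of the tarball total).
pub fn saved_vs_tarballs(tarball_total_mb: u64, stored_mb: u64) -> (u64, u64) {
    if tarball_total_mb == 0 {
        // Nothing to compare against.
        return (0, 0);
    }
    let saved = tarball_total_mb.saturating_sub(stored_mb);
    let percent = (saved as f64 * 100.0 / tarball_total_mb as f64).round() as u64;
    (saved, percent)
}
```

For example, `saved_vs_tarballs(6057, 1266)` gives (4791, 79), matching the uncompressed 1KB, 4KB, 16KB row, and `saved_vs_tarballs(6057, 653)` gives (5404, 89), matching the zstd.9 4KB, 16KB, 64KB row.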