klausenbusk opened this issue 7 years ago
mksquashfs has a bug which leads to great differences in the resulting file even with identical file systems as input. This is caused by multiple workers on different CPUs racing against each other. You can call the tool with -processors 1 to work around this for now. This is what I'm doing in my deployment scripts.
However, given that the content in the squashfs image is compressed on a per-inode basis, the number of blocks that can be reused will never be ideal. You can experiment with the -noI, -noD, -noF and -noX options. I'd be interested in your findings :)
In addition to what @zonque just said: you need to align the squashfs block size and casync chunk sizes in some way, and I can't really tell you how to do that best; that requires some research. Note that setting the chunk size to the exact same value as the squashfs block size is not the answer: the chunk size you configure in casync is just the average chunk size, meaning that a good part of the chunks will be shorter than the configured value, and another part will be larger. But having smaller casync chunks than the squashfs block size is not useful, as any changed bit in squashfs tends to explode to change the whole block around it, and hence trying to match up parts of it via casync is not going to work.
In the blog story I indicated that this is still left for research. If you are interested in this, I'd very much welcome some more comprehensive stats on this. Specifically it might make sense to take some suitable data set (let's say a basic Fedora install or so), compress it with various squashfs block sizes, and then run them all through casync also with various average chunk sizes, and draw a graph from that to figure out where the sweet spot lies.
Note that casync's chunking algorithm takes three parameters: the min, the average and the max chunk size. Normally it just expects you to specify the average chunk size, and will then pick the minimum chunk size as 1/4th of it, and the maximum as 4x it. You can alter those values too by using --chunk-size=MIN:AVG:MAX, but do note that the way AVG is currently processed means that setting MIN/MAX to anything other than 0.25x and 4x will skew the chunk histogram in a way that AVG is not actually the average chunk size anymore, if you follow what I mean. Long story short: unless you know what you are doing, don't bother with changing MIN/MAX, but do keep in mind that MIN is picked as 1/4th of AVG and that AVG is what you choose.
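To illustrate, using the file names that appear later in this thread, the short form and its explicit MIN:AVG:MAX equivalent would be:
# AVG only; MIN and MAX default to AVG/4 and 4*AVG
$ casync make --chunk-size=131072 foo.caibx foo.squashfs
# explicit equivalent: 32K min, 128K average, 512K max
$ casync make --chunk-size=32768:131072:524288 foo.caibx foo.squashfs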
Also, please keep in mind that large chunk sizes mean that casync is unable to recognize smaller patterns. By picking a small chunk size you hence increase the chance that casync recognizes similar data, but the metadata overhead increases.
or to say all this in different words: I have the suspicion that you get best results if you pick a squashfs block size that is relatively small and that the average chunk size you then configure casync for is at least four times larger.
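As a starting point for that kind of research, here is a rough sketch of such a sweep. It assumes two placeholder directory trees, squashfs-root-old/ and squashfs-root-new/ (the old and the updated rootfs), and reports the initial store size plus the growth caused by the update for each block-size/chunk-size combination:
for sb in 32K 64K 128K 256K; do
  # one pair of images per squashfs block size (-processors 1 for reproducibility, see above)
  mksquashfs squashfs-root-old/ old-$sb.squashfs -comp xz -processors 1 -b $sb -noappend
  mksquashfs squashfs-root-new/ new-$sb.squashfs -comp xz -processors 1 -b $sb -noappend
  for cs in 64K 128K 256K 512K; do
    store="store-$sb-$cs.castr"                     # fresh store per combination
    casync make --chunk-size=$cs --store="$store" old-$sb-$cs.caibx old-$sb.squashfs
    before=$(du -sm "$store" | cut -f1)             # MB a client downloads for the initial image
    casync make --chunk-size=$cs --store="$store" new-$sb-$cs.caibx new-$sb.squashfs
    after=$(du -sm "$store" | cut -f1)
    echo "squashfs=$sb casync=$cs initial=${before}M update=$((after - before))M"
  done
done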
You can call the tool with -processors 1 to work around this for now.
That made a significant difference:
$ casync make --chunk-size=131072 foo.caibx foo.squashfs
ea4de6574bd73cdd7dd1448324c97e5d4c313301f18e53c638f5ad023231dc93
$ du -hs default.castr/
393M default.castr/
$ casync make --chunk-size=131072 foo2.caibx foo2.squashfs
1dd2e733ca1683cd8b4399b3e66cef0acc552da0942464c9041451df38d1c113
$ du -hs default.castr/
538M default.castr/
376 + 386 - 538 = 224 MB, and 224/386 ≈ 58% reuse. (Edit: I'm not sure about the math anymore.)
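A more direct way to measure this (and the one used further down in the thread) is to look only at how much the chunk store grows when the second image is added, since that growth is what a client would have to download for the update. A small sketch, with test.castr as a placeholder for an initially empty store:
$ casync make --chunk-size=131072 --store=test.castr foo.caibx foo.squashfs
$ du -sm test.castr/     # store size with just the first image
$ casync make --chunk-size=131072 --store=test.castr foo2.caibx foo2.squashfs
$ du -sm test.castr/     # the increase over the previous number is the update size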
You can experiment with the -noI, -noD, -noF and -noX options. I'd be interested in your findings :)
With all the options on:
$ mksquashfs squashfs-root/ foo.squashfs -comp xz -processors 1 -noI -noD -noF -noX
Parallel mksquashfs: Using 1 processor
Creating 4.0 filesystem on foo.squashfs, block size 131072.
[=================================================================================================================================/] 50696/50696 100%
Exportable Squashfs 4.0 filesystem, xz compressed, data block size 131072
uncompressed data, uncompressed metadata, uncompressed fragments, uncompressed xattrs
duplicates are removed
Filesystem size 1062789.11 Kbytes (1037.88 Mbytes)
97.17% of uncompressed filesystem size (1093687.08 Kbytes)
Inode table size 1917191 bytes (1872.26 Kbytes)
100.00% of uncompressed inode table size (1917191 bytes)
Directory table size 1302032 bytes (1271.52 Kbytes)
100.00% of uncompressed directory table size (1302032 bytes)
Number of duplicate files found 3835
Number of inodes 55217
Number of files 45748
Number of fragments 3469
Number of symbolic links 5827
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 3642
Number of ids (unique uids + gids) 1
Number of uids 1
kristian (1000)
Number of gids 1
kristian (1000)
$ dd if=/dev/urandom of=squashfs-root/foo bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.121281 s, 86.5 MB/s
$ rm squashfs-root/etc/*.conf
$ mksquashfs squashfs-root/ foo2.squashfs -comp xz -processors 1 -noI -noD -noF -noX
Parallel mksquashfs: Using 1 processor
Creating 4.0 filesystem on foo2.squashfs, block size 131072.
[=================================================================================================================================-] 50741/50741 100%
Exportable Squashfs 4.0 filesystem, xz compressed, data block size 131072
uncompressed data, uncompressed metadata, uncompressed fragments, uncompressed xattrs
duplicates are removed
Filesystem size 1072944.86 Kbytes (1047.80 Mbytes)
97.20% of uncompressed filesystem size (1103835.71 Kbytes)
Inode table size 1916421 bytes (1871.50 Kbytes)
100.00% of uncompressed inode table size (1916421 bytes)
Directory table size 1301745 bytes (1271.24 Kbytes)
100.00% of uncompressed directory table size (1301745 bytes)
Number of duplicate files found 3830
Number of inodes 55183
Number of files 45714
Number of fragments 3469
Number of symbolic links 5827
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 3642
Number of ids (unique uids + gids) 1
Number of uids 1
kristian (1000)
Number of gids 1
kristian (1000)
and now we're talking:
$ casync make --chunk-size=131072 foo.caibx foo.squashfs
fa8da5dc23cd3b0ba17f81c5f6a1e0f3cebc9543307392b93d29c7a833932f3e
$ du -hs default.castr/
411M default.castr/
$ casync make --chunk-size=131072 foo2.caibx foo2.squashfs
f62a294140bbcae6c083e5ffa4096cf921901b07c379c573da456dcbdc02a964
$ du -hs default.castr/
467M default.castr/
That isn't bad at all; it reused 411/467 ≈ 88% of the old chunks.
or to say all this in different words: I have the suspicion that you get best results if you pick a squashfs block size that is relatively small and that the average chunk size you then configure casync for is at least four times larger.
Hang on, and I will have some data soon.
What's the size of the squashfs image, with and without compression?
What's the size of the squashfs image, with and without the compression turned on?
You can see it in the mksquashfs log, but here you go:
With xz comp:
foo.squashfs Filesystem size 384621.17 Kbytes (375.61 Mbytes)
foo2.squashfs Filesystem size 394833.51 Kbytes (385.58 Mbytes)
Without:
foo.squashfs: Filesystem size 1062789.11 Kbytes (1037.88 Mbytes)
foo2.squashfs: Filesystem size 1072944.86 Kbytes (1047.80 Mbytes)
btw, it'd be excellent if the final findings could be compiled into some document we can add to the package, since I am sure this will pop up again and again
So with compression turned on, 42% of ~380MB (~159MB) and without compression, 12% of ~1040MB (~124MB) are not reused and have to be downloaded when an update is made. So even though the reuse percentage looks better, the actual effect isn't that high.
or to say all this in different words: I have the suspicion that you get best results if you pick a squashfs block size that is relatively small and that the average chunk size you then configure casync for is at least four times larger.
Hang on, and I will have some data soon.
So I created the squashfs files with -b 32K; everything else was the same, and it only made it worse.
$ casync make --chunk-size=131072 foo.caibx foo.squashfs
ffa00e9f8fea2d3c007312a2d2465b560d030fd3e19a45aa197beb2223c08379
$ du -hs default.castr/
411M default.castr/
$ casync make --chunk-size=131072 foo2.caibx foo2.squashfs
56a8e60b88bde1750957c2c74940169c6aa8120d7f11abf6288ad3a9bb5786bb
$ du -hs default.castr/
486M default.castr/
mksquashfs log:
$ mksquashfs squashfs-root/ foo.squashfs -comp xz -processors 1 -b 32K -noI -noD -noF -noX
Parallel mksquashfs: Using 1 processor
Creating 4.0 filesystem on foo.squashfs, block size 32768.
[=================================================================================================================================\] 71566/71566 100%
Exportable Squashfs 4.0 filesystem, xz compressed, data block size 32768
uncompressed data, uncompressed metadata, uncompressed fragments, uncompressed xattrs
duplicates are removed
Filesystem size 1054723.85 Kbytes (1030.00 Mbytes)
96.42% of uncompressed filesystem size (1093840.92 Kbytes)
Inode table size 2012790 bytes (1965.62 Kbytes)
100.00% of uncompressed inode table size (2012790 bytes)
Directory table size 1301552 bytes (1271.05 Kbytes)
100.00% of uncompressed directory table size (1301552 bytes)
Number of duplicate files found 3835
Number of inodes 55217
Number of files 45748
Number of fragments 7366
Number of symbolic links 5827
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 3642
Number of ids (unique uids + gids) 1
Number of uids 1
kristian (1000)
Number of gids 1
kristian (1000)
$ mksquashfs squashfs-root/ foo2.squashfs -comp xz -processors 1 -b 32K -noI -noD -noF -noX
Parallel mksquashfs: Using 1 processor
Creating 4.0 filesystem on foo2.squashfs, block size 32768.
[=================================================================================================================================-] 71851/71851 100%
Exportable Squashfs 4.0 filesystem, xz compressed, data block size 32768
uncompressed data, uncompressed metadata, uncompressed fragments, uncompressed xattrs
duplicates are removed
Filesystem size 1064879.91 Kbytes (1039.92 Mbytes)
96.46% of uncompressed filesystem size (1103989.87 Kbytes)
Inode table size 2012993 bytes (1965.81 Kbytes)
100.00% of uncompressed inode table size (2012993 bytes)
Directory table size 1300677 bytes (1270.19 Kbytes)
100.00% of uncompressed directory table size (1300677 bytes)
Number of duplicate files found 3830
Number of inodes 55183
Number of files 45714
Number of fragments 7362
Number of symbolic links 5827
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 3642
Number of ids (unique uids + gids) 1
Number of uids 1
kristian (1000)
Number of gids 1
kristian (1000)
Did you play with casync's --chunk-size= parameter as well?
So with compression turned on, 42% of ~380MB (~159MB) and without compression,
Where did you get 42% from?
12% of ~1040MB (~124MB) are not reused and have to be downloaded when an update is made. So even though the reuse percentage looks better, the actual effect isn't that high.
The ~1040MB is the squashfs file without compression; casync compresses it to 411MB, and the second squashfs file only adds 467-411=56MB more chunks. So the client would need to download 56MB. See https://github.com/systemd/casync/issues/46#issuecomment-311084615 (bottom).
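That matches what would happen on the client side: when extracting the new index, casync only needs to fetch chunks it doesn't already have. A hedged sketch, assuming the store is published at a hypothetical HTTPS location and that the currently installed image can be used as a seed (the exact --seed semantics for blob indexes may need verification):
# only chunks missing locally are fetched from the remote store
$ casync extract --store=https://example.com/default.castr --seed=/path/to/current.squashfs foo2.caibx foo2.squashfs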
Did you play with casync's --chunk-size= parameter as well?
It was the same as before (131072), which is 4x the squashfs block size.
Where did you get 42% from?
Ah, sorry. My bad. I'll do some tests again soon myself. Last time I did them, casync would still use Adler32 instead of buzhash, but my numbers were similar IIRC.
It was the same as before (131072), which is 4x the squashfs block size.
Yeah, but you could try and alter both block sizes (squashfs and casync).
Yeah, but you could try and alter both block sizes (squashfs and casync).
mksquashfs with -b 64K:
$ casync make --chunk-size=256K foo.caibx foo.squashfs
08bfcbe52ac62383ff3d099ba57e5a4845d38899bf8d4fe7f4567f296ddc944a
$ du -hs default.castr/
377M default.castr/
$ casync make --chunk-size=256K foo2.caibx foo2.squashfs
ea53a0b771d0945ff019e4978d5d7b8e381368fc31de1d535e8ed95dc674e22a
$ du -hs default.castr/
476M default.castr/
-------
casync make --chunk-size=64K foo.caibx foo.squashfs
08bfcbe52ac62383ff3d099ba57e5a4845d38899bf8d4fe7f4567f296ddc944a
$ du -hs default.castr/
462M default.castr/
$ casync make --chunk-size=64K foo2.caibx foo2.squashfs
ea53a0b771d0945ff019e4978d5d7b8e381368fc31de1d535e8ed95dc674e22a
$ du -hs default.castr/
520M default.castr/
so....
More data:
$ casync make --chunk-size=128K foo.caibx foo.squashfs
08bfcbe52ac62383ff3d099ba57e5a4845d38899bf8d4fe7f4567f296ddc944a
$ du -hs default.castr/
411M default.castr/
$ casync make --chunk-size=128K foo2.caibx foo2.squashfs
ea53a0b771d0945ff019e4978d5d7b8e381368fc31de1d535e8ed95dc674e22a
$ du -hs default.castr/
486M default.castr/
$ casync make --chunk-size=192K foo.caibx --store=192 foo.squashfs
08bfcbe52ac62383ff3d099ba57e5a4845d38899bf8d4fe7f4567f296ddc944a
$ du -hs default.castr/
436M default.castr/
$ casync make --chunk-size=192K foo2.caibx --store=192 foo2.squashfs
ea53a0b771d0945ff019e4978d5d7b8e381368fc31de1d535e8ed95dc674e22a
$ du -hs default.castr/
486M default.castr/
I have done some experiments here to see if it's worth replacing our current VCDIFF-based updater for squashfs, and the best I ended up with is an order of magnitude worse than vcdiff (56 MiB vs 4.9 MiB).
The squashfs images were created with -comp lzo -processors 1 for all the filesystems, and then I varied the block sizes.
Lower block sizes in general seem to give much better delta compression here.
squashfs block | casync chunk | store before (bytes) | store after (bytes) | delta (bytes) | vcdiff size (bytes) |
---|---|---|---|---|---|
128k | 128k | 109448380 | 159866408 | 50418028 | 5347571 |
128k | 64k | 114279552 | 158378284 | 44098732 | 5347571 |
128k | 196k | 107544744 | 166066400 | 58521656 | 5347571 |
64k | 196k | 108565560 | 182307500 | 73741940 | 4942274 |
64k | 64k | 115702772 | 173936656 | 58233884 | 4942274 |
64k | 128k | 110695244 | 176925480 | 66230236 | 4942274 |
64k | 4M | 101183276 | 203963016 | 102779740 | 4942274 |
64k | 32k | 124410768 | 170925584 | 46514816 | 4942274 |
64k | 16k | 139519084 | 176285808 | 36766724 | 4942274 |
256k | 16k | 136321764 | 162653496 | 26331732 | 5401169 |
256k | 32k | 121683700 | 153911184 | 32227484 | 5401169 |
256k | 48k | 116118020 | 154677656 | 38559636 | 5401169 |
256k | 64k | 113610964 | 154900344 | 41289380 | 5401169 |
256k | 96k | 110534840 | 158229752 | 47694912 | 5401169 |
256k | 128k | 108844516 | 158764984 | 49920468 | 5401169 |
256k | 160k | 107556980 | 164247960 | 56690980 | 5401169 |
256k | 256k | 105766892 | 169314368 | 63547476 | 5401169 |
256k | 8K | 162973404 | 187133732 | 24160328 | 5401169 |
256k | 10K | 152101388 | 176791976 | 24690588 | 5401169 |
256k | 12K | 145522592 | 170632432 | 25109840 | 5401169 |
256k | 14K | 140354180 | 165779628 | 25425448 | 5401169 |
The mean uncompressed file inside the squashfs is about 22k. The filesystems are root filesystems for ARM, designed to fit in a 128 MiB partition. I'm continuing some testing to get more numbers here.
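For reference, the vcdiff sizes in the table were produced with a VCDIFF encoder; the comment doesn't say which tool was used, but with xdelta3 (a common VCDIFF implementation) the equivalent measurement would look roughly like this, with old.squashfs/new.squashfs as placeholder names:
$ xdelta3 -e -s old.squashfs new.squashfs update.vcdiff           # encode a delta against the old image
$ ls -l update.vcdiff                                             # corresponds to the "vcdiff size" column
$ xdelta3 -d -s old.squashfs update.vcdiff reconstructed.squashfs # what the device would do to apply it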
I've done more comparisons of various block sizes of squashfs + casync. Here is a selection: all the combinations with an "acceptable" delta size (< 28 MiB).
squashfs block | casync chunk | store before (bytes) | store after (bytes) | delta (bytes) |
---|---|---|---|---|
8K | 2k | 300193072 | 326454100 | 26261028 |
32K | 2K | 286269080 | 313759400 | 27490320 |
64k | 2K | 279804704 | 303120420 | 23315716 |
64k | 4K | 213521444 | 239195880 | 25674436 |
128k | 2K | 274465020 | 299353036 | 24888016 |
128k | 4K | 208666884 | 235405796 | 26738912 |
128k | 8K | 164161172 | 192140344 | 27979172 |
256k | 2K | 272197548 | 296360464 | 24162916 |
256k | 4K | 207236620 | 231661136 | 24424516 |
256k | 8K | 162952368 | 187819784 | 24867416 |
256k | 12k | 145523996 | 171457008 | 25933012 |
256k | 16K | 136290180 | 163536448 | 27246268 |
512K | 2K | 271066580 | 294758225 | 23691644 |
512K | 4K | 206668208 | 230244976 | 23576768 |
512K | 8K | 162075772 | 185663468 | 23587696 |
512K | 12k | 144542776 | 168996860 | 24454084 |
512K | 16K | 135811844 | 161168396 | 25356552 |
512K | 24K | 126159688 | 153858040 | 27698352 |
Worth noting: when the chunk size is < 8K or so, the disk space allocated for the chunk store is larger than the sum of the sizes of the individual chunk files themselves (filesystem allocation overhead for many small files).
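That allocation overhead is easy to see by comparing the allocated size of the store with its apparent size (GNU du; default.castr/ stands for whatever store directory was used):
$ du -sh default.castr/                   # disk space actually allocated
$ du -sh --apparent-size default.castr/   # sum of the chunk files' own sizes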
I was wondering if it makes more sense to skip squashfs completely and write the chunk store directly to flash. Squashfs has never helped us much since it compresses files individually.
Squashfs doesn't compress files individually. It compresses blocks; a block can contain many files due to tail packing. Side note: that's why you see worse deltas between similar filesystems on squashfs when you use better compression levels. gzip/lzma cause huge deltas.
Btw, mksquashfs just got predictable: https://github.com/plougher/squashfs-tools/commit/e0d74d07bb350e24efd3100d3798f4f6d893a3d9 Maybe an opportunity to reevaluate the situation..
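For whoever re-runs these tests: a sketch of a fully deterministic invocation. -processors 1 comes from earlier in this thread; -mkfs-time/-all-time are assumptions about the timestamp-pinning options in newer squashfs-tools releases that include the change linked above, so double-check the option names against your version:
$ mksquashfs squashfs-root/ foo.squashfs -comp xz -processors 1 -noappend \
      -mkfs-time 0 -all-time 0   # pin all timestamps so repeated builds are byte-identical (assumed options)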
Hello
I just did a little experimenting with casync and squashfs, to check the potential.
So I used the ArchLinux netboot squashfs as the base.
Then I created 2 nearly identical squashfs files from that folder:
So now I have:
Now let's create some casync archives:
So it is reusing (foo.squashfs + foo2.squashfs - total) = 376 + 386 - 625 = 137 MB worth of chunks, or in other words, the client can save 35% of the traffic (137/386*100 ≈ 35%).
35% doesn't seem that high, considering the very minimal changes to the filesystem. Do you think this can be improved?
-- Kristian