scylladb / scylla-cluster-tests

Tests for Scylla Clusters

Compress Scylla coredumps with something faster than gzip #7234

Closed: michoecho closed this issue 3 months ago

michoecho commented 6 months ago

Please compress the core files with zstd (pzstd) instead of gzip. They should have similar compression ratios, but zstd compresses and decompresses several times faster than zlib.
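For illustration, a minimal sketch of the requested swap, assuming pzstd is installed on the node and using a placeholder core file name (this is not SCT's actual compression code):

# current: single-threaded gzip
gzip core.scylla.<id>

# requested: parallel zstd; pzstd splits its output into independent
# frames, so decompression can also run in parallel
# (use -p N to set the number of worker threads)
pzstd core.scylla.<id>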

mykaul commented 6 months ago

Also see https://github.com/scylladb/qa-tasks/issues/1372

mykaul commented 6 months ago

And perhaps more importantly - https://github.com/scylladb/scylla-machine-image/issues/462

fruch commented 6 months ago

We'll probably defer to SMI to implement it, and then SCT will identify the compressed core and send it as is.

soyacz commented 6 months ago

> We'll probably defer to SMI to implement it, and then SCT will identify the compressed core and send it as is.

We don't always use SMI, though, so it's possibly worth fixing it anyway.

mykaul commented 4 months ago

Example:

ykaul@ykaul:~/Downloads$ du -ch core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz 
443M    core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz
443M    total
ykaul@ykaul:~/Downloads$ pigz -d core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz 
ykaul@ykaul:~/Downloads$ du -ch core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
58G core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000
58G total
ykaul@ykaul:~/Downloads$ zstd core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.25%   (  57.4 GiB =>    145 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 
ykaul@ykaul:~/Downloads$ du -ch core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst 
145M    core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst
145M    total

It's not only faster, but requires substantially less storage.

nyh commented 4 months ago

@mykaul you didn't show the time measurement. Note that I think with zstd you need to pass an option (e.g., -T0) to make it use all the cores; otherwise it uses just one (but please correct me if I'm wrong).

But the size saving is indeed impressive. My guess is that it is related to the amazing, and possibly uncommon, compression ratio achieved in this case (400x!). I guess the ANS entropy coding beats the pants off the old Huffman coding used by gzip in this case. I'm not sure the compression of our core files will be this impressive in every case, or show such a dramatic improvement of zstd over gzip.

mykaul commented 4 months ago

Right - I did not really look at compression/decompression time, for a few reasons:

  1. I used pigz to decompress; I could probably have used it for compression too, for a fair comparison between them.
  2. I don't care THAT much about compression times; I care more about how much time I save on the transfer from the node and, more importantly, to the developer's laptop!
  3. The savings on Google Storage also matter more than the time to compress.

Time to compress is indeed relevant if we care how long Scylla is down. I assume (from past experience) that zstd is as fast as or faster than gzip. Since it writes hundreds of MBs less to disk, I assume the overall process is faster anyway.

mykaul commented 4 months ago

Odd, but OK:

ykaul@ykaul:~/Downloads$ time pigz -d core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz 

real    0m50.731s
user    0m17.107s
sys 0m38.388s
ykaul@ykaul:~/Downloads$ time zstd core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.25%   (  57.4 GiB =>    145 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real    0m33.472s
user    0m26.812s
sys 0m16.672s
ykaul@ykaul:~/Downloads$ time zstd -T0 core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
zstd: core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst already exists; overwrite (y/n) ? y
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.25%   (  57.4 GiB =>    145 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real    0m35.666s
user    0m28.647s
sys 0m16.491s

mykaul commented 4 months ago

And now, just for completeness, and then I'm done:

ykaul@ykaul:~/Downloads$ time zstd --long core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
zstd: core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst already exists; overwrite (y/n) ? y
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.22%   (  57.4 GiB =>    129 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real    1m24.424s
user    1m23.063s
sys 0m16.685s
ykaul@ykaul:~/Downloads$ time zstd -5  core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
zstd: core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst already exists; overwrite (y/n) ? y
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.23%   (  57.4 GiB =>    138 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real    0m35.549s
user    0m30.594s
sys 0m15.575s
michoecho commented 4 months ago

> I used pigz to decompress; I could probably have used it for compression too, for a fair comparison between them.

Tangential note: gzip decompression can't be parallelized; the file format doesn't allow it. Every piece of output depends on all of the output before it. (So using pigz for decompression doesn't really speed it up.)

zstd by default also produces a file with inter-block dependencies, which can't be decompressed in parallel.

But pzstd produces an output file which is split into independent blocks, and can be decompressed in parallel. That's why I mentioned it in the OP.
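A quick sketch of what that looks like, with a placeholder file name:

pzstd -p 8 core.scylla.<id>          # compress into independent frames -> core.scylla.<id>.zst
pzstd -d -p 8 core.scylla.<id>.zst   # decompression of those frames also runs in parallel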

The kernel allows you to pass the core to an arbitrary compression command, so you could use pzstd for this. But systemd-coredump only has a fixed set of compression options. So I guess if we are stuck with systemd-coredump, then we can't make use of pzstd (or even zstd -T0)...
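For reference, a rough sketch of the core_pattern route (everything here is illustrative: the wrapper path, the output directory, and the assumption that pzstd handles '-' for stdin and '-o FILE' the way plain zstd does):

/usr/local/bin/compress-core.sh (hypothetical wrapper; the kernel pipes the core image to its stdin):

#!/bin/sh
# $1 = executable name (%e), $2 = PID (%p)
# assumes pzstd accepts '-' for stdin and '-o FILE' like plain zstd
exec /usr/bin/pzstd -q -o "/var/lib/coredumps/core.$1.$2.zst" -

Point the kernel at the wrapper instead of systemd-coredump:

sudo sysctl -w kernel.core_pattern='|/usr/local/bin/compress-core.sh %e %p'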

> Time to compress is indeed relevant if we care how long Scylla is down. I assume (from past experience) that zstd is as fast as or faster than gzip. Since it writes hundreds of MBs less to disk, I assume the overall process is faster anyway.

The consensus around the internet seems to be that zstd handily beats gzip in all performance aspects, and I see no reason to doubt it.

fruch commented 3 months ago

This is going to be handled by https://github.com/scylladb/scylla-machine-image/issues/462.

We'll have some work in SCT to adapt, so that we don't compress the core again, but that's it; the rest will work out of the box.
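A minimal sketch of the kind of check SCT could do before compressing (the variable name and extension list are illustrative, not SCT's actual code):

case "$corefile" in
  *.zst|*.lz4|*.xz|*.gz) echo "core already compressed, uploading as-is" ;;
  *) pzstd -q "$corefile" && corefile="$corefile.zst" ;;
esac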