openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Bump ZSTD to v1.5.0 #12081

Open krnaveen14 opened 3 years ago

krnaveen14 commented 3 years ago

Describe the feature you would like to see added to OpenZFS

ZSTD compression support was added in ZFS v2.0 (many thanks!). Currently ZSTD v1.4.5 is used, which itself has a lot of speed and compression improvements.

ZSTD v1.4.7, v1.4.9, and v1.5.0 include significant compression and decompression speedups, along with small compression ratio improvements, which can benefit the greater community.

How will this feature improve OpenZFS?

ZSTD v1.5.0 can help narrow the performance gap between the LZ4 and ZSTD compression options a little.

rincebrain commented 3 years ago

#11367

gdevenyi commented 1 year ago

#15246

peterdk commented 1 year ago

Since #15246 is a dupe, I just want to reiterate here that I experienced a speed increase from 250MB/s -> 550MB/s sustained write on RAIDZ1 4 SATA SSDs on a 16 core Ryzen 5950X with ZSTD-11 compression. See #15246 for more details. The only thing I changed was the bundled ZSTD version from 1.4.x to 1.5.5. Seems like a big win.

rincebrain commented 1 year ago

Yes, that's probably due to the performance improvement in reworking part of the compression pipeline they added in...I want to say 1.5.1, for levels 7-11, plus fixing a regression I reported in handling incompressible data.

Of course, it turns out to be trickier than just replacing the version 1:1, since you have to be able to handle cases where the data gets recompressed to the old version or you get exciting failures. But I have a branch that does that, which is where I found the regression in question. (See facebook/zstd#3552 for context.)

Cyan4973 commented 1 year ago

I was reminded of this topic by a colleague, and used the opportunity to review @rincebrain's excellent presentation "Refining OpenZFS Compression", which presents a few results, among them an attempt at updating zstd.

It also presents a nice compressibility detection mechanism to improve compression speed on incompressible data, on the grounds that zstd, unlike lz4, doesn't have an early abort mechanism. Thing is, it does. But to be more complete: in contrast with lz4, it can be overwhelmed by other parts of the system as the level increases.

To be more specific, here are the compression speed results at various compression levels using zstd 1.5.5 on an already compressed file cut into blocks of 128 KB, on an old-ish core i7-9700 without turbo:

 1#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 2907.5 MB/s, 8802.1 MB/s
 2#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 2902.0 MB/s, 8812.9 MB/s
 3#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 2667.6 MB/s, 8816.2 MB/s
 4#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 2636.4 MB/s, 8808.6 MB/s
 5#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 1723.2 MB/s, 8810.7 MB/s
 6#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 1713.2 MB/s, 8807.1 MB/s
 7#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 1729.8 MB/s, 8805.8 MB/s
 8#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 1729.2 MB/s, 8822.4 MB/s
 9#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 1668.4 MB/s, 8813.3 MB/s
10#lesia.tar.L19.zst :  52990423 ->  52995281 (x1.000), 1497.0 MB/s, 8822.8 MB/s
11#lesia.tar.L19.zst :  52990423 ->  52990390 (x1.000),  353.7 MB/s, 8798.8 MB/s
12#lesia.tar.L19.zst :  52990423 ->  52990390 (x1.000),  352.1 MB/s, 8797.5 MB/s
13#lesia.tar.L19.zst :  52990423 ->  52987392 (x1.000),   30.8 MB/s, 8775.0 MB/s
14#lesia.tar.L19.zst :  52990423 ->  52987053 (x1.000),   23.4 MB/s, 8757.6 MB/s
15#lesia.tar.L19.zst :  52990423 ->  52987053 (x1.000),   23.4 MB/s, 8757.6 MB/s
16#lesia.tar.L19.zst :  52990423 ->  52983989 (x1.000),   22.9 MB/s, 8739.6 MB/s
17#lesia.tar.L19.zst :  52990423 ->  52983989 (x1.000),   22.9 MB/s, 8723.3 MB/s
18#lesia.tar.L19.zst :  52990423 ->  52983989 (x1.000),   22.9 MB/s, 8734.1 MB/s
19#lesia.tar.L19.zst :  52990423 ->  52981180 (x1.000),   11.8 MB/s, 8699.0 MB/s

As one can see, the internal early abort of zstd is very good for levels 1-4, and still reasonably effective up to level 10. Starting at level 11, though, it quickly gets worse, as the internal skipping is no longer enough to compensate for other parts of the system, such as the binary tree maintenance.

Based on these observations, an external compressibility detector such as @rincebrain's lz4 + zstd-1 seems a good idea. I would just recommend employing it only for levels 11+, as it seems zstd alone might actually be faster without these pre-detection steps.
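To make the suggestion concrete, here is a minimal sketch of gating an external compressibility probe by level. It is not the OpenZFS implementation: Python's standard library has no lz4 or zstd bindings, so `zlib` at level 1 stands in for the cheap lz4 + zstd-1 passes and `lzma` stands in for a high zstd level. The names `PROBE_FROM_LEVEL`, `MIN_SAVING`, `cheap_probe` and `compress_block` are hypothetical, as is the 12.5% saving cutoff.

```python
import lzma
import os
import zlib

# Hypothetical gate: only run the external probe at levels where the
# compressor's own early abort is no longer cheap (11+ in the table above).
PROBE_FROM_LEVEL = 11
# Hypothetical cutoff: the probe must save at least this fraction of the input.
MIN_SAVING = 0.125


def cheap_probe(block: bytes) -> bool:
    """One fast pass standing in for the lz4 + zstd-1 passes (assumption)."""
    budget = len(block) - int(len(block) * MIN_SAVING)
    return len(zlib.compress(block, 1)) <= budget


def expensive_compress(block: bytes) -> bytes:
    """Stand-in for a slow, high zstd level (assumption)."""
    return lzma.compress(block, preset=6)


def compress_block(block: bytes, level: int) -> tuple[str, bytes]:
    """Return a tag describing what happened, plus the bytes to store."""
    if level >= PROBE_FROM_LEVEL and not cheap_probe(block):
        # The probe judged the block incompressible: skip the slow compressor.
        return ("skipped", block)
    packed = expensive_compress(block)
    if len(packed) >= len(block):
        # The slow compressor ran but did not shrink the block.
        return ("stored", block)
    return ("compressed", packed)


if __name__ == "__main__":
    random_block = os.urandom(128 * 1024)  # incompressible 128 KiB block
    zero_block = bytes(128 * 1024)         # highly compressible block
    print(compress_block(random_block, 19)[0])
    print(compress_block(random_block, 3)[0])
    print(compress_block(zero_block, 19)[0])
```

Below the gate the block goes straight to the compressor, relying on its internal early abort; at or above the gate, an incompressible block never reaches the slow path.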

Note: to be fair, while an early-abort mechanism has always been there within zstd, it has also been improved in recent versions, which could lead to a different conclusion. Here is the same test using v1.4.5:

 1#lesia.tar.L19.zst :  52990423 ->  52995281 (1.000),1637.6 MB/s ,8856.1 MB/s
 2#lesia.tar.L19.zst :  52990423 ->  52995281 (1.000),1644.2 MB/s ,8851.6 MB/s
 3#lesia.tar.L19.zst :  52990423 ->  52995281 (1.000),1523.6 MB/s ,8864.5 MB/s
 4#lesia.tar.L19.zst :  52990423 ->  52995281 (1.000),1515.6 MB/s ,8859.5 MB/s
 5#lesia.tar.L19.zst :  52990423 ->  52990457 (1.000), 406.5 MB/s ,8835.4 MB/s
 6#lesia.tar.L19.zst :  52990423 ->  52990389 (1.000), 402.4 MB/s ,8849.2 MB/s
 7#lesia.tar.L19.zst :  52990423 ->  52990388 (1.000), 404.6 MB/s ,8833.9 MB/s
 8#lesia.tar.L19.zst :  52990423 ->  52990386 (1.000), 404.6 MB/s ,8836.3 MB/s
 9#lesia.tar.L19.zst :  52990423 ->  52990386 (1.000), 404.5 MB/s ,8841.4 MB/s
10#lesia.tar.L19.zst :  52990423 ->  52990386 (1.000), 404.5 MB/s ,8837.1 MB/s
11#lesia.tar.L19.zst :  52990423 ->  52990389 (1.000), 331.6 MB/s ,8831.0 MB/s
12#lesia.tar.L19.zst :  52990423 ->  52990389 (1.000), 327.7 MB/s ,8841.8 MB/s
13#lesia.tar.L19.zst :  52990423 ->  52987392 (1.000),  36.5 MB/s ,8826.6 MB/s
14#lesia.tar.L19.zst :  52990423 ->  52986997 (1.000),  25.4 MB/s ,8800.1 MB/s
15#lesia.tar.L19.zst :  52990423 ->  52986997 (1.000),  25.4 MB/s ,8801.9 MB/s
16#lesia.tar.L19.zst :  52990423 ->  52983946 (1.000),  25.5 MB/s ,8769.3 MB/s
17#lesia.tar.L19.zst :  52990423 ->  52983946 (1.000),  25.5 MB/s ,8787.2 MB/s
18#lesia.tar.L19.zst :  52990423 ->  52983946 (1.000),  25.5 MB/s ,8773.2 MB/s
19#lesia.tar.L19.zst :  52990423 ->  52981440 (1.000),  13.1 MB/s ,8742.5 MB/s

One can see that, for this version, only levels 1-4 are pretty good. Hence a compressibility pre-detection stage feels justified starting at level 5.
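The two cutoffs can be read off the tables mechanically. The sketch below copies the compression speeds (MB/s) from both runs and finds the first level whose speed drops below a fraction of the level 1 speed. The one-half fraction and the name `first_slow_level` are assumptions chosen for illustration, not something the benchmark itself defines.

```python
# Compression speeds in MB/s for levels 1..19, copied from the two tables above.
SPEEDS_155 = [2907.5, 2902.0, 2667.6, 2636.4, 1723.2, 1713.2, 1729.8, 1729.2,
              1668.4, 1497.0, 353.7, 352.1, 30.8, 23.4, 23.4, 22.9, 22.9, 22.9,
              11.8]
SPEEDS_145 = [1637.6, 1644.2, 1523.6, 1515.6, 406.5, 402.4, 404.6, 404.6,
              404.5, 404.5, 331.6, 327.7, 36.5, 25.4, 25.4, 25.5, 25.5, 25.5,
              13.1]


def first_slow_level(speeds, fraction=0.5):
    """Return the first level slower than `fraction` of level 1, or None.

    `fraction` is a hypothetical cutoff for "early abort no longer cheap".
    """
    threshold = speeds[0] * fraction
    for level, speed in enumerate(speeds, start=1):
        if speed < threshold:
            return level
    return None


if __name__ == "__main__":
    print("v1.5.5:", first_slow_level(SPEEDS_155))
    print("v1.4.5:", first_slow_level(SPEEDS_145))
```

With the figures quoted above and a one-half cutoff, this gives level 11 for v1.5.5 and level 5 for v1.4.5, matching the thresholds suggested for each version. A different cutoff would move the boundary, so treat it as a rough reading aid rather than a tuning rule.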

rincebrain commented 1 year ago

I did know zstd had such a mechanism, and I should have been more precise in my discussion, my apologies.

It was framed around stealing the lz4 mechanisms for this specifically because the folk wisdom around ZFS is that you should use lz4 because of its unique early abort, which is often misunderstood and conflated with ZFS handing a smaller output buffer to the compressor on the assumption that it will error out if it will run off the end there.
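For readers unfamiliar with the "smaller output buffer" behavior, here is a rough sketch of the idea. It is not ZFS code: `zlib` stands in for the real compressor, the function name `compress_with_budget` is hypothetical, and the budget of seven eighths of the input (a 12.5% minimum saving) is an assumption used only for illustration.

```python
import os
import zlib


def compress_with_budget(block: bytes, budget: int, chunk: int = 4096):
    """Compress `block`, giving up once the output exceeds `budget` bytes.

    Returns the compressed bytes, or None if the output would not fit.
    The streaming loop simulates a fixed-size output buffer.
    """
    comp = zlib.compressobj(6)
    out = bytearray()
    for off in range(0, len(block), chunk):
        out += comp.compress(block[off:off + chunk])
        if len(out) > budget:
            return None  # ran off the end of the simulated output buffer
    out += comp.flush()
    return bytes(out) if len(out) <= budget else None


if __name__ == "__main__":
    size = 128 * 1024
    budget = size - (size >> 3)  # assumed: output must fit in 7/8 of the input
    print(compress_with_budget(os.urandom(size), budget) is None)
    print(compress_with_budget(bytes(size), budget) is not None)
```

The caller treats a `None` result as "store the block uncompressed". Note that the compressor still does work up to the point where the buffer overflows, which is why this differs from a true early abort that detects incompressible input cheaply.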

The early abort mechanism described is present in OpenZFS 2.2 and some people (myself included) have been using a backport of it on the 2.1 tree, to great effect, so it is helpfully working as intended.

If I recall, I picked zstd-3 specifically because it was the default level, and the cost/benefit tradeoff seemed within noise for running it on anything above zstd-2, so it seemed like a reasonable starting point.