ckolivas / lrzip

Long Range Zip
http://lrzip.kolivas.org
GNU General Public License v2.0

zstd support #61

Closed (ole-tange closed 2 years ago)

ole-tange commented 7 years ago

I am horribly impressed by (p)zstd https://github.com/facebook/zstd

The parallelized version is blazingly fast on a multicore system: You are getting gzip compression at lzo speed.

Will that be an option in addition to --zpaq --bzip2 --gzip --lzo --lzma?

ckolivas commented 7 years ago

No, as the rate-limiting step with the faster compressors is the first (rzip) stage. Speeding up the second-stage compression has no effect on overall speed.
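To illustrate the reasoning: if the two stages overlap like a pipeline, wall time is bounded by the slower one. The sketch below is an idealized model only. The function name and the timings are hypothetical, chosen for illustration, and are not measurements of lrzip.

```python
def pipeline_time(stage1_s: float, stage2_s: float) -> float:
    """Idealized wall time for two fully overlapped stages: the slower one dominates."""
    return max(stage1_s, stage2_s)


# Hypothetical timings in seconds (assumptions, not benchmark results).
rzip_stage = 12.0
slower_backend = 6.0
faster_backend = 2.0

# Swapping in a faster backend does not change the total while rzip is the bottleneck.
print(pipeline_time(rzip_stage, slower_backend))
print(pipeline_time(rzip_stage, faster_backend))
```

Under this model a faster second stage only helps once it, rather than the rzip pass, is the slower of the two.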

ckolivas commented 3 years ago

Will reconsider for a later version now, as it will introduce an incompatibility issue with existing versions.

pete4abw commented 3 years ago

You can always do lrzip -n and feed the output to zstd. IMHO adding another compressor that does not improve overall compression is regressive. In fact, using lrzip -L4 is just about as fast as bzip2 and gzip and provides better compression. Remember lrzip's mission. See these test comparison results.
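A minimal sketch of that two-step approach, driven from Python. The flags (`lrzip -n -S .n`, then `zstd -T0`) are the ones used in the test run later in this thread. The function names are hypothetical, and the script assumes both tools are on `PATH`.

```python
import shutil
import subprocess


def rzip_then_zstd_commands(path: str) -> list[list[str]]:
    """Build the two commands: rzip-only pass with lrzip, then zstd on the result."""
    return [
        # -n: rzip stage only, no backend compression; -S .n: use ".n" as the suffix
        ["lrzip", "-n", "-S", ".n", path],
        # -T0: let zstd use all available cores; output is path + ".n.zst"
        ["zstd", "-T0", path + ".n"],
    ]


def run(path: str) -> None:
    """Run both steps, stopping at the first failure."""
    if not (shutil.which("lrzip") and shutil.which("zstd")):
        raise RuntimeError("lrzip and zstd must both be installed")
    for cmd in rzip_then_zstd_commands(path):
        subprocess.run(cmd, check=True)
```

Calling `run("linux-5.10.tar")` would leave `linux-5.10.tar.n` and `linux-5.10.tar.n.zst` next to the input.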

mirh commented 3 years ago

Idk about lrzip's mission, but it seems dumb to still make comparisons with bzip/deflate (and also probably PAQ, but at least that still hasn't been fully Pareto'd yet). The kernel is even planning to remove them now that zstd is a thing.

ckolivas commented 3 years ago

I said nothing about a mission.

pete4abw commented 3 years ago

I said nothing about a mission.

I did. Poor choice of words. "Intent" perhaps would be better: provide better compression. Anyway, gzip and bzip2 are going nowhere. Adding a new method that is no better makes little sense. Zpaq made sense since it can be superior to lzma. JM2C

mirh commented 3 years ago

No better? I'm not sure I underlined enough how zstd is better in all metrics in all possible cases than bzip (maybe also deflate, but I cannot remember a graph covering all the spectrum of options there)

ckolivas commented 3 years ago

I guess you're missing his point as well. Neither bzip2 nor deflate is used by default ever - at all - on lrzip, unless the user goes out of their way to choose it, and no one would today. Where I see it offering a use case is as an alternative when users choose lzo for fast compression: zstd will offer similar speeds but better compression.

pete4abw commented 3 years ago

No better? I'm not sure I underlined enough how zstd is better in all metrics in all possible cases than bzip (maybe also deflate, but I cannot remember a graph covering all the spectrum of options there)

I did a test run. It appears, at least for now, that lrzip does not offer a big advantage for zstd over zstd alone.

| Size (bytes) | Filename | Time (m:ss) | Description |
|---|---|---|---|
| 265297344 | linux-5.10.tar.n.zst | 0:02 | zstd -T0 --fast compressed after lrzip -n |
| 205969542 | linux-5.10.tar.n.zst | 0:04 | zstd -T0 (level 3 is default) compressed after lrzip -n |
| 897427042 | linux-5.10.tar.n | 0:12 | lrzip -n -S .n |
| 174165751 | linux-5.10.tar.lrz | 0:22 | lrzip -L4 |
| 158270670 | linux-5.10.tar.n.zst | 1:22 | zstd -17 -T0 compressed after lrzip -n |
| 160005503 | linux-5.10.tar.zst | 1:38 | zstd -17 -T0 |
| 143647662 | linux-5.10.tar.lrz | 2:23 | lrzip -L9 |
| 149423135 | linux-5.10.tar.n.zst | 5:23 | zstd -T0 --ultra -22 (max) compressed after lrzip -n |
| 1158307840 | linux-5.10.tar | N/A | Original File |
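To make the rows easier to compare, the sizes can be expressed as a fraction of the original and the times converted to seconds. This is a small sketch using only the figures reported above. The helper name `seconds` is hypothetical, and the thread does not say whether the "after lrzip -n" times include the 0:12 rzip pass.

```python
ORIGINAL = 1158307840  # linux-5.10.tar, in bytes

# (description, compressed size in bytes, time as m:ss), copied from the results above
results = [
    ("zstd -T0 --fast after lrzip -n", 265297344, "0:02"),
    ("zstd -T0 after lrzip -n", 205969542, "0:04"),
    ("lrzip -L4", 174165751, "0:22"),
    ("zstd -17 -T0 after lrzip -n", 158270670, "1:22"),
    ("zstd -17 -T0", 160005503, "1:38"),
    ("lrzip -L9", 143647662, "2:23"),
    ("zstd -T0 --ultra -22 after lrzip -n", 149423135, "5:23"),
]


def seconds(mss: str) -> int:
    """Convert an 'm:ss' string to a number of seconds."""
    minutes, secs = mss.split(":")
    return int(minutes) * 60 + int(secs)


for name, size, elapsed in results:
    ratio = size / ORIGINAL
    print(f"{name:38s} {ratio:6.1%} of original  {seconds(elapsed):4d} s")
```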

EDIT: Added more comparisons, including lrzip level 4, which overall provides the best balance of time and compression. lrzip -L9 is faster (by 3 minutes) and compresses better than zstd --ultra -22. Other than being able to say "we can do it", which of course we could (there is a libzstd), I just don't see the reason. In fact, an argument can be made that none of the other compression modes are even needed. Between the multi-threading and the speed of lzma relative to others (cf. zpaq), why use anything else? Oh, and in the lzma SDK 19, there is assembler decompression which is up to 40% faster.

mirh commented 3 years ago

Interesting. I wonder if that is also the case with --fast, --ultra or --long.

pete4abw commented 3 years ago

Interesting. I wonder if that is also the case with --fast, --ultra or --long.

See above edits