quixdb / squash

Compression abstraction library and utilities
https://quixdb.github.io/squash/
MIT License

Support for LZ4HC #209

Closed DarkZeros closed 8 years ago

DarkZeros commented 8 years ago

I would like to have this compressor listed. It has the interesting property of very slow compression, but a good compression ratio and decompression as fast as LZ4.

Or is it already there in the LZ4 high level options?

DarkZeros commented 8 years ago

My fault, it IS there, just hidden:

Level (integer, 1-14, default 7): higher level corresponds to better compression ratio but slower compression speed.

1: LZ4 fast mode (LZ4_compress_fast) acceleration 32
2: LZ4 fast mode acceleration 24
3: LZ4 fast mode acceleration 17
4: LZ4 fast mode acceleration 8
5: LZ4 fast mode acceleration 4
6: LZ4 fast mode acceleration 2
7: The default algorithm (LZ4_compress)
8: LZ4HC level 2
9: LZ4HC level 4
10: LZ4HC level 6
11: LZ4HC level 9
12: LZ4HC level 12
13: LZ4HC level 14
14: LZ4HC level 16
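For anyone else who missed it, the level table above can be written out as a small lookup. This is only an illustrative sketch in Python: the names `LEVEL_TABLE` and `describe_level` are made up for this comment and are not part of Squash or LZ4. The numbers are copied from the option description.

```python
# Hypothetical lookup table, not part of Squash or LZ4.
# Squash level -> (mode, parameter), copied from the option description.
LEVEL_TABLE = {
    1: ("fast", 32),
    2: ("fast", 24),
    3: ("fast", 17),
    4: ("fast", 8),
    5: ("fast", 4),
    6: ("fast", 2),
    7: ("default", None),
    8: ("hc", 2),
    9: ("hc", 4),
    10: ("hc", 6),
    11: ("hc", 9),
    12: ("hc", 12),
    13: ("hc", 14),
    14: ("hc", 16),
}


def describe_level(level: int = 7) -> str:
    """Return a short description of what a Squash lz4 level selects."""
    if level not in LEVEL_TABLE:
        raise ValueError("level must be an integer from 1 to 14")
    mode, param = LEVEL_TABLE[level]
    if mode == "fast":
        return f"LZ4_compress_fast, acceleration {param}"
    if mode == "default":
        return "LZ4_compress"
    # Levels 8 and up select the LZ4HC implementation.
    return f"LZ4HC, level {param}"
```

So asking for LZ4HC through Squash means picking any level from 8 to 14; for example, `describe_level(11)` reports LZ4HC level 9.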

nemequ commented 8 years ago

It's really not supposed to be hidden. Codec names in Squash are used to convey compatibility; e.g., "zlib" from the zlib plugin, the zlib-ng plugin, and the miniz plugin are compatible with one another. LZ4HC isn't really a separate codec; it's a second implementation of the same codec in the lz4 repository (and, of course, the lz4 plugin). At some point I'd like to add a plugin for lz4x, which would also use the "lz4-raw" name.
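To make the plugin/codec distinction concrete, here is a rough Python sketch of the naming scheme. It is not how Squash is implemented; `PROVIDERS` and `plugins_by_codec` are hypothetical names used only for illustration, and the list contains only the plugins and codecs mentioned in this thread.

```python
from collections import defaultdict

# Hypothetical (plugin, codec) pairs, limited to those named in this thread.
PROVIDERS = [
    ("zlib", "zlib"),
    ("zlib-ng", "zlib"),
    ("miniz", "zlib"),
    ("lz4", "lz4"),
    ("lz4", "lz4-raw"),
]


def plugins_by_codec(providers):
    """Group plugin names by the codec name they provide."""
    index = defaultdict(list)
    for plugin, codec in providers:
        index[codec].append(plugin)
    return dict(index)


# Plugins listed under the same codec name produce interchangeable data.
index = plugins_by_codec(PROVIDERS)
```

Here `index["zlib"]` lists three plugins, because all three read and write the same format. LZ4HC does not get an entry of its own because its output is the same format as LZ4's.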

Many libraries have different implementations at different compression levels (instead of just varying settings like block/window size), especially for the fastest and slowest levels. Yann's decision to use a different name in the same repository is a bit weird, but I imagine it's to help convey that it is a different implementation. If, instead of creating lz4hc alongside lz4 in the lz4 repository, he had split lz4hc into a separate library, then we would have an "lz4hc" plugin with "lz4" and "lz4-raw" codecs, just like the "lz4" plugin.

I'm open to ideas about how to make this all easier to understand, but TBH I don't hold out much hope. The LZ4HC name is a bit of an aberration; I really can't think of any other libraries which do something similar, and it makes more sense for Squash to reflect the norm, not the exception.