llogiq / compressbench

A benchmark of Rust compression libraries
Apache License 2.0

Rust compression libraries article #1

Open VladimirMarkelov opened 3 years ago

VladimirMarkelov commented 3 years ago

Hi,

the article https://blog.logrocket.com/rust-compression-libraries/ is a great summary. Thank you!

However, I think the numbers for zip are misleading or incorrect. Two examples:

Table Random: I assume this is about compressing a file of random bytes. The original size is 104,857,600 bytes, and none of the other compressors can shrink it, leaving the file as-is, while zip somehow compresses 100 MB of random data into a mere 63,868 bytes.

The same applies to a few other tables. For example, with the rustc binary (3 MB), only zip was able to compress it to about 50K, while the closest rival only got down to about 500K. That is suspicious.

SpecificProtagonist commented 3 years ago

The problem is that the benchmark takes the position of the Cursor passed to the zip library (the whole point of a Cursor is that it's seekable) as the compressed size, rather than the length of the compressed data.
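A minimal sketch of how those two values can diverge, using only the standard library. This is not the benchmark's actual code, and `position_vs_len` is a hypothetical helper; it only shows that once a writer has seeked, the Cursor's position no longer says anything about how many bytes the buffer holds.

```rust
use std::io::{Cursor, Seek, SeekFrom, Write};

/// Hypothetical helper: writes `payload` into an in-memory Cursor, seeks back
/// to `rewind_to` (as any seekable writer is free to do), and returns
/// (cursor position, length of the underlying buffer).
fn position_vs_len(payload: &[u8], rewind_to: u64) -> (u64, usize) {
    let mut cursor = Cursor::new(Vec::new());
    cursor.write_all(payload).unwrap();
    // Seeking moves the position but leaves the written bytes in place.
    cursor.seek(SeekFrom::Start(rewind_to)).unwrap();
    (cursor.position(), cursor.get_ref().len())
}

fn main() {
    let (pos, len) = position_vs_len(&[0u8; 1000], 42);
    // Only `len` reflects how much data was actually written.
    println!("position = {pos}, len = {len}"); // prints "position = 42, len = 1000"
}
```

Measuring the output with `get_ref().len()` (or `into_inner().len()`) avoids depending on where the writer happened to leave the position.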

Also, it looks like the lz4_flex and lzzzz decompression got optimised out. I don't know why, maybe criterion::black_box is broken?
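One common way to keep the optimiser from discarding such work is to pass both the input and the result through a black box. The sketch below uses the standard library's `std::hint::black_box` (stable since Rust 1.66) rather than `criterion::black_box`. The names `rle_decode` and `time_decode` are hypothetical: `rle_decode` is a toy stand-in for a real decompressor, not anything from lz4_flex or lzzzz. Whether this would change the results in this benchmark is untested.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a decompressor: expands (count, byte) pairs.
fn rle_decode(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in input.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

/// Times `iters` decode calls. The input is passed through `black_box` so the
/// compiler cannot specialise on it, and the output is passed through
/// `black_box` so the call cannot be treated as dead code.
fn time_decode(input: &[u8], iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        let out = rle_decode(black_box(input));
        black_box(out);
    }
    start.elapsed()
}

fn main() {
    let input = [3, b'a', 2, b'b'];
    assert_eq!(rle_decode(&input), b"aaabb".to_vec());
    println!("1000 iterations took {:?}", time_decode(&input, 1_000));
}
```

Note that `black_box` is documented as a best-effort hint, so checking the decompressed length or a checksum after the timed loop is a useful extra safeguard.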

samwho commented 3 years ago

It also appears that the picture of your cat and the movie "Hackers" are in formats that are already compressed, and so don't show much improvement when compressed a second time. These may not be the best files to use for benchmarking purposes.

It would be great to see how well the benchmarks perform with data such as typical JSON/HTML web responses, and log file formats. 🙂