r-lyeh-archived / bundle

:package: Bundle, an embeddable compression library: DEFLATE, LZMA, LZIP, BZIP2, ZPAQ, LZ4, ZSTD, BROTLI, BSC, CSC, BCM, MCM, ZMOLLY, ZLING, TANGELO, SHRINKER, CRUSH, LZJB and SHOCO streams in a ZIP file (C++03)(C++11)

zlib License

to consider #10

Closed r-lyeh-archived closed 8 years ago

r-lyeh-archived commented 8 years ago

~~richox/zmolly~~

r-lyeh-archived commented 8 years ago

~~https://github.com/fusiyuan2010/CSC - compares to lzma (lzma25 < lzma20 < csc)~~

~~https://code.google.com/p/data-shrinker - compares to lz4 (lz4hc < lz4 < shrinker)~~

mavam commented 8 years ago

> lz4hc < lz4 < shrinker

JFYI: According to my measurements with PCAP traces, shrinker trades slightly lower throughput for a small improvement in compression ratio compared to LZ4. For ASCII data, I observe the opposite.
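For anyone who wants to reproduce this kind of ratio-versus-throughput comparison, here is a minimal sketch. It is not the setup used for the measurements above: it uses two levels of Python's standard `zlib` module as stand-ins for two codecs, and `measure` and the sample payload are hypothetical names and data made up for illustration.

```python
import time
import zlib

def measure(data: bytes, level: int) -> tuple[float, float]:
    """Return (compression ratio, throughput in MB/s) for one zlib level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Sanity check: the round trip must be lossless.
    assert zlib.decompress(packed) == data
    ratio = len(data) / len(packed)
    # Guard against a zero elapsed time on very small inputs.
    throughput = len(data) / 1e6 / max(elapsed, 1e-9)
    return ratio, throughput

if __name__ == "__main__":
    # Hypothetical ASCII-like sample; swap in real PCAP or text data.
    sample = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n" * 20000
    for level in (1, 9):  # stand-ins for a "fast" and a "strong" codec
        ratio, mbps = measure(sample, level)
        print(f"level {level}: ratio {ratio:.1f}x, {mbps:.0f} MB/s")
```

Results depend heavily on the input, which is the point made above: the ordering seen on PCAP traces may flip on ASCII data.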

r-lyeh-archived commented 8 years ago

Yep, it sits somewhere between the two LZ4 variants. Anyway, for only 200 lines of very portable code it's quite a beast :)

r-lyeh-archived commented 8 years ago

http://www.byronknoll.com/cmix.html - zpaq style, improved ratios, too memory hungry (>32 GiB). I tried to reduce the memory requirements and it didn't work.

mavam commented 8 years ago

> zpaq style, improved ratios, too memory hungry (>32 GiB)

Woah, what's the input size you're trying to compress? Are you sure there's not a leak somewhere? Such draconian memory requirements seem a bit off.

r-lyeh-archived commented 8 years ago

enwik8, the canonical 100 MB file widely used for compression tests and benchmarks :)

mavam commented 8 years ago

No disrespect, but it strikes me as highly impractical to keep in-memory state that's bigger than the input by over a factor of 300. Who can afford to use such a library?! Smells like a buggy or flawed implementation.
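The factor of 300 can be checked with quick arithmetic. This sketch assumes 32 GiB as a lower bound on the memory use (the thread only says more than 32 GiB) and shows both readings of the input size: enwik8 as 10^8 bytes, and as 100 MiB. `overhead_factor` is a hypothetical helper name.

```python
GIB = 2**30
MIB = 2**20

state_bytes = 32 * GIB  # lower bound from the thread (">32 GiB")

def overhead_factor(input_bytes: int) -> float:
    """How many times larger the in-memory state is than the input."""
    return state_bytes / input_bytes

print(f"{overhead_factor(100_000_000):.1f}")  # enwik8 as 10^8 bytes -> 343.6
print(f"{overhead_factor(100 * MIB):.1f}")    # input read as 100 MiB -> 327.7
```

Either way the state is more than 300 times the size of the input.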

r-lyeh-archived commented 8 years ago

Well, this is the hardcore side of compression. Done for fun or for research purposes. And totally prohibitive for regular users.

There are even compression contests like this: https://en.wikipedia.org/wiki/Hutter_Prize

I was curious to see what I could achieve by reducing cmix's memory requirements by a factor of 100 (and to see what kind of new (bigger) compression ratio I would get from it).

It didn't work in the end :/, so I have spent the day with other encodings :)

mavam commented 8 years ago

Ok, an academic exercise is always legit. :smile: