r-lyeh-archived / bundle

Bundle, an embeddable compression library: DEFLATE, LZMA, LZIP, BZIP2, ZPAQ, LZ4, ZSTD, BROTLI, BSC, CSC, BCM, MCM, ZMOLLY, ZLING, TANGELO, SHRINKER, CRUSH, LZJB and SHOCO streams in a ZIP file (C++03)(C++11)
zlib License

new contenders (enwik8 tests) #13

Closed by r-lyeh-archived 8 years ago

r-lyeh-archived commented 8 years ago
// tangelo 2.4, faster zpaq, less filters
~/> tangelo c enwik8 enwik8.tangelo
Compressed from 100000000 -> 21031051 bytes. 97.24 secs

// bcm
~/> bcm enwik8 enwik8.bcm
enwik8: 100000000 -> 22531851 in 26.419s
26.5055 s.

// zmolly 0.0.1
~/> zmolly-x86_64 -e enwik8 enwik8.zmolly
encode-block: 16777216 => 4074603
encode-block: 16777216 => 3940888
encode-block: 16777216 => 3935657
encode-block: 16777216 => 3935740
encode-block: 16777216 => 3924577
encode-block: 16113920 => 3752436
100000000 -> 23563956
16.9817 s.

// libzling (fast)
~/> zling e0 enwik8 enwik8.zling0
encode: 100000000 => 32378861, time=3.732 sec, speed=26.795 MB/sec

// libzling (extra)
~/> zling e4 enwik8 enwik8.zling4
encode: 100000000 => 30707658, time=6.457 sec, speed=15.487 MB/sec

// MCM v0.83 (mode=6)
mcm -t ..\data\enwik8 ..\data\enwik8.mcmt
100,000,000->21,020,366 in 39.107s
mcm -f ..\data\enwik8 ..\data\enwik8.mcmf
100,000,000->19,922,132 in 42.336s
mcm -m ..\data\enwik8 ..\data\enwik8.mcmm
100,000,000->19,201,503 in 50.718s
mcm -h ..\data\enwik8 ..\data\enwik8.mcmh
100,000,000->19,166,857 in 60.895s
mcm -x ..\data\enwik8 ..\data\enwik8.mcmx
100,000,000->19,057,576 in 69.599s

// MCM v0.83 (mode=9)
mcm -t9 ..\data\enwik8 ..\data\enwik8.mcmt
100,000,000->20,180,965 in 44.387s
mcm -f9 ..\data\enwik8 ..\data\enwik8.mcmf
100,000,000->19,305,599 in 47.988s
mcm -m9 ..\data\enwik8 ..\data\enwik8.mcmm
100,000,000->18,624,824 in 53.92700s
mcm -h9 ..\data\enwik8 ..\data\enwik8.mcmh
100,000,000->18,501,425 in 70.372s
mcm -x9 ..\data\enwik8 ..\data\enwik8.mcmx
100,000,000->18,377,349 in 77.48600s

// discarded

// crush (extra)
~/> crush cx enwik8 enwik8.crush
Compressing enwik8...
100000000 -> 31731711 in 489.281s
489.361 s.

// crush (fast)
~/> crush cf enwik8 enwik8.crushf
Compressing enwik8...
100000000 -> 37308893 in 5.235s
5.33109 s.
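
The zmolly and zling figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of any of the tools: the function names are made up for illustration, and the numbers are copied from the logs in this comment. It sums the zmolly per-block sizes and recomputes the speed that zling prints, assuming 1 MB = 10^6 bytes.

```python
# Per-block sizes copied from the zmolly log above: (input bytes, output bytes).
ZMOLLY_BLOCKS = [
    (16777216, 4074603),
    (16777216, 3940888),
    (16777216, 3935657),
    (16777216, 3935740),
    (16777216, 3924577),
    (16113920, 3752436),
]
ZMOLLY_REPORTED_TOTAL = 23563956  # final size reported after the block lines


def block_totals(blocks):
    """Return (total input bytes, total output bytes) over all blocks."""
    total_in = sum(src for src, _ in blocks)
    total_out = sum(dst for _, dst in blocks)
    return total_in, total_out


def throughput_mb_per_s(input_bytes, seconds):
    """Compression speed in MB/s, assuming 1 MB = 10**6 bytes."""
    return input_bytes / seconds / 1e6


total_in, total_out = block_totals(ZMOLLY_BLOCKS)
overhead = ZMOLLY_REPORTED_TOTAL - total_out
print(f"zmolly blocks: {total_in} -> {total_out} (+{overhead} bytes outside the blocks)")
print(f"zling e0: {throughput_mb_per_s(100_000_000, 3.732):.3f} MB/s")
print(f"zling e4: {throughput_mb_per_s(100_000_000, 6.457):.3f} MB/s")
```

The block inputs add up to the full 100000000 bytes of enwik8, and the block outputs fall 55 bytes short of the reported total. That gap is presumably container or header overhead, though the log itself does not say.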

r-lyeh-archived commented 8 years ago

also with xwrt filters:

// FILTERS

xwrt.exe -l14 -b255 -m96 -s -e40000 -f200 enwik8
XWRT 3.2 (29.10.2007) - XML compressor by P.Skibinski, inikep@gmail.com
* Compression level=14
* Maximum buffer for creating dynamic dictionary size is 255 MB
* Maximum memory buffer size is 96 MB
* Spaces modeling is off
* Maximum dictionary size is 40000
* Minimal word frequency is 200
enwik8.xwrt already exists, overwrite?: <Y>es, <N>o, <A>ll, <Q>uit?
- encoding enwik8 to enwik8.xwrt (lpaq6 1542 MB)
 + dynamic dictionary 399333/16711680 words
 + loaded dictionary 7065/559168 words
 + dynamic dictionary 7065/559168 words
 + encoding finished (100000000->18679742 bytes, 1.494 bpc) in 121.33s (805 kb/s)

xwrt.exe -1 -b255 -m96 -e40000 -f200 enwik8
XWRT 3.2 (29.10.2007) - XML compressor by P.Skibinski, inikep@gmail.com
* LZMA/BWT optimized preprocessing
* Maximum buffer for creating dynamic dictionary size is 255 MB
* Maximum memory buffer size is 96 MB
* Maximum dictionary size is 40000
* Minimal word frequency is 200
enwik8.xwrt already exists, overwrite?: <Y>es, <N>o, <A>ll, <Q>uit?
- encoding enwik8 to enwik8.xwrt (store)
 + dynamic dictionary 399333/16711680 words
 + loaded dictionary 7068/101472 words
 + dynamic dictionary 7068/101472 words
 + encoding finished (100000000->54207269 bytes, 4.337 bpc) in 11.88s (8217 kb/s)

bundler: Bundler 2.0.9 (RELEASE). Compiled on Oct 29 2015 - https://github.com/r-lyeh/bundler
[ OK ] enwik8.xwrt: 54207269 -> 23014895 (42.4572%) (LZMA25); 43.478 secs
[ OK ] enwik8.xwrt: 54207269 -> 20748926 (38.277%) (BSC); 29.237 secs
[ OK ] enwik8.xwrt: 54207269 -> 25982012 (47.9309%) (BROTLI9); 126.773 secs

..\data\bundler.exe: Bundler 2.0.9 (RELEASE). Compiled on Oct 29 2015 - https://github.com/r-lyeh/bundler
[ OK ] enwik8.xwrt: 44480481 -> 21466716 (48.261%) (BSC); 27.642 secs
[ OK ] enwik8.xwrt: 44480481 -> 34502508 (77.5677%) (LZ4F); 1.198 secs
[ OK ] enwik8.xwrt: 44480481 -> 25269086 (56.8094%) (BROTLI9); 131.654 secs
[ OK ] enwik8.xwrt: 44480481 -> 20475765 (46.0331%) (ZPAQ); 343.908 secs

// XWRT 3.2 (filter) - XML compressor by P.Skibinski, inikep@gmail.com
~/> xwrt.exe enwik8
- encoding enwik8 to enwik8.xwrt (zlib normal)
warning: dictionary too big, you can use -b option to increase buffer size
 + dynamic dictionary 351236/524288 words
 + loaded dictionary 90403/143962 words
 + dynamic dictionary 90403/143962 words
 + encoding finished (100000000->26933107 bytes, 2.155 bpc) in 11.28s (8657 kb/s)
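
Two figures in these logs are easy to misread: the `bpc` value printed by XWRT, and the percentages printed by Bundler, which are relative to the already filtered file rather than to enwik8. The sketch below recomputes both. It assumes the three Bundler results for the 54207269-byte input were produced from the `-1` (store) XWRT run shown above, since the sizes match; the helper names are hypothetical.

```python
ORIGINAL_SIZE = 100_000_000  # enwik8, in bytes


def bits_per_char(compressed_bytes, original_bytes=ORIGINAL_SIZE):
    """Bits of output per input byte (the 'bpc' figure in the XWRT logs)."""
    return 8 * compressed_bytes / original_bytes


def end_to_end_percent(final_bytes, original_bytes=ORIGINAL_SIZE):
    """Final size as a percentage of the unfiltered input."""
    return 100 * final_bytes / original_bytes


# XWRT output sizes copied from the logs above.
for label, size in [("xwrt -l14 (lpaq6)", 18679742),
                    ("xwrt -1 (store)", 54207269),
                    ("xwrt default (zlib normal)", 26933107)]:
    print(f"{label}: {bits_per_char(size):.3f} bpc")

# Second-stage sizes for the 54207269-byte filtered file, from the Bundler log.
# Assumption: the filter time is the 11.88 s of the matching XWRT run.
FILTER_SECONDS = 11.88
for codec, size, seconds in [("LZMA25", 23014895, 43.478),
                             ("BSC", 20748926, 29.237),
                             ("BROTLI9", 25982012, 126.773)]:
    print(f"xwrt + {codec}: {end_to_end_percent(size):.2f}% of enwik8, "
          f"{FILTER_SECONDS + seconds:.2f} s total")
```

Measured against the original 100000000 bytes, the filtered BSC result is about 20.75%, not the 38.277% that Bundler reports against the filtered input.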

r-lyeh-archived commented 8 years ago

for comparison, the current status is:

bundler: Bundler 2.0.9 (RELEASE). Compiled on Oct 29 2015 - https://github.com/r-lyeh/bundler

~/> bundler p enwik8.lz4 enwik8 -u lz4
[ OK ] enwik8: 100000000 -> 42196883 (42.1969%) (LZ4); 7.547 secs

~/> bundler p enwik8.lz4f enwik8 -u lz4f
[ OK ] enwik8: 100000000 -> 56973113 (56.9731%) (LZ4F); 2.511 secs

~/> bundler p enwik8.deflate enwik8 -u deflate
[ OK ] enwik8: 100000000 -> 36459842 (36.4598%) (MINIZ); 12.493 secs

~/> bundler p enwik8.lzma20 enwik8 -u lzma20
[ OK ] enwik8: 100000000 -> 28686662 (28.6867%) (LZMA20); 63.419 secs

~/> bundler p enwik8.lzma25 enwik8 -u lzma25
[ OK ] enwik8: 100000000 -> 25395641 (25.3956%) (LZMA25); 114.631 secs

~/> bundler p enwik8.lzip enwik8 -u lzip
[ OK ] enwik8: 100000000 -> 26517401 (26.5174%) (LZIP); 93.672 secs

~/> bundler p enwik8.zstd enwik8 -u zstd
[ OK ] enwik8: 100000000 -> 39591123 (39.5911%) (ZSTD); 3.53 secs

~/> bundler p enwik8.shoco enwik8 -u shoco
[ OK ] enwik8: 100000000 -> 77457446 (77.4574%) (SHOCO); 4.379 secs

~/> bundler p enwik8.shrinker enwik8 -u shrinker
[ OK ] enwik8: 100000000 -> 51493030 (51.493%) (SHRINKER); 2.959 secs

~/> bundler p enwik8.bsc enwik8 -u bsc
[ OK ] enwik8: 100000000 -> 20786210 (20.7862%) (BSC); 41.022 secs

~/> bundler p enwik8.csc20 enwik8 -u csc20
[ OK ] enwik8: 100000000 -> 25135368 (25.1354%) (CSC20); 54.271 secs

~/> bundler p enwik8.brotli9 enwik8 -u brotli9
[ OK ] enwik8: 100000000 -> 29694839 (29.6948%) (BROTLI9); 113.727 secs

~/> bundler p enwik8.brotli11 enwik8 -u brotli11
[ OK ] enwik8: 100000000 -> 27131632 (27.1316%) (BROTLI11); 864.711 secs

~/> bundler p enwik8.zpaq enwik8 -u zpaq
[ OK ] enwik8: 100000000 -> 19448625 (19.4486%) (ZPAQ); 722.504 secs
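
One way to read the table above is to keep only the codecs that no other codec beats on both size and time (a Pareto frontier). This is a minimal sketch of that filter over the Bundler 2.0.9 numbers copied from this comment. It is an illustration, not something Bundler does itself, and `pareto_frontier` is a hypothetical helper name.

```python
# (codec, compressed bytes, seconds) copied from the Bundler 2.0.9 log above.
RESULTS = [
    ("LZ4", 42196883, 7.547),
    ("LZ4F", 56973113, 2.511),
    ("MINIZ", 36459842, 12.493),
    ("LZMA20", 28686662, 63.419),
    ("LZMA25", 25395641, 114.631),
    ("LZIP", 26517401, 93.672),
    ("ZSTD", 39591123, 3.53),
    ("SHOCO", 77457446, 4.379),
    ("SHRINKER", 51493030, 2.959),
    ("BSC", 20786210, 41.022),
    ("CSC20", 25135368, 54.271),
    ("BROTLI9", 29694839, 113.727),
    ("BROTLI11", 27131632, 864.711),
    ("ZPAQ", 19448625, 722.504),
]


def pareto_frontier(results):
    """Return entries that no other entry beats on both size and time."""
    frontier = []
    best_size = float("inf")
    # Walk from fastest to slowest; keep an entry only if its output is
    # smaller than that of every faster entry seen so far.
    for name, size, seconds in sorted(results, key=lambda r: (r[2], r[1])):
        if size < best_size:
            frontier.append((name, size, seconds))
            best_size = size
    return frontier


for name, size, seconds in pareto_frontier(RESULTS):
    print(f"{name:9s} {size:>10d} bytes  {seconds:8.3f} s")
```

On these numbers the frontier is LZ4F, SHRINKER, ZSTD, MINIZ, BSC and ZPAQ. The standalone results earlier in the thread are not included, since they may have been measured under different conditions.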