moinakg / pcompress

A Parallelized Data Deduplication and Compression utility
http://moinakg.github.com/pcompress/
GNU Lesser General Public License v3.0

Enabling zpaq compression #33

Closed szepeviktor closed 9 years ago

szepeviktor commented 9 years ago

I think lrzip includes zpaq compression.

| content   | original   | pcompress -l 14 | zpaq -method 59 |
|-----------|------------|-----------------|-----------------|
| WordPress | 18 779 798 | 4 522 323       | 4 078 100       |
| some SQL  | 1 553 755  | 163 331         | 146 017         |
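
For reference, the relative sizes behind these figures can be worked out with a few lines of Python. This is only a sketch: the numbers are copied from the table above and are assumed to be sizes in bytes, and the names `SIZES`, `ratio` and `report` are made up for illustration.

```python
# Sizes copied from the table above (assumed to be bytes).
SIZES = {
    "WordPress": {
        "original": 18_779_798,
        "pcompress -l 14": 4_522_323,
        "zpaq -method 59": 4_078_100,
    },
    "some SQL": {
        "original": 1_553_755,
        "pcompress -l 14": 163_331,
        "zpaq -method 59": 146_017,
    },
}


def ratio(compressed: int, original: int) -> float:
    """Return the compressed size as a fraction of the original size."""
    return compressed / original


def report(sizes=SIZES):
    """Return (name, pcompress ratio, zpaq ratio, zpaq saving over pcompress)."""
    rows = []
    for name, s in sizes.items():
        p = ratio(s["pcompress -l 14"], s["original"])
        z = ratio(s["zpaq -method 59"], s["original"])
        # How much smaller the zpaq output is than the pcompress output.
        saving = 1 - s["zpaq -method 59"] / s["pcompress -l 14"]
        rows.append((name, p, z, saving))
    return rows


if __name__ == "__main__":
    for name, p, z, saving in report():
        print(f"{name}: pcompress {p:.2%}, zpaq {z:.2%}, "
              f"zpaq output smaller by {saving:.2%}")
```

On both inputs the zpaq output is roughly a tenth smaller than the pcompress output, which is the gap this request is about.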

Could you enable it and add options for it?

moinakg commented 9 years ago

Zpaq is a different compression and archival toolkit with its own journaling file format. It uses PAQ-derived neural network algorithms to compress data. It also has its own language, ZPAQL, for encoding compression algorithms in a platform-independent way, along with a JIT engine. As such, it has some features not present in Pcompress. ZPAQ also has slower modes that can give extreme compression at the cost of speed. With Pcompress, my objective is to use classical algorithms along with data analysis and pre-processing to get the maximum compression possible without sacrificing too much speed. Another objective is to leverage parallelism as much as possible: Pcompress can scale to an arbitrary number of cores if there is enough data to compress.

So Zpaq is a complete, standalone utility in its own right, and it is not possible to combine it with Pcompress.
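
The point about scaling with cores "if there is enough data" can be illustrated with a minimal sketch of chunk-level parallel compression. This is not Pcompress's actual implementation or file format: it assumes Python's standard `zlib` as a stand-in for a classical algorithm, and `CHUNK_SIZE`, `split_chunks`, `compress_parallel` and `decompress_parallel` are hypothetical names chosen for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary value for this sketch


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split the input into independent fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_parallel(data: bytes, workers: int = 4,
                      chunk_size: int = CHUNK_SIZE) -> list:
    """Compress each chunk independently on a pool of worker threads."""
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so blocks line up with chunks.
        return list(pool.map(zlib.compress, chunks))


def decompress_parallel(blocks: list, workers: int = 4) -> bytes:
    """Decompress the blocks in parallel and join them in original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, blocks))


if __name__ == "__main__":
    payload = b"some repetitive sample data " * 200_000
    blocks = compress_parallel(payload, workers=4)
    assert decompress_parallel(blocks) == payload
    print(f"{len(blocks)} chunks, {sum(len(b) for b in blocks)} compressed bytes")
```

Because every chunk is compressed on its own, the number of chunks bounds the usable parallelism: an input smaller than one chunk keeps only one worker busy, whatever the core count. Independent chunks also cannot share redundancy with each other, so smaller chunks trade some compression ratio for more parallelism.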

szepeviktor commented 9 years ago

Thank you for your answer.