mxmlnkn / rapidgzip

Gzip Decompression and Random Access for Modern Multi-Core Machines
Apache License 2.0

Add parallelized compression #21

Open mxmlnkn opened 10 months ago

mxmlnkn commented 10 months ago

The most difficult part is probably the build system and packaging, because currently only the files necessary for decompression from zlib and ISA-L are built and packaged.

Aside from that, simply split the data into chunks, send them to the ThreadPool, compress each chunk, and concatenate the results to the output in the right order.
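The split/compress/concatenate idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not rapidgzip code: the function name `compress_parallel`, the chunk size, and the thread count are made up for the example. It relies on the fact that a gzip file may consist of several concatenated members, so each chunk can be compressed independently.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def compress_parallel(data: bytes, chunk_size: int = 4 * 1024 * 1024, threads: int = 4) -> bytes:
    """Hypothetical helper: compress chunks independently and join them in order."""
    # Split the input into fixed-size chunks. Keep one empty chunk for empty
    # input so that the output is still a valid gzip stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in submission order, so the gzip members end up
        # in the right order even if the workers finish out of order.
        members = pool.map(gzip.compress, chunks)
        return b"".join(members)


if __name__ == "__main__":
    original = b"some repetitive example data\n" * 100_000
    compressed = compress_parallel(original, chunk_size=64 * 1024)
    # Any gzip-compatible decompressor should accept the concatenated members.
    print(gzip.decompress(compressed) == original)
```

Because every chunk starts with a fresh compression state, the ratio is slightly worse than for a single stream, which is the usual trade-off for this approach.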

In the first iteration, I could simply write the output similar to bgzip. As an improvement, it would be nice to store the whole index in a more easily accessible way, maybe similar to the formats written by pgzf or gztool.
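To illustrate why an index is useful, here is a rough Python sketch that records where each independently compressed chunk starts and uses that to read from the middle of the file. It is an assumption-laden toy: the names `compress_with_index` and `read_at` are hypothetical, the index lives only in memory, and the layout is not the actual on-disk format of bgzip, pgzf, or gztool. Compression is done serially here to keep the example short.

```python
import bisect
import gzip


def compress_with_index(data: bytes, chunk_size: int = 64 * 1024):
    """Hypothetical helper: returns (compressed bytes, index).

    The index is a list of (compressed_offset, uncompressed_offset) pairs,
    one per gzip member.
    """
    index = []
    out = bytearray()
    for offset in range(0, len(data), chunk_size):
        index.append((len(out), offset))
        out += gzip.compress(data[offset:offset + chunk_size])
    return bytes(out), index


def read_at(compressed: bytes, index, position: int, size: int) -> bytes:
    """Hypothetical helper: decompress only the members covering the range."""
    if not index or size <= 0:
        return b""
    # Find the last member whose uncompressed offset is <= position.
    i = bisect.bisect_right([u for _, u in index], position) - 1
    skip = position - index[i][1]
    result = b""
    while len(result) < skip + size and i < len(index):
        start = index[i][0]
        end = index[i + 1][0] if i + 1 < len(index) else len(compressed)
        result += gzip.decompress(compressed[start:end])
        i += 1
    return result[skip:skip + size]


if __name__ == "__main__":
    original = b"".join(str(n).encode() + b"\n" for n in range(50_000))
    compressed, index = compress_with_index(original, chunk_size=4096)
    print(read_at(compressed, index, 10_000, 5_000) == original[10_000:15_000])
```

Storing such an index inside the file, or next to it, would let a reader seek without first scanning the whole stream.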

This would make rapidgzip usable as a drop-in replacement for gzip.