haskell / zlib

Compression and decompression in the gzip and zlib formats
http://hackage.haskell.org/package/zlib

Add support for flushing compression streams #6

Open hvr opened 8 years ago

hvr commented 8 years ago

At the API level, this could be done by, for example, adding a field to CompressInputRequired (this is what https://github.com/hvr/lzma does):

diff --git a/Codec/Compression/Zlib/Internal.hs b/Codec/Compression/Zlib/Internal.hs
index 74519c7..d5b851b 100644
--- a/Codec/Compression/Zlib/Internal.hs
+++ b/Codec/Compression/Zlib/Internal.hs
@@ -371,6 +371,7 @@ foldDecompressStreamWithInput chunk end err = \s lbs ->
 --
 data CompressStream m =
      CompressInputRequired {
+         compressFlush       :: m (CompressStream m),
          compressSupplyInput :: S.ByteString -> m (CompressStream m)
        }

Calling compressFlush would then request a Z_SYNC_FLUSH from zlib, and the stream would keep returning the CompressOutputAvailable state until all input supplied so far has been flushed out as compressed data.
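To make the intended state transitions concrete, here is a minimal sketch of how a caller might drive such a stream. It does not use zlib at all. The type is a local copy of CompressStream with the proposed compressFlush field added, and toyStream is a hypothetical stand-in that buffers input and emits it unchanged. The names toyStream, drain and demo, and the convention that an empty chunk ends the input, are assumptions made for illustration only.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as C
import Data.Functor.Identity (Identity, runIdentity)

-- Local copy of the stream type, extended with the proposed compressFlush
-- field. This is an illustration, not the type exported by zlib.
data CompressStream m
  = CompressInputRequired
      { compressFlush       :: m (CompressStream m)
      , compressSupplyInput :: S.ByteString -> m (CompressStream m)
      }
  | CompressOutputAvailable
      { compressOutput :: !S.ByteString
      , compressNext   :: m (CompressStream m)
      }
  | CompressStreamEnd

-- Hypothetical toy "compressor": it holds input back until 'threshold' bytes
-- are pending, a flush is requested, or the input ends (empty chunk).
-- It performs no compression; it only models the buffering behaviour.
toyStream :: Monad m => Int -> CompressStream m
toyStream threshold = go S.empty
  where
    go pending = CompressInputRequired
      { compressFlush       = return (emit pending (go S.empty))
      , compressSupplyInput = return . step pending
      }

    step pending chunk
      | S.null chunk              = emit pending CompressStreamEnd
      | S.length buf >= threshold = emit buf (go S.empty)
      | otherwise                 = go buf
      where
        buf = S.append pending chunk

    -- Only enter the output state when there is something to hand out.
    emit out next
      | S.null out = next
      | otherwise  = CompressOutputAvailable out (return next)

-- Collect output chunks until the stream asks for input again or ends.
drain :: Monad m => CompressStream m -> m ([S.ByteString], CompressStream m)
drain (CompressOutputAvailable out next) = do
  s <- next
  (outs, s') <- drain s
  return (out : outs, s')
drain s = return ([], s)

-- Supply a small chunk, then flush. The first component is the output seen
-- before the flush, the second the output produced by the flush.
demo :: ([S.ByteString], [S.ByteString])
demo = runIdentity $ do
  (beforeFlush, s1) <- drain =<< compressSupplyInput (toyStream 1024) (C.pack "hello")
  (afterFlush, _)   <- drain =<< compressFlush s1
  return (beforeFlush, afterFlush)
```

In this toy model the small chunk stays buffered until compressFlush is called, which is the situation the proposed field is meant to address for interactive or streaming protocols.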

Quoting the zlib documentation on deflate(_, Z_SYNC_FLUSH):

If the parameter flush is set to Z_SYNC_FLUSH, all pending output is flushed to the output buffer and the output is aligned on a byte boundary, so that the decompressor can get all input data available so far. (In particular avail_in is zero after the call if enough output space has been provided before the call.) Flushing may degrade compression for some compression algorithms and so it should be used only when necessary. This completes the current deflate block and follows it with an empty stored block that is three bits plus filler bits to the next byte, followed by four bytes (00 00 ff ff).
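As a small illustration of the last sentence of that quote, the following sketch checks whether a chunk of raw deflate output ends with the four marker bytes. The names syncFlushMarker and endsWithSyncMarker are hypothetical helpers, not part of zlib. The check assumes the caller has collected all output produced by the flush, and it is only a necessary condition, since those bytes can also occur in the middle of compressed data.

```haskell
import qualified Data.ByteString as S

-- The four bytes that, per the quoted documentation, follow the empty
-- stored block written by Z_SYNC_FLUSH.
syncFlushMarker :: S.ByteString
syncFlushMarker = S.pack [0x00, 0x00, 0xff, 0xff]

-- True when the given output ends with the sync-flush marker.
endsWithSyncMarker :: S.ByteString -> Bool
endsWithSyncMarker = S.isSuffixOf syncFlushMarker
```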

This would allow libraries such as io-streams to migrate from zlib-bindings to zlib.

lpsmith commented 8 years ago

Hmm, looking at zlib: given that it includes a copy of the C library and then exposes only part of the functionality through a higher-level interface, wouldn't it make sense to also include a low-level interface that corresponds directly to the C calls and exposes most or all of the API?

tolysz commented 7 years ago

With this change, I can create streams which always Z_SYNC_FLUSH... https://github.com/haskell/zlib/pull/10