BioJulia / Libz.jl

Fast, flexible zlib bindings.

Mutually exclusive stream types #50

Open Hydrotoast opened 7 years ago

Hydrotoast commented 7 years ago

It seems that there are three mutually exclusive stream types: zlib, gzip, and raw. Currently, the default assumption is that a stream is zlib-wrapped; the other two are special cases that can only be selected through very specific combinations of flags on the ZlibXYStream type, where X is either Inflate or Deflate and Y is either Input or Output.

I think the following types would expose such cases better.

ZlibXYStream
GzipXYStream
RawXYStream

bicycle1885 commented 7 years ago

I shot myself in the foot yesterday due to the confusion of different compression formats in Libz.jl. I think your suggestion is reasonable and we should reconsider the APIs of Libz.jl.

Hydrotoast commented 7 years ago

@bicycle1885 I found the Zlib manual to be helpful for pointing out the three different formats that Zlib supports:

The compressed data format used by default by the in-memory functions is the zlib format, which is a zlib wrapper documented in RFC 1950, wrapped around a deflate stream, which is itself documented in RFC 1951.

The library also supports reading and writing files in gzip (.gz) format with an interface similar to that of stdio using the functions that start with "gz". The gzip format is different from the zlib format. gzip is a gzip wrapper, documented in RFC 1952, wrapped around a deflate stream.

This library can optionally read and write gzip and raw deflate streams in memory as well.

In general, there are three formats, with some aliases: zlib (a wrapper around a deflate stream, which some APIs simply call "Deflate"), gzip, and raw deflate (aka raw).
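To make the distinction concrete, here is a minimal sketch using Python's standard zlib module rather than Libz.jl. It selects each format through the wbits argument (15 for zlib, 31 for gzip, -15 for raw deflate) and checks that a stream written in one format is rejected when read as another. The helper names compress, decompress, and is_readable_as are made up for this illustration.

```python
import zlib

data = b"hello hello hello hello"

# wbits picks the container around the deflate stream:
#   15  -> zlib wrapper (RFC 1950)
#   31  -> gzip wrapper (RFC 1952)
#   -15 -> raw deflate, no wrapper (RFC 1951)
WBITS = {"zlib": 15, "gzip": 31, "raw": -15}


def compress(payload, fmt):
    """Compress payload in the given format (hypothetical helper)."""
    c = zlib.compressobj(wbits=WBITS[fmt])
    return c.compress(payload) + c.flush()


def decompress(blob, fmt):
    """Decompress blob, assuming it is in the given format (hypothetical helper)."""
    return zlib.decompress(blob, wbits=WBITS[fmt])


def is_readable_as(blob, fmt):
    """Return True if blob decodes cleanly when treated as fmt."""
    try:
        decompress(blob, fmt)
        return True
    except zlib.error:
        return False


streams = {fmt: compress(data, fmt) for fmt in WBITS}

# Every format round-trips with its own setting.
for fmt, blob in streams.items():
    assert decompress(blob, fmt) == data

# A mismatched setting fails instead of silently producing output.
print(is_readable_as(streams["gzip"], "zlib"))
print(is_readable_as(streams["zlib"], "gzip"))
```

Because the format is chosen by a single integer, it is easy to pass the wrong one, which is the kind of confusion that dedicated types per format would avoid.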

Node.js' Zlib provides the following types:

Deflate
DeflateRaw
Gunzip
Gzip
Inflate
InflateRaw
Unzip // special type that reads either gzip/zlib format based on byte header
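The header-based detection that Unzip performs is something zlib itself offers for decompression. As a rough sketch in Python (not Node.js or Libz.jl), adding 32 to the window bits asks zlib to accept either a zlib or a gzip header; raw deflate has no header and is not detected. The function name unzip is borrowed from the Node.js type purely for illustration.

```python
import gzip
import zlib


def unzip(blob):
    """Decompress a zlib- or gzip-wrapped blob, detecting which from the header."""
    # MAX_WBITS | 32 (that is, 47) enables automatic zlib/gzip header detection.
    return zlib.decompress(blob, wbits=zlib.MAX_WBITS | 32)


data = b"hello hello hello hello"

zlib_blob = zlib.compress(data)   # zlib wrapper
gzip_blob = gzip.compress(data)   # gzip wrapper

assert unzip(zlib_blob) == data
assert unzip(gzip_blob) == data

# Raw deflate carries no header, so detection cannot apply to it.
raw = zlib.compressobj(wbits=-15)
raw_blob = raw.compress(data) + raw.flush()
try:
    unzip(raw_blob)
    raw_accepted = True
except zlib.error:
    raw_accepted = False
print(raw_accepted)
```

This only helps on the reading side; when writing, the caller still has to choose one of the three formats explicitly.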

Java uses the following pattern:

{Inflater,Deflater}{Input,Output}Stream

with GZIP{Input,Output}Stream as a subtype.

In both APIs, GZIP gets its own type. In Node.js, the raw deflate format also has its own type (DeflateRaw), whereas Java treats raw deflate as a special case of Deflate.