BioJulia / Libz.jl

Fast, flexible zlib bindings.

Quick question on usage #3

Closed quinnj closed 8 years ago

quinnj commented 9 years ago

Hey @dcjones, quick question on how to do what I want to do:

Obviously my process above has some inefficiencies, particularly because nothing is in-memory or buffered. I don't think I can get around having to inflate, then deflate though so that I make sure to send the file in line-based chunks. I think I want to do something like

Any tips?

dcjones commented 9 years ago

Hmm interesting use case. I think something like this should do the trick.

using Libz, BufferedStreams

input = ZlibInflateInputStream(open(filename))
output_buffer = BufferedOutputStream()
output_stream = ZlibDeflateOutputStream(output_buffer)
block_size = 100000000

bytes_read = 0
for line in eachline(input)
    write(output_stream, line)
    bytes_read += sizeof(line)  # count bytes, not characters

    if bytes_read > block_size
        close(output_stream)
        block = takebuf_array(output_buffer)
        # TODO: do something with block

        # open a new stream for the next block
        output_stream = ZlibDeflateOutputStream(output_buffer)
        bytes_read = 0
    end
end

# flush remaining data
close(output_stream)
block = takebuf_array(output_buffer)
# TODO: do something with block

dcjones commented 9 years ago

There's currently no built-in way to track the number of bytes written to an output stream, hence the manual bookkeeping with bytes_read. I think that could be solved by implementing position() on BufferedOutputStream.
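
Until something like that exists, the byte count could also be tracked by a small wrapper stream. The following is only a rough sketch under stated assumptions: `CountingIO` is a hypothetical type, not part of Libz or BufferedStreams, it uses only Base I/O functions, and it is written for current Julia (1.x) rather than the version used in the snippet above.

```julia
# Hypothetical wrapper (not part of Libz or BufferedStreams): forwards
# writes to the wrapped stream and counts the bytes passed through.
mutable struct CountingIO{T<:IO} <: IO
    sink::T
    nbytes::Int
end
CountingIO(sink::IO) = CountingIO(sink, 0)

# Single-byte writes.
function Base.write(io::CountingIO, b::UInt8)
    n = write(io.sink, b)
    io.nbytes += n
    return n
end

# Bulk writes; Base routes strings and byte arrays through unsafe_write.
function Base.unsafe_write(io::CountingIO, p::Ptr{UInt8}, n::UInt)
    written = unsafe_write(io.sink, p, n)
    io.nbytes += Int(written)
    return written
end

# Report the running byte count, in the spirit of the position() idea.
Base.position(io::CountingIO) = io.nbytes
Base.flush(io::CountingIO) = flush(io.sink)
Base.close(io::CountingIO) = close(io.sink)

# Usage: an in-memory IOBuffer stands in for the real output stream.
buf = IOBuffer()
counter = CountingIO(buf)
write(counter, "héllo\n")   # 6 characters, but 7 bytes in UTF-8
println(position(counter))  # prints 7
```

Wrapping the stream that the deflater writes into would count compressed bytes, while wrapping the deflate stream itself would count uncompressed bytes; which one is wanted depends on how the block size is meant to be measured.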

quinnj commented 8 years ago

Thanks BTW; this package is working great for me.