SpringMT / zstd-ruby

Ruby binding for zstd (Zstandard - Fast real-time compression algorithm)
https://github.com/facebook/zstd
BSD 3-Clause "New" or "Revised" License

Implements handling of partial frames by the streaming decompressor #88

Open gucki opened 6 months ago

gucki commented 6 months ago

Fixes https://github.com/SpringMT/zstd-ruby/issues/87

Example use case:

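# Binary buffer holding compressed bytes that have not been consumed yet.
read_buffer = String.new(encoding: Encoding::BINARY)
loop do
  logger.debug("read partial")
  begin
    buf = reader.readpartial(64 * 1024)
  rescue EOFError
    break # readpartial raises EOFError at end of input
  end
  read_buffer << buf
  logger.debug("read #{buf.bytesize} bytes (#{read_buffer.bytesize} bytes buffered)")

  until read_buffer.empty?
    # decompress2 returns the zstd result code, the decompressed data,
    # and the number of input bytes it consumed.
    result, data, read_buffer_pos = decompressor.decompress2(read_buffer)
    # Drop the consumed bytes; the rest is presented again on the next call.
    read_buffer.slice!(0, read_buffer_pos)
    logger.debug("decompressed #{data.bytesize} bytes (result #{result}, consumed #{read_buffer_pos} bytes, #{read_buffer.bytesize} left in read buffer)")
    writer.write(data)
    # Nothing consumed: the buffer holds only a partial frame, so read more input.
    break if read_buffer_pos == 0
  end
end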

It works according to the zstd streaming decompression documentation:

Use ZSTD_decompressStream() repetitively to consume your input.
The function will update both `pos` fields.
If `input.pos < input.size`, some input has not been consumed.
It's up to the caller to present again remaining data.
The function tries to flush all data decoded immediately, respecting output buffer size.
If `output.pos < output.size`, decoder has flushed everything it could.
But if `output.pos == output.size`, there might be some data left within internal buffers.
In which case, call ZSTD_decompressStream() again to flush whatever remains in the buffer.
Note : with no additional input provided, amount of data flushed is necessarily <= ZSTD_BLOCKSIZE_MAX.
@return : 0 when a frame is completely decoded and fully flushed,
      or an error code, which can be tested using ZSTD_isError(),
      or any other value > 0, which means there is still some decoding or flushing to do to complete current frame :
                              the return value is a suggested next input size (just a hint for better latency)
                              that will never request more than the remaining frame size.
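
To illustrate the "present the remaining data again" contract without requiring the gem, here is a minimal, self-contained sketch. It does not use zstd at all: `ToyStreamDecoder`, its `decode` method, and the length-prefixed toy format are hypothetical stand-ins invented for this illustration, returning the same three-value shape (`hint, data, consumed`) as the example above. Only the buffering loop is the point.

```ruby
# Hypothetical stand-in for a streaming decoder. This is NOT the zstd-ruby API.
# Toy format: each frame is one length byte followed by that many payload bytes.
class ToyStreamDecoder
  # Returns [hint, data, consumed]:
  #   hint     - 0 when a whole frame was decoded, otherwise bytes still needed
  #   data     - decoded payload (empty if the frame is incomplete)
  #   consumed - number of input bytes used (0 if the frame is incomplete)
  def decode(input)
    return [1, "".b, 0] if input.empty?

    len = input.getbyte(0)
    needed = 1 + len
    return [needed - input.bytesize, "".b, 0] if input.bytesize < needed

    [0, input.byteslice(1, len), needed]
  end
end

# Decode as much of the buffer as possible and keep the unconsumed tail.
def drain(decoder, buffer, output)
  until buffer.empty?
    _hint, data, consumed = decoder.decode(buffer)
    buffer.slice!(0, consumed) # drop consumed bytes, keep the rest
    output << data
    break if consumed.zero?    # partial frame: wait for more input
  end
end

decoder = ToyStreamDecoder.new
buffer = "".b
output = "".b
stream = "\x05hello\x05world".b

# Feed the stream in chunks that deliberately split both frames.
chunks = [stream.byteslice(0, 3), stream.byteslice(3, 6), stream.byteslice(9, 3)]
chunks.each do |chunk|
  buffer << chunk
  drain(decoder, buffer, output)
end

puts output
```

After the first chunk nothing is consumed and the three bytes stay buffered; the second chunk completes the first frame and leaves the start of the second one behind; the third chunk completes it. A real zstd decoder differs in that it may also consume and emit partial frame data, which is why the loop above tracks `consumed` rather than frame boundaries.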

SpringMT commented 6 months ago

Thank you for the PR. I'll check it later.