madler / zlib

A massively spiffy yet delicately unobtrusive compression library.
http://zlib.net/
Other
5.65k stars 2.44k forks source link

deflateBound() returns incorrect result for gzip wrapped stream when called after deflate() #944

Closed zmodem closed 7 months ago

zmodem commented 7 months ago

Consider the following test program which calls deflateBound() after deflate() has run (and finished):

#include <assert.h>
#include <stdio.h>
#include "zlib.h"

void test(const char *src, size_t src_size) {
  const int level = 9;
  const int windowBits = 15 + 16; /* 16 for gzip wrapper */
  const int memLevel = 9;
  const int strategy = Z_DEFAULT_STRATEGY;
  z_stream stream;
  int ret;
  unsigned char out[100000]; 
  size_t out_size, deflate_bound;

  stream.zalloc = Z_NULL;
  stream.zfree = Z_NULL;
  ret = deflateInit2(&stream, level, Z_DEFLATED, windowBits, memLevel, strategy);
  assert(ret == Z_OK);

  stream.next_in = (unsigned char*)src;
  stream.avail_in = src_size;
  stream.next_out = out;
  stream.avail_out = sizeof(out);
  ret = deflate(&stream, Z_FINISH);
  assert(ret == Z_STREAM_END);
  out_size = sizeof(out) - stream.avail_out;

  deflate_bound = deflateBound(&stream, src_size);
  if (out_size <= deflate_bound) {
    printf("OK\n");
  } else {
    printf("Error!\n");
    printf("deflate_bound: %zu\n", deflate_bound);
    printf("out_size:      %zu\n", out_size);
  }

  deflateEnd(&stream);
}   

int main() {
  test("", 0); 
  return 0;
}

The returned bound is lower than the actual output size:

$ clang /tmp/deflatebound_after_deflate.c adler32.c crc32.c deflate.c trees.c zutil.c && ./a.out
Error!
deflate_bound: 10
out_size:      20

Calling it like that may seem unusual, but the manual doesn't forbid it:

[deflateBound()] must be called after deflateInit() or deflateInit2().

and the code looks like it's supposed to handle it, for example with this state check:

https://github.com/madler/zlib/blob/5c42a230b7b468dff011f444161c0145b5efae59/deflate.c#L849-L851

In the test program above, it seems the code fails to figure out what wrapper the stream is using, and we end up in this default case:

https://github.com/madler/zlib/blob/5c42a230b7b468dff011f444161c0145b5efae59/deflate.c#L884-L885

The "compiler happiness" comment indicates that perhaps that code is not intended to run?

That code (and also the code above if deflateStateCheck() fails) seems to assume a zlib wrapper, which is smaller than a gzip one.

So there seems to be two issues:

  1. The code fails to figure out the wrapper type when deflation has finished, but still doesn't take the "if can't get parameters" early exit
  2. When it can't figure out the wrapper type, it defaults to the zlib wrapper size which is too small

(Tested at the current develop branch, 5c42a230b7b468dff011f444161c0145b5efae59)

madler commented 7 months ago

That is not a permitted use. The stream cannot be used after Z_STREAM_END, and the documentation in zlib says so:

After deflate has returned Z_STREAM_END, the only possible operations on the stream are deflateReset or deflateEnd.

deflateBound() returns 22 in your case after initialization, after any deflate() call that does not return Z_STREAM_END, and after a deflateReset() that follows the Z_STREAM_END. So it is working as intended.

Also it is a non-sensical use. There is no value in the information if the stream cannot even be used. There is no use for the information if the compression is done, and you already know the actual length, as opposed to a bound. If you intend to reuse the stream, then you would get the bound after the reset.

madler commented 7 months ago

Having said all that, I have updated the code to be more conservative.