pfalcon / uzlib

Radically unbloated DEFLATE/zlib/gzip compression/decompression library. Can decompress any gzip/zlib data, and offers a simplified compressor which produces gzip-compatible output while requiring far fewer resources (and achieving a lower compression ratio, of course).

Issue with decompressing data with original size 73080 bytes #17

Closed · ragkousism closed this issue 6 years ago

ragkousism commented 6 years ago

I have a gzip data array in C with original size 73080 bytes and compressed size 34703 bytes. Based on your examples, I wrote the following code to decompress it:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#include "uzlib.h"

#define U32 uint32_t
#define U16 uint16_t
#define U8 uint8_t
/* Decompress one byte at a time, as in the tgunzip example */
#define OUT_CHUNK_SIZE 1

int gzip_uncompress(U8 *dest,
                    U32 *destLen,
                    const U8 *source,
                    U32 sourceLen)
{
  U32 dlen = 0;
  U32 gz_len = 0;
  TINF_DATA decomp;
  int res = 0;
  /* Decompress */
  uzlib_init();

  /* Length of gz file or buffer in our case */
  gz_len = sourceLen;

  /* The gzip trailer stores the uncompressed size (ISIZE) in its last
     4 bytes, little-endian */
  dlen =            source[gz_len - 1];
  dlen = 256*dlen + source[gz_len - 2];
  dlen = 256*dlen + source[gz_len - 3];
  dlen = 256*dlen + source[gz_len - 4];

  *destLen = dlen;
  /* The output length must always be +1 for decompression to finish
     properly (the tgunzip example reserves the same extra byte) */
  dlen++;
  /* Note: this replaces the caller-supplied dest, so the decompressed
     data only lives in this function's local buffer */
  dest = calloc(dlen, sizeof(U8));
  if (dest == NULL)
    {
      printf("Error allocating memory\n");
      return -1;
    }

  uzlib_uncompress_init(&decomp, NULL, 0);

  decomp.source = source;
  decomp.source_limit = source + gz_len - 4;

  res = uzlib_gzip_parse_header(&decomp);
  if (res != TINF_OK)
    {
      printf("Error parsing header: %d\n", res);
      return -1;
    }

  decomp.destStart = decomp.dest = dest;

  /* Decompress byte by byte (OUT_CHUNK_SIZE == 1) */

  while (dlen)
    {
      unsigned int chunk_len = dlen < OUT_CHUNK_SIZE ? dlen : OUT_CHUNK_SIZE;
      decomp.destSize = chunk_len;
      res = uzlib_uncompress_chksum(&decomp);
      dlen -= chunk_len;
      if (res != TINF_OK)
        {
          break;
        }
    }

  if (res != TINF_DONE)
    {
      printf("Error during decompression: %d\n", res);
      return -res;
    }

  return 0;
}

int main()
{
...
  gzip_uncompress(dest_buf,
                  &dest_buf_size,
                  source_buf,
                  source_buf_size);
...
}

The problem is that about 1 KB before the end, uzlib_uncompress_chksum exits with TINF_DONE, and the CRC check obviously fails. I wrote a test with exactly the same code on a smaller file and it works without an issue. Is there some upper limit on the size of the files the library can decompress? Any ideas what I am doing wrong?
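For what it's worth, a quick diagnostic (a hypothetical addition, placed right after the while loop in gzip_uncompress above) is to count how many bytes were actually produced before the early TINF_DONE; decomp.dest advances past each byte written, so the gap to the expected length shows exactly where it stopped:

/* Hypothetical diagnostic: decomp.dest advances as output is written,
   so comparing it against decomp.destStart shows how far we got */
U32 produced = (U32)(decomp.dest - decomp.destStart);
printf("stopped after %u of %u expected bytes (res=%d)\n",
       (unsigned)produced, (unsigned)*destLen, res);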

ragkousism commented 6 years ago

Forgot to attach the gzip file in question:

file.gz

pfalcon commented 6 years ago

An obvious test is running this file through the included "tgunzip" util. With the current master, 35e9c235da600cb562f71b73973eadd5ff45a04f, the result is a file of size 73080 bytes with an md5sum of 331c99fc2898a4f360a790d45989084d, the same as when decompressing with gunzip.

Apparently, your code has a bug somewhere. You would need to compare it with tgunzip's code and see what's wrong.
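For reference, the part of tgunzip most worth comparing against is how it sizes its input: the length handed to the decompressor comes from the bytes actually read from disk, never from a separately tracked constant. A minimal sketch in that spirit (load_gz is a hypothetical helper name, not part of uzlib):

#include <stdio.h>
#include <stdlib.h>

unsigned char *load_gz(const char *path, unsigned int *len)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    unsigned char *buf = malloc(size);
    /* Use the byte count fread() reports, not an assumed size */
    if (buf && fread(buf, 1, size, f) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) *len = (unsigned int)size;
    return buf;
}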

ragkousism commented 6 years ago

Hello @pfalcon, yes, I did try with tgunzip and it worked. The problem was not the code above; the code is correct. The issue is that the buffer I was writing the file into before calling gzip_uncompress() was 20 bytes shorter than it should have been, and as a result I was passing an invalid size to the library. The strange thing is that it didn't segfault during normal operation, only when I ran it under valgrind. Anyhow, the case has nothing to do with the library. I am sorry for the inconvenience.
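For anyone hitting the same symptom: the mismatch described here (a source buffer holding fewer bytes than the length passed in) is exactly what a check at the buffer-fill site would catch. The right fix depends on how the buffer is filled, but the general shape is to compare the size handed to the library against the bytes actually placed in the buffer. A hypothetical guard, assuming main's elided body fills source_buf with fread from a FILE *fp:

/* Hypothetical guard for main's elided body: refuse to call
   gzip_uncompress() unless source_buf really holds source_buf_size
   bytes, so the 4-byte trailer lookup can't read garbage */
size_t got = fread(source_buf, 1, source_buf_size, fp);
if (got != (size_t)source_buf_size) {
    fprintf(stderr, "short read: %zu of %u bytes\n",
            got, (unsigned)source_buf_size);
    return -1;
}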