richgel999 / lzham_codec

Lossless data compression codec with LZMA-like ratios but 1.5x-8x faster decompression speed, C/C++

decompress does not immediately write data after a flush #4

Closed by nemequ 9 years ago

nemequ commented 9 years ago

When I flush a compression stream and feed the output to a decompression stream, the decompressor keeps the data in its internal buffer instead of making it available immediately (as zlib does). That largely defeats the purpose of flushing.

Test case:

#define LOREM_IPSUM                                                     \
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed vulputate " \
  "lectus nisl, vitae ultricies justo dictum nec. Vestibulum ante ipsum " \
  "primis in faucibus orci luctus et ultrices posuere cubilia Curae; "  \
  "Suspendisse suscipit quam a lectus adipiscing, sed tempor purus "    \
  "cursus. Vivamus id nulla eget elit eleifend molestie. Integer "      \
  "sollicitudin lorem enim, eu eleifend orci facilisis sed. Pellentesque " \
  "sodales luctus enim vel viverra. Cras interdum vel nisl in "         \
  "facilisis. Curabitur sollicitudin tortor vel congue "                \
  "auctor. Suspendisse egestas orci vitae neque placerat blandit.\n"    \
  "\n"                                                                  \
  "Aenean sed nisl ultricies, vulputate lorem a, suscipit nulla. Donec " \
  "egestas volutpat neque a eleifend. Nullam porta semper "             \
  "nunc. Pellentesque adipiscing molestie magna, quis pulvinar metus "  \
  "gravida sit amet. Vestibulum mollis et sapien eu posuere. Quisque "  \
  "tristique dignissim ante et aliquet. Phasellus vulputate condimentum " \
  "nulla in vulputate.\n"                                               \
  "\n"                                                                  \
  "Nullam volutpat tellus at nisi auctor, vitae mattis nibh viverra. Nunc " \
  "vitae lectus tristique, ultrices nibh quis, lobortis elit. Curabitur " \
  "at vestibulum nisi, nec facilisis ante. Nulla pharetra blandit lacus, " \
  "at sodales nulla placerat eget. Nulla congue varius tortor, sit amet " \
  "tempor est mattis nec. Praesent vitae tristique ipsum, rhoncus "     \
  "tristique lorem. Sed et erat tristique ligula accumsan fringilla eu in " \
  "urna. Donec dapibus hendrerit neque nec venenatis. In euismod sapien " \
  "ipsum, auctor consectetur mi dapibus hendrerit.\n"                   \
  "\n"                                                                  \
  "Phasellus sagittis rutrum velit, in sodales nibh imperdiet a. Integer " \
  "vitae arcu blandit nibh laoreet scelerisque eu sit amet eros. Aenean " \
  "odio felis, aliquam in eros at, ornare luctus magna. In semper "     \
  "tincidunt nunc, sollicitudin gravida nunc laoreet eu. Cras eu tempor " \
  "sapien, ut dignissim elit. Proin eleifend arcu tempus, semper erat et, " \
  "accumsan erat. Praesent vulputate diam mi, eget mollis leo "         \
  "pellentesque eget. Aliquam eu tortor posuere, posuere velit sed, "   \
  "suscipit eros. Nam eu leo vitae mauris condimentum lobortis non quis " \
  "mauris. Nulla venenatis fringilla urna nec venenatis. Nam eget velit " \
  "nulla. Proin ut malesuada felis. Suspendisse vitae nunc neque. Donec " \
  "faucibus tempor lacinia. Vivamus ac vulputate sapien, eget lacinia " \
  "nisl.\n"                                                             \
  "\n"                                                                  \
  "Curabitur eu dolor molestie, ullamcorper lorem quis, egestas "       \
  "urna. Suspendisse in arcu sed justo blandit condimentum. Ut auctor, " \
  "sem quis condimentum mattis, est purus pulvinar elit, quis viverra " \
  "nibh metus ac diam. Etiam aliquet est eu dui fermentum consequat. Cras " \
  "auctor diam eget bibendum sagittis. Aenean elementum purus sit amet " \
  "sem euismod, non varius felis dictum. Aliquam tempus pharetra ante a " \
  "sagittis. Curabitur ut urna felis. Etiam sed vulputate nisi. Praesent " \
  "at libero eleifend, sagittis quam a, varius sapien."
#define LOREM_IPSUM_LENGTH 2725

#include <stdio.h>
#include <lzham.h>
#include <stdint.h>
#include <assert.h>

int main (int argc, char** argv) {
  size_t uncompressed_length = LOREM_IPSUM_LENGTH;
  uint8_t compressed[4096];
  size_t compressed_length = sizeof (compressed);
  uint8_t decompressed[LOREM_IPSUM_LENGTH * 2];
  size_t decompressed_length = sizeof (decompressed);
  size_t pos1, pos2;

  lzham_compress_state_ptr comp;
  lzham_compress_status_t comp_res;
  lzham_compress_params comp_params = {
    sizeof(lzham_compress_params),
    LZHAM_MAX_DICT_SIZE_LOG2_X86,
    LZHAM_COMP_LEVEL_DEFAULT,
    LZHAM_DEFAULT_TABLE_UPDATE_RATE,
    -1,
    0,
    0,
    NULL,
    0,
    0
  };

  lzham_decompress_state_ptr decomp;
  lzham_decompress_status_t decomp_res;
  lzham_decompress_params decomp_params = {
    sizeof(lzham_decompress_params),
    LZHAM_MAX_DICT_SIZE_LOG2_X86,
    LZHAM_DEFAULT_TABLE_UPDATE_RATE,
    0,
    0,
    NULL,
    0,
    0
  };

  const lzham_flush_t flush_or_finish = LZHAM_SYNC_FLUSH;
  // const lzham_flush_t flush_or_finish = LZHAM_FINISH;

  comp = lzham_compress_init (&comp_params);
  comp_res = lzham_compress2 (comp,
                  (const lzham_uint8*) LOREM_IPSUM, &uncompressed_length,
                  compressed, &compressed_length,
                  LZHAM_NO_FLUSH);

  assert (comp_res == LZHAM_COMP_STATUS_NEEDS_MORE_INPUT);
  assert (uncompressed_length == LOREM_IPSUM_LENGTH);
  assert (compressed_length == 0);

  pos1 = 0;
  pos2 = sizeof (compressed) - compressed_length;
  comp_res = lzham_compress2 (comp, (const lzham_uint8*) LOREM_IPSUM, &pos1,
                  compressed + compressed_length, &pos2,
                  flush_or_finish);

  compressed_length += pos2;

  assert ((comp_res == LZHAM_COMP_STATUS_NOT_FINISHED) || // For LZHAM_SYNC_FLUSH
      (comp_res == LZHAM_COMP_STATUS_SUCCESS));       // For LZHAM_FINISH
  assert (compressed_length > 0);

  decomp = lzham_decompress_init (&decomp_params);

  pos1 = compressed_length;
  decomp_res = lzham_decompress (decomp,
                 compressed, &pos1,
                 decompressed, &decompressed_length,
                 flush_or_finish == LZHAM_FINISH);

  assert (pos1 == compressed_length);

  // THIS FAILS: the input was consumed, but no output was written.
  assert (decompressed_length != 0);
  assert (decompressed_length == LOREM_IPSUM_LENGTH);

  lzham_compress_deinit (comp);
  lzham_decompress_deinit (decomp);

  return 0;
}

richgel999 commented 9 years ago

Thanks! I'll repro this tonight after work.

richgel999 commented 9 years ago

Ok, got the problem in the debugger. Crunching away on it.

Thanks a lot for these small repros, they are great!

richgel999 commented 9 years ago

I've added partial output buffer flushing to the decompressor - testing it now. It fixes your test case, but I want to bang on it a bit more before pushing the fixes.
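To illustrate what "partial output buffer flushing" means in general, here is a minimal toy model. It is not LZHAM's code: `toy_decoder`, `toy_decode`, and `TOY_BUF_SIZE` are hypothetical names, and the "codec" is an identity transform so that only the buffering behavior is visible. Without partial flushing, decoded bytes stay in the internal buffer until it fills or the stream ends; with it, they are handed to the caller as soon as the input runs out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a decoder's internal output buffer size. */
#define TOY_BUF_SIZE 64

typedef struct {
  unsigned char buf[TOY_BUF_SIZE]; /* decoded bytes not yet given to the caller */
  size_t pending;                  /* number of valid bytes in buf */
  int partial_flush;               /* nonzero: drain buf whenever input runs out */
} toy_decoder;

/* "Decodes" in_len bytes (identity transform) and returns the number of bytes
   written to out. The caller must provide room for d->pending + in_len bytes. */
static size_t toy_decode(toy_decoder *d, const unsigned char *in, size_t in_len,
                         unsigned char *out, int no_more_input)
{
  size_t written = 0;
  size_t i;

  assert(d != NULL && out != NULL);

  for (i = 0; i < in_len; i++) {
    d->buf[d->pending++] = in[i];
    if (d->pending == TOY_BUF_SIZE) {
      /* A full buffer is always drained, in either mode. */
      memcpy(out + written, d->buf, d->pending);
      written += d->pending;
      d->pending = 0;
    }
  }

  /* Input is exhausted. Without partial flushing, the tail stays in buf
     until the buffer fills or the caller signals the end of the stream. */
  if (d->pending > 0 && (d->partial_flush || no_more_input)) {
    memcpy(out + written, d->buf, d->pending);
    written += d->pending;
    d->pending = 0;
  }
  return written;
}
```

Feeding a 13-byte block to a decoder with `partial_flush` set to 0 returns 0 bytes (the data is held back, as in the failing test case), while the same call with `partial_flush` set to 1 returns all 13 bytes immediately.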

nemequ commented 9 years ago

Excellent, thanks.

If you come up with any good tests for it please consider publishing them; I would definitely be interested in porting them to Squash.

richgel999 commented 9 years ago

Ok, the fixes are checked in. I'll be testing them on Linux and Windows all night.

richgel999 commented 9 years ago

The zlib flushing support code is the least tested and the trickiest to get right. I'll add another test to exercise this code in both buffered and unbuffered mode.