cmcqueen / lzs-compression

Compression using LZS-style algorithm, derived from LZ77-style
MIT License

Decompression output is not complete when using incremental #6

Open orangefour opened 6 years ago

orangefour commented 6 years ago

This test compresses and then decompresses text using the incremental versions of the functions.

static void compress_decompress_incremental()
{
    const char* in_buffer = "Return a string containing a printable representation of an object. "
      "For many types, this function makes an attempt to return a string that would yield an "
      "object with the same value when passed to eval(), otherwise the representation is a "
      "string enclosed in angle brackets that contains the name of the type of the object "
      "together with additional information often including the name and address of the object.";

    uint8_t out_buffer[1024];
    LzsCompressParameters_t     compress_params;
    lzs_compress_init(&compress_params);

    compress_params.inPtr = (const uint8_t*)in_buffer;
    compress_params.inLength = strlen(in_buffer);
    compress_params.outPtr = out_buffer;
    compress_params.outLength = sizeof(out_buffer);

    size_t out_size = lzs_compress_incremental(&compress_params, true);

    char dec_buffer[1024];

    LzsDecompressParameters_t   decompress_params;
    lzs_decompress_init(&decompress_params);

    decompress_params.inPtr = out_buffer;
    decompress_params.inLength = out_size;
    decompress_params.outPtr = (uint8_t*)dec_buffer;
    decompress_params.outLength = sizeof(dec_buffer);

    size_t dec_size = lzs_decompress_incremental(&decompress_params);
    // dec_buffer is not NUL-terminated, so print exactly dec_size bytes.
    printf("Decompressed data \n%.*s\n", (int)dec_size, dec_buffer);
}

Unfortunately, after decompression the last three words ("of the object.") are missing.

cmcqueen commented 6 years ago

After calling lzs_compress_incremental(), examine which bit-flags are set in compress_params.status (refer to LzsCompressStatus_t). That will give you insight into the compression status.

Likewise for decompression, after calling lzs_decompress_incremental(), examine decompress_params.status bit-flags.

cmcqueen commented 6 years ago

In this case, the call to lzs_compress_incremental() returns with compress_params.status set to LZS_C_STATUS_INPUT_FINISHED | LZS_C_STATUS_INPUT_STARVED, indicating that all the input is used up. It is necessary to call the function a second time (again with the 2nd parameter set to true), which flushes remaining data and sets the LZS_C_STATUS_END_MARKER bit in the status.
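
A rough sketch of that calling pattern: keep calling with the 2nd parameter set to true until the end-marker bit appears in the status. So that it compiles without the library, it uses a tiny stand-in coder. Everything prefixed stub_/STUB_/Stub is hypothetical, its flag values are made up, and it only imitates the status behaviour described in this thread. The sketch also assumes that each call advances outPtr and decrements outLength. With the real library, the stand-in would be replaced by lzs_compress_init(), lzs_compress_incremental() and the LZS_C_STATUS_* flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the library; flag values are made up. */
enum {
    STUB_INPUT_STARVED  = 1 << 0,  /* stands in for LZS_C_STATUS_INPUT_STARVED  */
    STUB_INPUT_FINISHED = 1 << 1,  /* stands in for LZS_C_STATUS_INPUT_FINISHED */
    STUB_END_MARKER     = 1 << 2   /* stands in for LZS_C_STATUS_END_MARKER     */
};

typedef struct {
    const uint8_t *inPtr;
    size_t         inLength;
    uint8_t       *outPtr;
    size_t         outLength;
    unsigned       status;
    uint8_t        held[4];   /* bytes consumed but not yet written out */
    size_t         heldLen;
} StubCompressParameters_t;

static void stub_compress_init(StubCompressParameters_t *p)
{
    memset(p, 0, sizeof *p);
}

/* Write the oldest held byte to the output and drop it from the queue. */
static void stub_release_oldest(StubCompressParameters_t *p)
{
    *p->outPtr++ = p->held[0];
    p->outLength--;
    p->heldLen--;
    memmove(p->held, p->held + 1, p->heldLen);
}

/* Copies input instead of compressing it. The last few bytes stay queued;
 * once input is exhausted they are released two per call, and only a later
 * call writes the 1-byte end marker. Returns bytes written by this call. */
static size_t stub_compress_incremental(StubCompressParameters_t *p, bool finish)
{
    const size_t space_before = p->outLength;
    const bool   had_input    = p->inLength > 0;

    p->status = 0;
    while (p->inLength > 0 && p->outLength > 0) {
        if (p->heldLen == sizeof p->held)
            stub_release_oldest(p);
        p->held[p->heldLen++] = *p->inPtr++;
        p->inLength--;
    }
    if (p->inLength == 0) {
        p->status |= STUB_INPUT_STARVED;
        if (finish)
            p->status |= STUB_INPUT_FINISHED;
    }
    if (finish && !had_input) {
        size_t released = 0;
        while (released < 2 && p->heldLen > 0 && p->outLength > 0) {
            stub_release_oldest(p);
            released++;
        }
        if (released == 0 && p->heldLen == 0 && p->outLength > 0) {
            *p->outPtr++ = 0x00;            /* made-up end marker */
            p->outLength--;
            p->status |= STUB_END_MARKER;
        }
    }
    return space_before - p->outLength;
}

/* Compress `in`, then keep calling with finish=true until the end marker is
 * reported. Returns total bytes written, or 0 if `out` filled up first. */
static size_t compress_all(const uint8_t *in, size_t in_len,
                           uint8_t *out, size_t out_cap)
{
    StubCompressParameters_t p;
    size_t total = 0;

    assert(in != NULL && out != NULL);
    stub_compress_init(&p);
    p.inPtr = in;
    p.inLength = in_len;
    p.outPtr = out;
    p.outLength = out_cap;

    for (;;) {
        total += stub_compress_incremental(&p, true);
        if (p.status & STUB_END_MARKER)
            return total;
        if (p.outLength == 0)   /* assumes outLength is decremented per call */
            return 0;
    }
}
```

The loop exits on the status bit rather than after a fixed number of calls, so it does not matter whether flushing takes one extra call or several. The check on outLength is only there to stop the loop from spinning if the output buffer fills before the end marker is written.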

cmcqueen commented 6 years ago

Please see the branch issue-6 for example.

orangefour commented 6 years ago

Thank you for the explanation and example! I played around and noticed that sometimes I need to call lzs_compress_incremental() with the 2nd parameter set to true multiple times before all the data gets flushed. This might not be what the average user expects (yeah, I am just nagging 😁)

cmcqueen commented 6 years ago

I might be able to modify the function so that it completes outputting the end-marker in the first call to the function. I'll see what I can do.