libnxz / power-gzip

POWER NX zlib compliant library
23 stars 18 forks source link

Fix setDictionary functions #140

Closed mscastanho closed 2 years ago

mscastanho commented 2 years ago

The biggest fix needed to make these two functions work was to properly set history and dictionary sizes used by inflate. Both were being round up to 16 byte boundary, but that is not necessary. The requirement is that histLen is a multiple of 16, so we can always use the entire dictionary and add some padding bytes only to the history, making the length of history + dictionary a multiple of 16.

Other changes include some assorted changed to guarantee both functions have the same behavior as their zlib counterparts, code cleanup, and a new test to guarantee the functions keep working as expected.

Fixes #15

abalib commented 2 years ago

Quick comment: while compressing with a dictionary, you don't want to pad the dictionary with extra bytes. Instead, round histlen down to a size of 16. Because you don't want deflate pulling in bytes to the compressed payload that the user supplied dictionary does not have. Otherwise, inflate may complain later about missing bytes. About rounding down histlen: there is no requirement to use all the dictionary. If you not use to up to 15 dictionary bytes by rounding down, that is technically ok. Also when you round down dictionary length, it's probably better to lose bytes from the beginning of a dictionary than end. Because in zlib.h it says put the most important dictionary strings towards the end.

mscastanho commented 2 years ago

a test that compress with NX and decompress with zlib (like test_deflate)

@rzinsly this is already being tested by test_dict.mix2, but I don't think we have an equivalent for the other direction. Let me see what I can do.

mscastanho commented 2 years ago

@rzinsly I thought of a few ways to implement this test without creating a lot of duplicated code (like test_inflate and test_deflate):

  1. Split the current test_dict.c in three separate programs: dict_create, dict_deflate and dict_inflate. Then we could call dict_create once to generate a dictionary file and use in the other two. Then call dict_deflate and dict_inflate with different NX_GZIP_TYPE_SELECTOR value combinations to do what we want. Most code for this would reside in Makefile.am.
  2. Move test_setDictionary function to a separate file and wrap all zlib API calls with PREFIX1 and PREFIX2, one for inflate- other for deflate-related calls. Then we could #include this file in test_dict.c but setting different combinations of values for PREFIX1 and PREFIX2, which could be either sw_ or nx_, this way we could create 4 functions with the same code, but testing different libnxz and zlib combinations. This involves macro magic 🪄
  3. Implement a new type selector that compresses with zlib and decompresses with NX. Then adding the new test would be very trivial, and would help other tests as well. But I'm not sure this is something we are interested in supporting for our users.

Thoughts? Other suggestions are welcome as well.

rzinsly commented 2 years ago

@mscastanho In one of the modes (e.g. raw) you could call nx_inflate() and keep the normal deflate call, in this way test_dict.sw would test the zlib compress with nx decompress.

mscastanho commented 2 years ago

@rzinsly but wouldn't nx_inflate fallback to zlib's inflate since the type selector would be set to software? It wouldn't change much.

rzinsly commented 2 years ago

@rzinsly but wouldn't nx_inflate fallback to zlib's inflate since the type selector would be set to software? It wouldn't change much.

@mscastanho Not if you call the nx_ versions of all inflate related functions, nx_inflate() don't check the selector.

mscastanho commented 2 years ago

@rzinsly Done

mscastanho commented 2 years ago

@tuliom Fixed.