libnxz / power-gzip

POWER NX zlib compliant library

Add `inflateSyncPoint` #90

Closed mscastanho closed 3 years ago

mscastanho commented 3 years ago

This is an RFC for a libnxz version of inflateSyncPoint (issue #77). Please read the details in the commit messages.

Some current issues:

Thoughts?

Fixes #77

abalib commented 3 years ago

> To test the other cases I'd have to find a way to construct an inflate stream where the 0-length literal block started at different bit offsets, but I haven't found a way to do so. Thoughts?

If I understood your goal correctly, here is how to construct such a stream, I think. Embed some arbitrary large text in your test code, or read it from a file. Compress it piecewise, say by calling deflate on every 256 bytes of input. Do a zlib deflate( ... , Z_SYNC_FLUSH) for each piece, accumulating the compressed stream in a buffer in memory. Each time, deflate will output some number of bits ending on an arbitrary bit boundary, depending on the compressibility of the input. The output stream when returning from deflate should also contain an empty 0-length block (because of the Z_SYNC_FLUSH). The padding will be there naturally between the misaligned last data bit and the 0-length block.
Also record the deflate output lengths in an int array. Their running sum tells you the end offset of each 0-length block.
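
A minimal sketch of the compression side described above, written against Python's standard `zlib` module rather than the C API for brevity. It is not the test added in this PR; the helper names `make_text` and `deflate_piecewise`, the word list, and the seed are hypothetical, and the piece size of 256 bytes and count of 50 simply follow the numbers in this comment.

```python
import random
import zlib

PIECE = 256      # bytes of input per deflate call (value taken from the comment)
N_PIECES = 50    # number of pieces (value taken from the comment)


def make_text(n_bytes, seed=0):
    """Hypothetical helper: build some compressible, non-uniform text."""
    rng = random.Random(seed)
    words = [b"inflate", b"deflate", b"sync", b"flush",
             b"power", b"gzip", b"block", b"nx"]
    out = bytearray()
    while len(out) < n_bytes:
        out += rng.choice(words) + b" "
    return bytes(out[:n_bytes])


def deflate_piecewise(data, piece=PIECE):
    """Compress `data` piece by piece, doing a Z_SYNC_FLUSH after each piece.

    Returns the accumulated compressed stream and the cumulative end offset
    of each flushed piece within that stream.
    """
    comp = zlib.compressobj()
    stream = bytearray()
    end_offsets = []
    for start in range(0, len(data), piece):
        stream += comp.compress(data[start:start + piece])
        stream += comp.flush(zlib.Z_SYNC_FLUSH)
        end_offsets.append(len(stream))   # running sum of output lengths
    return bytes(stream), end_offsets


data = make_text(PIECE * N_PIECES)
stream, end_offsets = deflate_piecewise(data)

# Each flushed piece should end with the empty stored block's LEN/NLEN
# bytes, 00 00 FF FF, which is what a sync flush appends after padding.
markers_ok = all(stream[e - 4:e] == b"\x00\x00\xff\xff" for e in end_offsets)
print(len(end_offsets), "pieces, sync markers present:", markers_ok)
```

The offsets in `end_offsets` are the points at which a C test would stop feeding input to inflate and check the sync point. Python's `zlib` module does not expose `inflateSyncPoint`, so that check itself is outside this sketch.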

Now, feed the same stream in the memory buffer to inflate() (either zlib or libnxz, whichever one you're testing). But again do it piecewise, feeding only as many bytes per inflate call as your int array tells you. If you got the lengths right, each piece should contain an empty 0-length block at the end. Therefore you can test inflateSync at that point. How many trials/pieces do you need to cover all 8 alignments? (7/8)^34 gives me about a 1 percent probability of never hitting a given one of the 8 alignments. That is, you can send about 50 pieces of 256 bytes each to deflate, which nearly guarantees testing all possible alignments (about a 0.1% chance of missing any given alignment).
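
The arithmetic above can be checked with a short sketch. It assumes, as the estimate does, that each piece lands on one of the 8 bit alignments uniformly and independently, which real data need not satisfy. The function names `p_miss_given` and `p_miss_any` are hypothetical.

```python
from math import comb


def p_miss_given(n, k=8):
    """Probability that one particular alignment never occurs in n pieces."""
    return ((k - 1) / k) ** n


def p_miss_any(n, k=8):
    """Probability that at least one of the k alignments never occurs in
    n pieces, by inclusion-exclusion over the set of missed alignments."""
    return sum((-1) ** (j + 1) * comb(k, j) * ((k - j) / k) ** n
               for j in range(1, k + 1))


for n in (34, 50):
    print(n, p_miss_given(n), p_miss_any(n))
```

Under that assumption, 34 pieces leave roughly a 1% chance of missing a given alignment and 50 pieces roughly 0.1%. The chance of missing at least one of the 8 alignments is larger, about 8 times the per-alignment figure when that figure is small.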

I might have missed something here and there, but this sketch is the best I can come up with.

mscastanho commented 3 years ago

@abalib Thanks for the suggestion! That worked perfectly. I updated the test to take that approach. By running it in verbose mode I can see that now all possible padding amounts are being checked with high probability.

mscastanho commented 3 years ago

GCC 8 was complaining about the variable declaration inside a `case:`. Fixed now.

mscastanho commented 3 years ago

Also added test_inflatesyncpoint to .gitignore.