kiyo-masui / bitshuffle

Filter for improving compression of typed binary data.
Other
219 stars 76 forks source link

Bshuf_decompress_zstd_block consistent with bshuf_decompress_lz4_block #115

Closed fleon-psi closed 2 years ago

fleon-psi commented 2 years ago

There is currently wrong return value of bshuf_decompress_zstd_block leading to issues with decompression, simple compress/decompress test is giving wrong result.

Signed-off-by: Filip Leonarski filip.leonarski@psi.ch

jrs65 commented 2 years ago

@fleon-psi thanks! Do you know what conditions it needs for the compress/decompress cycle to fail? I would have thought this would be caught by the unit tests, so at the very least I'd like to modify/add them to catch this one.

jrs65 commented 2 years ago

Ah, I think I see why it's not caught. Our unit tests only test zstd via the HDF5 filter pipeline, but the filter pipeline doesn't check the return value from bshuf_compress_zstd_block as thoroughly as the decompress_zstd routine in bitshuffle.ext does.

fleon-psi commented 2 years ago

I use a following Catch2 unit test internally in my code and it triggers failure on REQUIRE(decomp_out == comp_out) assertion:

TEST_CASE("Bitshuffle_ZSTD","[ZSTD]") { std::vector image(5121024 sizeof(int32_t)); std::vector compressed(bshuf_compress_zstd_bound(5121024, 4, 0)); std::vector decompressed(5121024 sizeof(int32_t)); for (int i = 0; i < 5121024; i++) image[i] = i;

int64_t comp_out = bshuf_compress_zstd(image.data(), compressed.data(), RAW_MODULE_SIZE, 4, 0,0);
REQUIRE(comp_out > 0);

int64_t decomp_out = bshuf_decompress_zstd(compressed.data(), decompressed.data(), 512*1024, 4, 0);
REQUIRE(decomp_out == comp_out);

REQUIRE(memcmp(image.data(), decompressed.data(), 512*1024*sizeof(uint32_t)) == 0);

}

jrs65 commented 2 years ago

Okay, great. I've added some unit tests that catch this issue, so at least now we shouldn't have any regressions.

Thanks @fleon-psi !