marklam closed this issue 2 months ago
The error was not detected earlier because I only had tests for the hardware-accelerated version of the shuffle algorithm for type sizes of 1, 2, 4 and 8 bytes. When using variable-length data, the actual data is stored in the global heap (without compression!). The data being compressed are the references (i.e. pointers) to the objects in the global heap. These references have a size of 16 bytes, and I did not have tests for this type size. It is important to test this case because for 16-byte types a different shuffle algorithm is used. I did not have tests for it because previously I did not know how to create proper test data using the original HDF5 library. I have since found a solution to that problem and added some tests.
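For context, the shuffle filter itself is just a byte transposition: byte 0 of every element is stored first, then byte 1, and so on. A minimal pure-Python sketch of the scalar algorithm (not the PureHDF implementation) shows why the element size is a parameter and why each size class needs its own test coverage:

```python
def shuffle(data: bytes, elem_size: int) -> bytes:
    """HDF5 shuffle filter (forward): byte-transpose the elements."""
    n = len(data) // elem_size
    out = bytearray(len(data))
    for i in range(n):               # element index
        for j in range(elem_size):   # byte index within the element
            out[j * n + i] = data[i * elem_size + j]
    return bytes(out)

def unshuffle(data: bytes, elem_size: int) -> bytes:
    """Inverse transform, applied when reading the data back."""
    n = len(data) // elem_size
    out = bytearray(len(data))
    for i in range(n):
        for j in range(elem_size):
            out[i * elem_size + j] = data[j * n + i]
    return bytes(out)
```

A round-trip test with elem_size=16 (the global-heap reference size) is exactly the case that was missing.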
The actual bug was caused by the fact that the hardware-accelerated shuffle implementation is auto-translated from C to C#, with the C code taken from the Blosc2 repository. The C counterpart of the function Avx2.Shuffle requires a reversed shuffle mask (as per the comment in the corresponding file in the Blosc2 repo), whereas the C# function Avx2.Shuffle requires the shuffle mask not to be reversed. The auto-translation of the C code did not reverse the shuffle mask array, so the shuffle function produced garbage.
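To see why the mask order matters, here is a rough pure-Python model of the per-lane byte shuffle that VPSHUFB (the instruction behind Avx2.Shuffle) performs on one 16-byte lane. The identity mask below is purely illustrative, not the actual Blosc2 mask; the point is that C's _mm256_set_epi8 lists its arguments from most-significant byte to least, so copying that argument list verbatim into an array literal (least-significant byte first) hands the instruction a reversed mask that selects entirely different bytes:

```python
def pshufb_lane(src: bytes, mask: bytes) -> bytes:
    """Model of (V)PSHUFB for one 16-byte lane: each output byte is
    src[mask[i] & 0x0F], or 0 if the mask byte's high bit is set."""
    return bytes(0 if m & 0x80 else src[m & 0x0F] for m in mask)

src = bytes(range(16))
intended = bytes(range(16))       # illustrative mask: identity permutation
reversed_mask = intended[::-1]    # what a verbatim set_epi8 translation yields
```

With the intended mask the lane passes through unchanged; with the accidentally reversed mask the bytes come out in reverse order, and reversing it once more ("double-reversed") restores the intended behavior.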
Here is the double-reversed shuffle mask for 16-byte types, which should work fine now:
I fear that for data types > 16 bytes the error persists, because for this kind of data yet another algorithm with a different shuffle mask is used. But here I again have the problem that I do not know how to create proper test data. I created issue #75 to cover this.
PureHDF 1.0.0-beta.11 should solve the error for you :-) I will try to have a look into the other issue tomorrow.
I've updated my repro-repo to demonstrate:
https://github.com/marklam/Roundtrip3DArrayOfStructList/tree/01847ceae628323832465e8a94841c1b4cab4286
It seems that if the dataset is created with the shuffle filter enabled (it doesn't need to be compressed), then the data read back is misaligned.
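A quick way to see what that misalignment looks like: if the reader's unshuffle step is skipped or wrong, every field of every record is assembled from the wrong bytes. A self-contained simulation (the two-uint64 record layout is made up for illustration):

```python
import struct

def shuffle(data: bytes, elem_size: int) -> bytes:
    # byte-transpose, as the HDF5 shuffle filter does on write
    n = len(data) // elem_size
    out = bytearray(len(data))
    for i in range(n):
        for j in range(elem_size):
            out[j * n + i] = data[i * elem_size + j]
    return bytes(out)

def unshuffle(data: bytes, elem_size: int) -> bytes:
    # inverse transform, applied on read
    n = len(data) // elem_size
    out = bytearray(len(data))
    for i in range(n):
        for j in range(elem_size):
            out[i * elem_size + j] = data[j * n + i]
    return bytes(out)

# three 16-byte records of two little-endian uint64 fields each
records = [(1, 100), (2, 200), (3, 300)]
raw = b"".join(struct.pack("<QQ", a, b) for a, b in records)
stored = shuffle(raw, 16)                # what ends up in the file

good = struct.unpack("<QQ", unshuffle(stored, 16)[:16])
bad = struct.unpack("<QQ", stored[:16])  # reader with a broken unshuffle
```

A correct reader recovers (1, 100) for the first record, while a reader whose unshuffle is broken decodes shifted garbage, which matches the misaligned values seen in the repro.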