v1.0.0-beta.14
should fix both issues. The first was caused by the fact that PureHDF - as a workaround for an HDF5 issue - always wrote the fill value into the file. This is not a problem for standard data types like double, where the fill value is only 8 bytes, but for an opaque data type of unlimited size the fill value grows with the element size and effectively doubles the file size. I did not find it in the source code or the spec, but I suspect there is some limit on the maximum size of a fill value, which is why writing worked up to a certain image size.
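If you want to check whether a file written before the fix actually carries a user-defined fill value, h5py's low-level bindings can query the dataset creation property list. A minimal sketch, assuming h5py ≥ 3 and using made-up file and dataset names:

```python
import h5py
from h5py import h5d

# Inspect a dataset's creation property list for a user-defined fill value.
# "written-by-purehdf.h5" and "opaque_dataset" are hypothetical names.
with h5py.File("written-by-purehdf.h5", "r") as f:
    dcpl = f["opaque_dataset"].id.get_create_plist()
    status = dcpl.fill_value_defined()
    if status == h5d.FILL_VALUE_USER_DEFINED:
        print("file stores an explicit fill value")
    elif status == h5d.FILL_VALUE_DEFAULT:
        print("library default fill value")
    else:  # h5d.FILL_VALUE_UNDEFINED
        print("no fill value written")
```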
The other issue was caused by improper use of an internal cache. The base type of an opaque type is a byte array, and for every type the type information (e.g. how to serialize the data to the file) is cached. Opaque data and plain byte[] data must be treated differently, but both shared the same cache entry, so the opaque data was serialized with the wrong type information whenever the cache entry already existed - which was the case in your example because of that extra attribute.
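To illustrate the failure mode - this is a hypothetical Python sketch of the pattern, not PureHDF's actual code - a cache keyed only by the base type lets opaque data and plain byte arrays collide, while including opaqueness in the key separates them:

```python
# Hypothetical sketch of the described bug; names are made up.
_type_info_cache = {}

def make_type_info(value, is_opaque):
    """Stand-in for computing serialization info for a value."""
    return {"kind": "opaque" if is_opaque else type(value).__name__}

def get_type_info_buggy(value, is_opaque):
    key = type(value)  # BUG: opaque bytes and plain bytes share one key
    if key not in _type_info_cache:
        _type_info_cache[key] = make_type_info(value, is_opaque)
    return _type_info_cache[key]

def get_type_info_fixed(value, is_opaque):
    key = (type(value), is_opaque)  # FIX: opaqueness is part of the key
    if key not in _type_info_cache:
        _type_info_cache[key] = make_type_info(value, is_opaque)
    return _type_info_cache[key]

# With the buggy version, whichever entry is cached first (the byte[]
# attribute or the opaque dataset) wins, and the other value is then
# serialized with the wrong type information.
```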
I hope it works better now!
Definitely works much better now :) Thanks a lot for the quick fix. I can now use 20 MB+ files as opaque datasets without any errors.
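For anyone who wants to reproduce the scenario from the Python side only: numpy void dtypes map to HDF5 opaque types in h5py, so an equivalent ~20 MB round trip might look like this sketch (file and dataset names are made up):

```python
import numpy as np
import h5py

payload = bytes(20 * 1024 * 1024)  # ~20 MB of raw bytes, stand-in for an image
dt = np.dtype(f"V{len(payload)}")  # numpy void dtype -> HDF5 opaque type

with h5py.File("opaque_roundtrip.h5", "w") as f:
    f.create_dataset("image", data=np.frombuffer(payload, dtype=dt))

with h5py.File("opaque_roundtrip.h5", "r") as f:
    raw = f["image"][0].tobytes()

assert raw == payload  # the opaque payload survives the round trip
```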
I'm not sure whether the issue comes from the way the HDF5 file is encoded on PureHDF's side or from h5web.
I've tried 1 MB images here.
h5py also isn't happy with it:
Originally posted by @Blackclaws in https://github.com/Apollo3zehn/PureHDF/issues/76#issuecomment-2077207473