Apollo3zehn / PureHDF

A pure .NET library that makes reading and writing of HDF5 files (groups, datasets, attributes, ...) very easy.
MIT License

Write Opaque Datatype #76

Closed Blackclaws closed 2 months ago

Blackclaws commented 2 months ago

I need to write a byte[] as an OPAQUE dataset so that it can be viewed as an image in h5web.

Is there currently a way to write opaque datasets using PureHDF?

Apollo3zehn commented 2 months ago

Hi, no, it is not possible yet, mainly because of a lack of use cases. Do you want to write a single opaque value (scalar) or an array of opaque values?

Blackclaws commented 2 months ago

I'm currently writing a JPG image as a dataset and trying to visualize it with h5web, which only attempts to render an image for opaque datasets. I've also asked the h5web developers to consider another way to visualize a dataset as an image:

https://github.com/silx-kit/h5web/issues/1623

Apollo3zehn commented 2 months ago

Thanks, I'll have a look into this this evening.

Apollo3zehn commented 2 months ago

Support for opaque datatypes has now been added in v1.0.0-beta.13:

var data = File.ReadAllBytes("/home/vincent/Downloads/img.jpg");

var opaqueInfo = new H5OpaqueInfo(
    TypeSize: (uint)data.Length,
    Tag: "My tag"
);

var file = new H5File
{
    ["opaque"] = new H5Dataset(data, opaqueInfo: opaqueInfo)
};

file.Write("/home/vincent/Downloads/testimg.h5");

Blackclaws commented 2 months ago

Thanks a lot for the very quick turnaround on this :)

Blackclaws commented 2 months ago

@Apollo3zehn So with larger opaque datasets I get the following error from h5web:

HDF5-DIAG: Error detected in HDF5 (1.14.2) thread 0:
  #000: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5D.c line 403 in H5Dopen2(): unable to synchronously open dataset
    major: Dataset
    minor: Can't open object
  #001: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5D.c line 364 in H5D__open_api_common(): unable to open dataset
    major: Dataset
    minor: Can't open object
  #002: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 1980 in H5VL_dataset_open(): dataset open failed
    major: Virtual Object Layer
    minor: Can't open object
  #003: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 1947 in H5VL__dataset_open(): dataset open failed
    major: Virtual Object Layer
    minor: Can't open object
  #004: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLnative_dataset.c line 321 in H5VL__native_dataset_open(): unable to open dataset
    major: Dataset
    minor: Can't open object
  #005: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dint.c line 1429 in H5D__open_name(): can't open dataset
    major: Dataset
    minor: Unable to initialize object
  #006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dint.c line 1494 in H5D_open(): not found
    major: Dataset
    minor: Object not found
  #007: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dint.c line 1756 in H5D__open_oid(): can't retrieve message
    major: Dataset
    minor: Can't get value
  #008: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Omessage.c line 432 in H5O_msg_read(): unable to read object header message
    major: Object header
    minor: Read failed
  #009: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Omessage.c line 487 in H5O_msg_read_oh(): unable to decode message
    major: Object header
    minor: Unable to decode value
  #010: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Oshared.h line 74 in H5O__fill_new_shared_decode(): unable to decode native message
    major: Object header
    minor: Unable to decode value
  #011: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Ofill.c line 291 in H5O__fill_new_decode(): ran off end of input buffer while decoding
    major: Object header
    minor: Address overflowed

I'm not sure whether the issue comes from the way the HDF5 file is encoded on PureHDF's side or from h5web.

I tried this with images of about 1 MB.

h5py also isn't happy with it:

>>> v = file["group"]["opaque"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/felix/.cache/pypoetry/virtualenvs/net8-0-xM1W9L25-py3.11/lib/python3.11/site-packages/h5py/_hl/group.py", line 357, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 241, in h5py.h5o.open
KeyError: 'Unable to synchronously open object (ran off end of input buffer while decoding)'

Blackclaws commented 2 months ago

I've encountered two more problems where I'm not sure whether the cause is PureHDF or h5web:

var data = File.ReadAllBytes("/home/felix/screenshot.png");

var file = new H5File();
file["opaque"] = new H5Dataset(data, opaqueInfo: new H5OpaqueInfo((uint)data.Length, "test")); 

var opaqueGroup = new H5Group()
{
    Attributes =
    {
        {"Anything", "Data"}
    },
    ["stuff"] = new H5Group()
    {
     ["opaque"] = new H5Dataset(data, opaqueInfo: new H5OpaqueInfo((uint) data.Length, "test"))   
    }
};

file["group"] = opaqueGroup;

file.Write("testing.h5");

After writing this file (with a screenshot.png small enough not to trigger the error above), opening the top-level group in h5web produces an error:

[h5web explorer view of testing.h5: group > stuff > opaque, plus a top-level opaque dataset; the entry "group" is selected]

HDF5-DIAG: Error detected in HDF5 (1.14.2) thread 0:
  #000: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5A.c line 1043 in H5Aread(): can't synchronously read data
    major: Attribute
    minor: Read failed
  #001: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5A.c line 1011 in H5A__read_api_common(): unable to read attribute
    major: Attribute
    minor: Read failed
  #002: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 1236 in H5VL_attr_read(): attribute read failed
    major: Virtual Object Layer
    minor: Read failed
  #003: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 1205 in H5VL__attr_read(): attribute read failed
    major: Virtual Object Layer
    minor: Read failed
  #004: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLnative_attr.c line 211 in H5VL__native_attr_read(): unable to read attribute
    major: Attribute
    minor: Read failed
  #005: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Aint.c line 754 in H5A__read(): datatype conversion failed
    major: Attribute
    minor: Unable to encode value
  #006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5T.c line 5308 in H5T_convert(): datatype conversion failed
    major: Datatype
    minor: Can't convert datatypes
  #007: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Tconv.c line 3326 in H5T__conv_vlen(): can't read VL data
    major: Datatype
    minor: Read failed
  #008: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Tvlen.c line 840 in H5T__vlen_disk_read(): unable to get blob
    major: Datatype
    minor: Can't get value
  #009: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 7396 in H5VL_blob_get(): blob get failed
    major: Virtual Object Layer
    minor: Can't get value
  #010: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 7367 in H5VL__blob_get(): blob get callback failed
    major: Virtual Object Layer
    minor: Can't get value
  #011: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLnative_blob.c line 123 in H5VL__native_blob_get(): Expected global heap object size does not match
    major: Virtual Object Layer
    minor: Unable to decode value

When the top-level opaque dataset is left out:

var data = File.ReadAllBytes("/home/felix/screenshot.png");

var file = new H5File();

var opaqueGroup = new H5Group()
{
    Attributes =
    {
        {"Anything", "Data"}
    },
    ["stuff"] = new H5Group()
    {
     ["opaque"] = new H5Dataset(data, opaqueInfo: new H5OpaqueInfo((uint) data.Length, "test"))   
    }
};

file["group"] = opaqueGroup;

file.Write("testing.h5");

The dataset is not written as opaque; instead, it is written as Integer (unsigned), 8-bit, little-endian.