HDFGroup / hdf5-json

Specification and tools for representing HDF5 in JSON
https://hdf5-json.readthedocs.io
Other
72 stars 25 forks source link

Fix bug with UTF-8 decoding and automate versioning #72

Closed anabiman closed 2 years ago

anabiman commented 2 years ago

This PR:

The bug referenced in the 1st point can be reproduced via:

import h5py

with h5py.File('foo.hdf5', "w") as fp:
    dt = h5py.string_dtype(encoding='utf-8')
    ds = fp.create_dataset("foo", data=["A", "B", "C"], dtype=dt)

Converting foo.hdf5 to JSON yields an error since the array of strings does not get decoded (it is passed as a list of bytes instead). Note that for encoding="ascii", the conversion works (since in this case Hdf5db.bytesArrayToList is called, which decodes the byte strings).

ajelenak commented 2 years ago

Hi @andrew-abimansour, thank you very much for the PR! I somehow missed the notice, I apologize for such a long delay to merge.