HDFGroup / h5pyd

h5py distributed - Python client library for HDF Rest API

Creating a new dataset containing byte strings does not work w/ data param #53

Closed MRossol closed 1 year ago

MRossol commented 6 years ago

TypeError: Object of type 'bytes' is not JSON serializable
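For context, the same error can be reproduced with the standard library alone. This is a minimal sketch, assuming the failure comes from a bytes value reaching a JSON encoder somewhere in the request path; it does not call h5pyd, and the `payload` dict is purely illustrative.

```python
import json

# Illustrative payload only: a bytes value like one taken from a byte-string dataset.
payload = {"value": b"abc"}

try:
    json.dumps(payload)  # the json module has no default encoding for bytes
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Decoding the bytes to str first gives a value the JSON encoder accepts.
encoded = json.dumps({"value": payload["value"].decode("ascii")})
```

The exact wording of the message varies slightly between Python versions, but it always ends in "is not JSON serializable".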

jbhatch commented 4 years ago

There is a similar issue in h5pyd with metadata attributes whose values are strings. HDF5 files with string-valued metadata attributes were created with h5py and then sent to HSDS with h5pyd. However, when those files were retrieved from HSDS with hsget, the attributes with string values were stripped from the HDF5 file entirely. hsget left the datasets and the attributes with non-string values intact.

To fix the issue with string metadata attribute values being stripped off of an HDF5 file using hsget:

In the utillib.py file under h5pyd/_apps, change lines 277-278 from this:

    srcarr = np.asarray(data, order='C', dtype=src_dt)
    tgtarr = copy_array(srcarr, ctx)

to this:

    if isinstance(data, str):
        tgtarr = np.string_(data)
    else:
        srcarr = np.asarray(data, order='C', dtype=src_dt)
        tgtarr = copy_array(srcarr, ctx)
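The branching in that change can be tried outside of h5pyd. The following is only a sketch under stated assumptions: `attr_value_for_copy` is a hypothetical helper name, the `copy_array(srcarr, ctx)` step from utillib.py is left out because it depends on h5pyd internals, and `np.bytes_` is used in place of `np.string_` because that alias was removed in NumPy 2.0.

```python
import numpy as np


def attr_value_for_copy(data, src_dt=None):
    """Hypothetical helper mirroring the str / non-str branch of the patch."""
    if isinstance(data, str):
        # String attribute values become a NumPy bytes scalar instead of
        # going through the array conversion path.
        return np.bytes_(data.encode("utf-8"))
    # Non-string values are converted to a C-ordered array as before.
    # (The real code would then pass this array to copy_array.)
    return np.asarray(data, order="C", dtype=src_dt)


string_value = attr_value_for_copy("meters")
numeric_value = attr_value_for_copy([1, 2, 3], np.dtype("int32"))
```

Here `string_value` is a bytes scalar and `numeric_value` is an int32 array, so only string inputs take the new path.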

jreadey commented 1 year ago

Somewhere along the line, the issue @MRossol reported has been fixed (with h5pyd version 0.10.3 or higher).

@jbhatch - I'm not sure whether the issue you saw had the same root cause. If you are still seeing it, could you open a new issue with a repro case? I promise to respond with more alacrity this time. :)