HDFGroup / h5pyd

h5py distributed - Python client library for HDF Rest API

`hsload` fails when an attribute has type `Reference` #139

Closed fsvenson closed 1 year ago

fsvenson commented 1 year ago

There seem to be several problems here, but the major one, which prevents `hsload` from finishing even with the `--ignore` flag enabled, is at this line: an `AttributeError` is raised because `data` has no `dtype` attribute when `data` is a `Reference`. Guarding the access with a `hasattr()` check lets `hsload` finish. However, the `Reference` attribute is still not created, because of the following error:

utillib.py:356 ERROR: failed to create attribute PALETTE of object /0_0_0/data -- Object of type Reference is not JSON serializable
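
To illustrate the two failure modes described above, here is a minimal sketch that does not use h5py or h5pyd. `FakeReference`, `describe_dtype`, `to_jsonable`, and the path `/0_0_0/palette` are hypothetical names invented for this example, and encoding a reference as a path string is an assumption, not necessarily how `utillib.py` handles it.

```python
import json


class FakeReference:
    """Hypothetical stand-in for an object reference: it has no dtype attribute."""

    def __init__(self, path):
        self.path = path


def describe_dtype(data):
    # Guard the attribute access instead of assuming every value has a dtype.
    return str(data.dtype) if hasattr(data, "dtype") else None


def to_jsonable(value):
    # Assumed encoding for this sketch: represent a reference by its path string.
    if isinstance(value, FakeReference):
        return value.path
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


ref = FakeReference("/0_0_0/palette")  # hypothetical target path

# Without the hasattr() guard, ref.dtype would raise AttributeError.
dtype_name = describe_dtype(ref)

# Without a default hook, json.dumps rejects the reference object.
try:
    json.dumps({"PALETTE": ref})
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With the hook, the attribute value can be serialized.
encoded = json.dumps({"PALETTE": ref}, default=to_jsonable)
print(dtype_name, error_message, encoded)
```

The same pattern (guard first, then convert to a JSON-friendly value before sending) would apply to a real reference object, though the actual conversion depends on what the server expects.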

jreadey commented 1 year ago

Hi, I checked with a test file that contains a dataset with reference types, but didn't see any problems with hsload. Can you share your file? I'll take a look. Thanks!

fsvenson commented 1 year ago

Sure, here is a test file: test.zip. It's a NetCDF4 pattern file. Some data variables have custom compression (which I understand makes them incompatible with HSDS in other ways), but I assume that is not the cause of the problem in this case.

jreadey commented 1 year ago

Hey @fsvenson,

Sorry for taking so long to get to this. I have a new h5pyd version, 0.13.1, that should fix the issue. I got a failure running `hsload test.nc /home/john/`, but that seems to be in the dataset values copy, because of the custom compression. Running `hsload --nodata test.nc /home/john/` went fine. I expect even the first `hsload` will work if you have the right custom filter plugin. Anyway, let me know how it goes.

jreadey commented 1 year ago

This is resolved in h5pyd version 0.14.0. I added a test file that uses reference attributes ("a_objref.h5") to the load_files.py test. The file is loaded with hsload, and the resulting domain is then fetched with hsget to verify that everything is working correctly.
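
For anyone who wants to repeat that round trip by hand, the sketch below only builds the two command lines; it is not the code from load_files.py. The helper `round_trip_commands`, the output name `a_objref_copy.h5`, and the folder `/home/john/` are placeholders, and it assumes that hsload, given a folder ending in `/`, names the domain after the source file.

```python
import os
import shlex


def round_trip_commands(h5_file, folder, out_file):
    """Build hsload and hsget command lines for a load-then-fetch check."""
    # Assumption: with a folder target, the domain is <folder>/<file name>.
    domain = folder.rstrip("/") + "/" + os.path.basename(h5_file)
    return [
        ["hsload", h5_file, folder],    # upload the local file to the server
        ["hsget", domain, out_file],    # download the domain to a new local file
    ]


commands = round_trip_commands("a_objref.h5", "/home/john/", "a_objref_copy.h5")
for cmd in commands:
    # Print rather than execute, so the sketch runs without a server.
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once a server and credentials are configured; comparing the attributes of the original and the downloaded file is then a separate step.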