Closed jreadey closed 1 month ago
In some cases, selections on an H5D_CONTIGUOUS_REF dataset can fail. It looks like HSDS is sending a range GET past the end of the HDF5 file.
E.g.:
    import h5pyd

    filename = "/nrel/ncdb/4km-Hourly-CONUS/v1.0.0/RCP4.5/ncdb_rcp4.5_2006.h5"
    f = h5pyd.File(filename, bucket="nrel-pds-hsds")
    meta = f["meta"]
    print(meta)
    chunk_shape = meta.chunks[0]
    layout = meta.id.dcpl_json["layout"]
    print(layout)
    index = 506167
    chunk_id = index // chunk_shape
    item = meta[index]
    print(f"meta[{index}]: {item} chunk: {chunk_id}")
    index += 1
    chunk_id = index // chunk_shape
    print(f"chunk_id: {chunk_id}")
    item = meta[index]  # dies here
    print(f"meta[{index}]: {item}")
This is the corresponding log from the DN:
    INFO> s3Client.get_object(4km-Hourly-CONUS/v1.0.0/RCP4.5/ncdb_rcp4.5_2006.h5[55010906968:55013029608] bucket=nrel-pds-ncdb) start=1726758646.2565 finish=1726758651.1452 elapsed=4.8887 bytes=2121210
    INFO> read: 2121210 bytes for key: 4km-Hourly-CONUS/v1.0.0/RCP4.5/ncdb_rcp4.5_2006.h5
    WARN> requested 2122640 bytes but got 2121210 bytes
    DEBUG> _uncompress(compressor=None, shuffle=0)
    ERROR> Unable to retrieve chunk array: cannot reshape array of size 16317 into shape (16328,)
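The numbers in the DN log can be cross-checked with a little arithmetic. The sketch below is only a reading of the log, not HSDS code: it assumes the chunk length is the 16328 shown in the reshape error (the script above never prints `chunk_shape`), and it derives the item size from the requested byte count. Under those assumptions the read comes back exactly 11 elements short, and index 506168 is the first element of the failing chunk.

```python
# Values copied from the DN log above.
range_start = 55010906968
range_end = 55013029608
bytes_received = 2121210
elements_received = 16317   # "array of size 16317"
chunk_elements = 16328      # "shape (16328,)"; assumed to equal meta.chunks[0]

# Size of the range GET and the implied size of one element.
bytes_requested = range_end - range_start
item_size = bytes_requested // chunk_elements

print(bytes_requested)                     # 2122640, matches the WARN line
print(item_size)                           # 130 bytes per element
print(bytes_received // item_size)         # 16317 whole elements received
print(bytes_requested - bytes_received)    # 1430 bytes missing
print(chunk_elements - elements_received)  # 11 elements missing

# Chunk ids for the two indices used in the reproduction script.
print(506167 // chunk_elements)            # 30 (last element of that chunk)
print(506168 // chunk_elements)            # 31 (first element of the failing chunk)
```

A shortfall that is a whole number of elements (1430 = 11 × 130) is consistent with the requested range extending past the end of the file rather than with a corrupted read.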
Fix is here: https://github.com/HDFGroup/hsds/pull/396
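For readers who want a feel for the failure mode, here is a minimal sketch of two generic ways to cope with a range that runs past the end of a file: trim the request to the file size, or pad a short read back up to the full chunk size. This is not the code from the PR above, and `clamp_range` and `pad_chunk` are hypothetical names rather than HSDS functions. The file size used in the example is inferred from the log (range start plus bytes actually returned), and zero is assumed as the fill byte.

```python
def clamp_range(offset: int, length: int, file_size: int) -> tuple:
    """Trim a byte range so it does not extend past the end of the file."""
    end = min(offset + length, file_size)
    return offset, max(end - offset, 0)


def pad_chunk(buf: bytes, expected_nbytes: int, fill: bytes = b"\x00") -> bytes:
    """Pad a short read with a single fill byte up to the full chunk size."""
    missing = expected_nbytes - len(buf)
    if missing <= 0:
        return buf[:expected_nbytes]
    return buf + fill * missing


# Inferred, not confirmed: the object ends where the short read ended.
file_size = 55010906968 + 2121210

offset, length = clamp_range(55010906968, 2122640, file_size)
print(length)  # 2121210, the byte count S3 actually returned

# Stand-in for the truncated read, padded so it reshapes to the full chunk.
padded = pad_chunk(b"\x01" * length, 2122640)
print(len(padded))  # 2122640
```

Whether padding is appropriate depends on how the dataset extent is tracked: the padded elements lie beyond the end of the data and should never be returned to a client.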