HDFGroup / h5pyd

h5py distributed - Python client library for HDF REST API

h5pyd and h5py slicing difference #87

Closed MRossol closed 4 years ago

MRossol commented 4 years ago

h5pyd and h5py handle the following slicing differently:

with h5py.File(path, 'r') as f:
    data = f['dataset'][:, :1]

data.shape -> (n, 1)

with h5pyd.File(path, 'r') as f:
    data = f['dataset'][:, :1]

data.shape -> (n,)

Example:

path = '/datasets/NSRDB/v3/nsrdb_2013.h5'
with h5py.File(path, 'r') as f:
    dni = f['dni'][:, :1]

array([[443.], [530.], [528.], ..., [698.], [698.], [698.]], dtype=float32)

path = '/nrel/nsrdb/v3/nsrdb_2013.h5'
with h5pyd.File(path, 'r') as f:
    dni = f['dni'][:, :1]

array([443., 530., 528., ..., 698., 698., 698.], dtype=float32)
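
For reference, the h5py result matches plain NumPy indexing, where a slice keeps the axis and an integer index drops it. The sketch below uses a small in-memory array as a stand-in for the dataset; the array contents and variable names are assumptions for illustration only, and nothing here touches h5py or h5pyd.

```python
import numpy as np

# Hypothetical in-memory stand-in for a 2-D dataset (values are made up).
arr = np.arange(12, dtype=np.float32).reshape(4, 3)

kept = arr[:, :1]    # slice on axis 1: the axis is kept with length 1
dropped = arr[:, 0]  # integer index on axis 1: the axis is removed

print(kept.shape)     # (4, 1)
print(dropped.shape)  # (4,)
```

So `[:, :1]` is expected to give `(n, 1)`, and `(n,)` is what `[:, 0]` would give.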

jreadey commented 4 years ago

I had the impression that h5py reduced the number of dimensions by dropping length-1 dimensions in this case. It must have been a previous version...

Anyway, the above commit should make the slicing compatible with h5py.
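
For code that still has to run against an h5pyd version without that commit, one possible workaround is to put the length-1 axis back after reading. This is only a sketch: `restore_column_axis` is a hypothetical helper, not part of h5py or h5pyd, and it assumes the caller knows the selection should have been a single column.

```python
import numpy as np

def restore_column_axis(data):
    """Hypothetical helper: return data as (n, 1) if it arrived as (n,)."""
    data = np.asarray(data)
    if data.ndim == 1:
        # Append the trailing length-1 axis that the slice should have kept.
        return data[:, np.newaxis]
    return data

# A 1-D result like the one h5pyd returned in the report above.
flat = np.array([443., 530., 528.], dtype=np.float32)
col = restore_column_axis(flat)
print(col.shape)  # (3, 1)
```

Arrays that already have two dimensions pass through unchanged, so the helper is harmless once the fixed version is in use.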