HDFGroup / datacontainer

Data Container Study

Can't read szip data #27

Closed: jreadey closed this issue 8 years ago

jreadey commented 8 years ago

h5py can read the other compressed formats (blosc, mafisc, zlib), but not szip. Reading the szip data raises this error: OSError: Can't read data (Required filter (name unavailable) is not registered)

Strangely, h5dump works fine and the dataset values get printed normally.
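
One possible explanation, offered as a guess rather than a diagnosis, is that h5py and h5dump are linked against different HDF5 builds and only one of them includes szip. The sketch below asks the HDF5 library behind h5py whether the szip filter is registered. The function name szip_filter_report is made up for illustration, and it returns a short dict if h5py cannot be imported at all.

```python
import importlib


def szip_filter_report():
    """Describe szip support in the HDF5 library that h5py is linked against.

    szip_filter_report is a hypothetical helper name, not part of h5py.
    """
    try:
        h5py = importlib.import_module("h5py")
    except ImportError:
        # h5py is not installed in this interpreter.
        return {"h5py_found": False}

    from h5py import h5z  # low-level filter API

    return {
        "h5py_found": True,
        "h5py_version": h5py.version.version,
        "hdf5_version": h5py.version.hdf5_version,
        # filter_avail reports whether the filter is registered in this build.
        "szip_available": bool(h5z.filter_avail(h5z.FILTER_SZIP)),
        "deflate_available": bool(h5z.filter_avail(h5z.FILTER_DEFLATE)),
    }


if __name__ == "__main__":
    for key, value in szip_filter_report().items():
        print(f"{key}: {value}")
```

If szip_available is False while deflate_available is True, that would be consistent with the error above: the HDF5 build used by h5py was probably compiled without szip.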

hyoklee commented 8 years ago

I think something must be wrong in your environment. I have no trouble reading szip data from Python. See the test script and its output below, run on the joe-test-issue25 instance at OSDC, which runs the issue25 image.

ubuntu@joe-test-issue25:~/data$ python test_compression.py 
-21.3293
ubuntu@joe-test-issue25:~/data$ more test_compression.py
import h5py
import numpy
# file_path = 'GSSTF_NCEP.3.1987.07.01.blosc.he5'
file_path = 'GSSTF_NCEP.3.1987.07.01.szip.he5'
h5path = '/HDFEOS/GRIDS/NCEP/Data Fields/Tair_2m'
with h5py.File(file_path, 'r') as f:
    dset = f[h5path]

    # mask fill value
    if '_FillValue' in dset.attrs:
        arr = dset[...]
        fill = dset.attrs['_FillValue'][0]
        v = arr[arr != fill]
    else:
        v = dset[...]
        # file name GSSTF_NCEP.3.YYYY.MM.DD.he5

    # Python 3 style.
    print(numpy.min(v))

jreadey commented 8 years ago

@hyoklee - Could you try this on snapshot13? I'm not clear on how the environment setup there differs from the issue25 image.

hyoklee commented 8 years ago

@jreadey snapshot13 uses Anaconda, which I don't use. I built Python, HDF5, and h5py from scratch for the issue25 image.

Last login: Tue Nov 24 03:11:16 2015 from 172.17.192.2
discarding /home/ubuntu/anaconda/bin from PATH
prepending /home/ubuntu/anaconda/envs/py34/bin to PATH
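
Since the login messages show PATH being rewritten to an Anaconda environment, it may help to confirm which interpreter and which h5py installation a script actually picks up on each image. A minimal sketch, assuming nothing beyond the standard library (describe_environment is a hypothetical helper name):

```python
import importlib.util
import sys


def describe_environment(module_name="h5py"):
    """Report the running interpreter and where a module would be loaded from.

    describe_environment is a hypothetical helper name used for illustration.
    """
    # find_spec returns None when a top-level module is not installed.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "module_path": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this on both snapshot13 and the issue25 image should show whether h5py comes from the Anaconda environment or from the from-scratch build.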