pycroscopy / SciFiReaders

Tools for extracting data and metadata from scientific data files
https://pycroscopy.github.io/SciFiReaders/about.html
MIT License

NSIDReader has issues when trying to import hdf5 file with multiple files #130

Open ramav87 opened 1 month ago

ramav87 commented 1 month ago

When an hdf5 file has multiple files, NSIDReader appears to fail.

Additionally, it is not possible to provide an HDF group when instantiating the reader, contrary to the documentation. Both are critical functions that need to be fixed. For example, in the Intro to pycroscopy notebook, it is impossible to recover the fitted IV spectra from the generated hdf5 file.

ramav87 commented 1 month ago

I have narrowed this down to a single line in pyNSID: line 143 in hdf_utils.py

setattr(dataset, key, h5_group_to_dict(dset.parent[key])[key])

Commenting out this line appears to solve the issue, but this is clearly not ideal. The function already checks whether dset.parent[key] is an h5py group, but why it fails for some groups needs to be investigated.
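One way to investigate which groups are affected is to list every dataset nested anywhere below a given group. The following is only a diagnostic sketch: the helper name nested_dataset_paths and the group and dataset names in the demo file are made up for illustration and are not part of pyNSID or sidpy. It relies only on h5py's Group.visititems.

```python
import h5py


def nested_dataset_paths(group):
    """Return the relative paths of all datasets found below `group`.

    Hypothetical helper for debugging; not part of pyNSID or sidpy.
    """
    found = []

    def visit(name, obj):
        # visititems calls this for every object below `group`;
        # returning None tells h5py to keep walking.
        if isinstance(obj, h5py.Dataset):
            found.append(name)

    group.visititems(visit)
    return sorted(found)


# Build a small in-memory file (nothing is written to disk).
# All names below are illustrative assumptions.
with h5py.File('diagnostic_demo.h5', 'w', driver='core',
               backing_store=False) as h5_file:
    meta = h5_file.create_group('metadata')
    meta.attrs['instrument'] = 'example'
    fit = meta.create_group('fit')
    fit.create_dataset('coefficients', data=[1.0, 2.0])
    paths = nested_dataset_paths(meta)

print(paths)
```

A group that passes an isinstance(..., h5py.Group) check at the top level can still contain datasets further down, so any non-empty result here marks a group worth checking.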

File ~/Github/sidpy/sidpy/hdf/hdf_utils.py:818, in h5_group_to_dict(group_iter, group_dict)
    815 group_dict[group_iter.name.split('/')[-1]] = dict(group_iter.attrs)
    817 for key in group_iter.keys():
--> 818     h5_group_to_dict(group_iter[key], group_dict[group_iter.name.split('/')[-1]])
    819 return group_dict

File ~/Github/sidpy/sidpy/hdf/hdf_utils.py:811, in h5_group_to_dict(group_iter, group_dict)
    796 """ 
    797 Reads a hdf5 group into a nested dictionary
    798 
   (...)
    807 group_dict: dict
    808 """
    810 if not isinstance(group_iter, h5py.Group):
--> 811     raise TypeError('we need a h5py group to read from. Type given was {}'.format(type(group_iter)))
    812 if not isinstance(group_dict, dict):
    813     raise TypeError('group_dict needs to be a python dictionary')

TypeError: we need a h5py group to read from. Type given was <class 'h5py._hl.dataset.Dataset'>
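
The traceback suggests that the recursion at line 818 descends into every key of the group, including keys that point to datasets, which then fail the isinstance check at line 810. A minimal sketch of one possible fix, assuming datasets should simply be skipped, is below. The function name group_to_dict_skipping_datasets and the demo file contents are hypothetical, and this is not the sidpy implementation.

```python
import h5py


def group_to_dict_skipping_datasets(group, out=None):
    """Read the attributes of `group` and its sub-groups into a nested dict.

    Hypothetical variant for illustration: dataset children are ignored
    instead of being passed back into the recursion.
    """
    if not isinstance(group, h5py.Group):
        raise TypeError('expected an h5py.Group, got {}'.format(type(group)))
    if out is None:
        out = {}

    name = group.name.split('/')[-1]
    out[name] = dict(group.attrs)

    for key in group.keys():
        child = group[key]
        # Recurse only into sub-groups; a Dataset child here is what
        # would otherwise trigger the TypeError.
        if isinstance(child, h5py.Group):
            group_to_dict_skipping_datasets(child, out[name])
    return out


# In-memory demo file; all names are illustrative assumptions.
with h5py.File('fix_demo.h5', 'w', driver='core',
               backing_store=False) as h5_file:
    meta = h5_file.create_group('metadata')
    meta.attrs['instrument'] = 'example'
    fit = meta.create_group('fit')
    fit.attrs['model'] = 'linear'
    meta.create_dataset('raw', data=[1, 2, 3])  # dataset next to a sub-group
    result = group_to_dict_skipping_datasets(meta)

print(result)
```

Whether skipping datasets is the right behaviour, or whether their attributes should also be collected, is a design decision for the maintainers.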