prjemian / punx

Python Utilities for NeXus HDF5 files
https://prjemian.github.io/punx

BUG: tree cannot parse some NeXus examples #130

Closed prjemian closed 2 years ago

prjemian commented 3 years ago

While looking for examples of the depends_on attribute, I encountered a TypeError: Can't iterate over a scalar dataset exception when reading two of the matching files with punx tree.

(bluesky_2020_9) mintadmin@mint-vm:~/.../NeXus/exampledata$ git grep depends_on
Binary file DLS/i03_i04_NXmx/hdf5/Therm_6_2.nxs matches
Binary file DLS/i16/hdf5/538039.nxs matches
Binary file DLS/reflections/hdf5/thaumatin_integrated.nxs matches
Binary file DLS/reflections/hdf5/thaumatin_integrated_multisample.nxs matches
Binary file SwissFEL/hdf5/lyso009a_0087.JF07T32V01_master.h5 matches
Binary file autogenerated_examples/nxdl/applications/NXmx.hdf5 matches
nxdl/nxdl_validate_out.md:definition=NXmx.nxdl.xml message="Cannot even find the starting point of the depends_on chain, !some char data!" nxdlPath=/NXentry/NXsample/name sev=error dataPath=/untitled_entry/untitled_sample/depends_on dataFile=NXmx.hdf5
nxdl/nxdl_validate_out.md:definition=NXmx.nxdl.xml message="Cannot even find the starting point of the depends_on chain, !some char data!" nxdlPath=/NXentry/NXinstrument/NXdetector sev=error dataPath=/untitled_entry/untitled_instrument/untitled_detector/depends_on dataFile=NXmx.hdf5

Whether punx tree can read each matching data file:

file                                                        readable
DLS/i03_i04_NXmx/hdf5/Therm_6_2.nxs                         no
DLS/i16/hdf5/538039.nxs                                     yes
DLS/reflections/hdf5/thaumatin_integrated.nxs               yes
DLS/reflections/hdf5/thaumatin_integrated_multisample.nxs   yes
SwissFEL/hdf5/lyso009a_0087.JF07T32V01_master.h5            no
autogenerated_examples/nxdl/applications/NXmx.hdf5          yes

Example exception:

(bluesky_2020_9) mintadmin@mint-vm:~/.../NeXus/exampledata$ punx tree SwissFEL/hdf5/lyso009a_0087.JF07T32V01_master.h5 | grep depends_on
Traceback (most recent call last):
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/bin/punx", line 10, in <module>
    sys.exit(main())
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/main.py", line 456, in main
    args.func(args)
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/main.py", line 199, in func_tree
    report = mc.report(args.show_attributes)
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 65, in report
    tree_string_list = self._renderGroup(f, txt, indentation = "")
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 116, in _renderGroup
    s += self._renderGroup(value, itemname, indentation+"  ")
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 104, in _renderGroup
    s += self._renderDataset(value, itemname, indentation+"  ")
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 142, in _renderDataset
    txType = self._renderDsType(dset)
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 195, in _renderDsType
    [str(o.dtype.itemsize) for o in obj])
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 195, in <listcomp>
    [str(o.dtype.itemsize) for o in obj])
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 472, in __iter__
    raise TypeError("Can't iterate over a scalar dataset")
TypeError: Can't iterate over a scalar dataset
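
The last frames show h5py refusing to iterate a dataset whose shape is (). A minimal sketch of that behavior and of one possible guard follows. It is not the punx fix: the helper name safe_items and the in-memory demo file are assumptions made only for illustration.

```python
import h5py
import numpy as np


def safe_items(dset):
    """Return the dataset's elements as a list; a scalar dataset counts as one element."""
    if dset.shape == ():
        return [dset[()]]  # read the single value instead of iterating
    return list(dset)  # iterates over the first axis


# In-memory file (nothing is written to disk); contents are hypothetical.
with h5py.File("scalar_demo.h5", "w", driver="core", backing_store=False) as f:
    scalar = f.create_dataset("depends_on", data=np.float64(1.5))
    vector = f.create_dataset("values", data=np.arange(3))

    try:
        list(scalar)  # same kind of iteration as in the traceback
        raised, message = False, ""
    except TypeError as exc:
        raised, message = True, str(exc)

    scalar_items = safe_items(scalar)
    vector_items = safe_items(vector)

print(raised, message)
print(len(scalar_items), len(vector_items))
```

Checking the shape before iterating keeps the scalar and array cases on separate paths, so a report routine could render either one without relying on exception handling.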

Another example exception:

(bluesky_2020_9) mintadmin@mint-vm:~/.../NeXus/exampledata$ punx tree DLS/i03_i04_NXmx/hdf5/Therm_6_2.nxs

!!! WARNING: this program is not ready for distribution.

Traceback (most recent call last):
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/bin/punx", line 10, in <module>
    sys.exit(main())
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/main.py", line 456, in main
    args.func(args)
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/main.py", line 199, in func_tree
    report = mc.report(args.show_attributes)
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 65, in report
    tree_string_list = self._renderGroup(f, txt, indentation = "")
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 116, in _renderGroup
    s += self._renderGroup(value, itemname, indentation+"  ")
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 104, in _renderGroup
    s += self._renderDataset(value, itemname, indentation+"  ")
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 142, in _renderDataset
    txType = self._renderDsType(dset)
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 195, in _renderDsType
    [str(o.dtype.itemsize) for o in obj])
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/punx/h5tree.py", line 195, in <listcomp>
    [str(o.dtype.itemsize) for o in obj])
  File "/home/mintadmin/Apps/anaconda/envs/bluesky_2020_9/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 472, in __iter__
    raise TypeError("Can't iterate over a scalar dataset")
TypeError: Can't iterate over a scalar dataset

prjemian commented 2 years ago

The files cited are in the NeXus example data repository.

prjemian commented 2 years ago

The data file DLS/i03_i04_NXmx/hdf5/Therm_6_2.nxs is a great test case: its dataset /entry/data/data_000001 is an external link to path /data in file Therm_6_2_000001.h5, and that external file is not available.

prjemian commented 2 years ago

It is also much smaller than SwissFEL/hdf5/lyso009a_0087.JF07T32V01_master.h5. However, the SwissFEL file is a better example since it has both a missing external file link and a soft link to the missing dataset:

NeXus/exampledata/SwissFEL/hdf5/lyso009a_0087.JF07T32V01_master.h5 : NeXus data file
  entry:NXentry
    @NX_class = NXentry
    definition:NX_CHAR = NXmx
    data:NXdata
      @NX_class = NXdata
      data: external file missing
        @file = lyso009a_0087.JF07T32V01.h5
        @path = data/data
    instrument:NXinstrument
      @NX_class = NXinstrument
      ELE_D0:NXdetector
        @NX_class = NXdetector
        data: --> /entry/data/data
        pixel_mask:NX_INT32[16448,1030] = __array
...
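
For anyone reproducing this without the example data, here is a rough sketch of how h5py can report both kinds of link without opening their targets. It builds a small in-memory stand-in that reuses the names from the tree above; the helper describe_link, the (2, 3) mask shape, and the demo file name are assumptions for illustration, not punx code.

```python
import h5py


def describe_link(group, name):
    """Return (kind, target, resolves) for the link called `name` in `group`."""
    link = group.get(name, getlink=True)  # link object; the target is not opened
    if isinstance(link, h5py.ExternalLink):
        kind, target = "external", f"{link.filename}:{link.path}"
    elif isinstance(link, h5py.SoftLink):
        kind, target = "soft", link.path
    else:
        kind, target = "hard", None
    try:
        group[name]  # opening the target fails for a dangling link
        resolves = True
    except (KeyError, OSError):
        resolves = False
    return kind, target, resolves


# In-memory stand-in for the master file; nothing is written to disk.
with h5py.File("links_demo.h5", "w", driver="core", backing_store=False) as f:
    data = f.create_group("entry/data")
    detector = f.create_group("entry/instrument/ELE_D0")
    # External link to a file that does not exist, as in the SwissFEL example.
    data["data"] = h5py.ExternalLink("lyso009a_0087.JF07T32V01.h5", "data/data")
    # Soft link that points at the dangling external link.
    detector["data"] = h5py.SoftLink("/entry/data/data")
    detector.create_dataset("pixel_mask", shape=(2, 3), dtype="i4")

    external = describe_link(data, "data")
    soft = describe_link(detector, "data")
    hard = describe_link(detector, "pixel_mask")

print(external)
print(soft)
print(hard)
```

Asking for the link object first means a tree renderer can still print the @file and @path of an external link, or the target of a soft link, even when the linked data cannot be opened.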