data-exchange / dxchange

data exchange supporting tomopy
http://dxchange.readthedocs.io

Fixes IndexError on read_hdf_meta #100

Closed swelborn closed 2 years ago

swelborn commented 2 years ago

The new dxchange.reader.read_hdf_meta function fails on generic HDF5 or DXF files written with dxchange itself (write_hdf5 or write_dxf). This PR fixes that, but I'm not sure this is the behavior you want.

Code to reproduce:

import dxchange.writer
import numpy as np
a = np.random.rand(100,100,100)
dxchange.writer.write_dxf(a, r"C:\Users\samwe\Desktop\fake_dxf_data.h5")
_, meta = dxchange.reader.read_hdf_meta(r"C:\Users\samwe\Desktop\fake_dxf_data.h5")
_

Produces the following error:



Input In [4], in <module>
      5 a = np.random.rand(100,100,100)
      6 dxchange.writer.write_dxf(a, r"C:\Users\samwe\Desktop\fake_dxf_data.h5")
----> 7 _, meta = dxchange.reader.read_hdf_meta(r"C:\Users\samwe\Desktop\fake_dxf_data.h5")
      8 _

File c:\users\samwe\dxchange\dxchange\reader.py:670, in read_hdf_meta(fname, add_shape)
    667 meta = {}
    669 with h5py.File(fname, 'r') as hdf_object:
--> 670     _extract_hdf(tree, meta, hdf_object, add_shape=add_shape)
    671 # for entry in tree:
    672 #     print(entry)
    673 return tree, meta

File c:\users\samwe\dxchange\dxchange\reader.py:767, in _extract_hdf(tree, meta, hdf_object, prefix, key, level, add_shape)
    765 if index == 0:
    766     tree.append(prefix + PIPE)
--> 767 _add_branches(tree, meta, hdf_object, key, key1, index, last_index,
    768               prefix, connector, level, add_shape)

File c:\users\samwe\dxchange\dxchange\reader.py:715, in _add_branches(tree, meta, hdf_object, key, key1, index, last_index, prefix, connector, level, add_shape)
    713 if isinstance(obj, h5py.Dataset):
    714     shape = str(obj.shape)
--> 715     if obj.shape[0]==1:
    716         s = obj.name.split('/')
    717         name = "_".join(s)[1:]

IndexError: tuple index out of range

decarlof commented 2 years ago

@samwelborn can you share the fake_dxf_data.h5 somewhere?

swelborn commented 2 years ago

You can create fake_dxf_data.h5 yourself by running the first commands from the reproduction snippet:

import dxchange.writer
import numpy as np
a = np.random.rand(100,100,100)
dxchange.writer.write_dxf(a, r"C:\Users\samwe\Desktop\fake_dxf_data.h5")  # change this to your folder

You can also use write_hdf5 with the same result.

decarlof commented 2 years ago

@samwelborn thanks for catching this. Indeed, it fails on /implements: that dataset's shape is (), so obj.shape[0] raises the IndexError. Your change solves the issue.
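
For illustration, a guard of roughly this shape avoids the error. This is a minimal sketch, not the actual change in this PR: is_single_entry is a hypothetical helper, not part of dxchange. It takes a dataset's shape tuple (what h5py.Dataset.shape returns) and, as an assumption, treats a scalar dataset with shape () the same way as a dataset whose first axis has length 1.

```python
def is_single_entry(shape):
    """Return True for scalar shapes, (), and for shapes whose first axis has length 1.

    Hypothetical helper for illustration only; not part of dxchange.
    """
    # Check the length first so that shape[0] is never evaluated on an empty tuple.
    return len(shape) == 0 or shape[0] == 1


# Shapes as h5py would report them: a scalar dataset, a one-element dataset,
# and the 3-D array written in the reproduction above.
for shape in [(), (1,), (100, 100, 100)]:
    print(shape, is_single_entry(shape))
```

Whether a scalar dataset should be handled like a one-element dataset or simply skipped is a design choice, which is the same question raised in the PR description.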