titusjan / argos

Argos: a data viewer that can read HDF5, NetCDF4, and other file formats.
GNU General Public License v3.0

Can't read HDF5 created with pandas. #11

Closed joneugster closed 5 years ago

joneugster commented 5 years ago

Hi,

Your tool looks amazing! Unfortunately it doesn't seem to like my HDF5 files, which were created with pandas.

Maybe you could implement that, too. Best regards!

Reproduction:

Execute this in Python to create the HDF5 file:

```
import pandas as pd
import numpy as np

with pd.HDFStore('NOT_IMPLEMENTED.h5') as store:
    df = pd.DataFrame(np.random.rand(4, 4),
                      columns=['A', 'B', 'C', 'D'],
                      index=pd.date_range("20180101 00:00", periods=4))
    store.put('Test', df, format='table', data_columns=True)
```

Then open it in argos (argos NOT_IMPLEMENTED.h5) and open the group "Test" in the treeview.

Error Message

Bug: uncaught OSError Unable to read attribute (no appropriate function for conversion path)

```
Traceback (most recent call last):
  File "/anaconda3/lib/python3.7/site-packages/argos/qt/treemodels.py", line 91, in data
    return self.itemData(item, index.column(), role=role)
  File "/anaconda3/lib/python3.7/site-packages/argos/repo/repotreemodel.py", line 113, in itemData
    return super(RepoTreeModel, self).itemData(treeItem, column, role=role)
  File "/anaconda3/lib/python3.7/site-packages/argos/qt/treemodels.py", line 117, in itemData
    return item.decoration
  File "/anaconda3/lib/python3.7/site-packages/argos/repo/baserti.py", line 274, in decoration
    return rtiIconFactory.getIcon(self.iconGlyph, isOpen=not self.canFetchChildren(),
  File "/anaconda3/lib/python3.7/site-packages/argos/repo/rtiplugins/hdf5.py", line 359, in iconGlyph
    if self._h5Dataset.attrs.get('CLASS', None) == b'DIMENSION_SCALE':
  File "/anaconda3/lib/python3.7/_collections_abc.py", line 660, in get
    return self[key]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/anaconda3/lib/python3.7/site-packages/h5py/_hl/attrs.py", line 81, in __getitem__
    attr.read(arr, mtype=htype)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 355, in h5py.h5a.AttrID.read
  File "h5py/_proxy.pyx", line 36, in h5py._proxy.attr_rw
OSError: Unable to read attribute (no appropriate function for conversion path)
```

titusjan commented 5 years ago

Technical explanation: the problem seems to be that PyTables (which pandas uses to create the HDF5 files) stores some string attributes as fixed-length UTF-8 strings. These cannot be read by the h5py library that Argos uses to read HDF5 files.

The best solution would be for the issue to be fixed in h5py. Unfortunately the bug is more than two years old, so I don't see it being fixed soon. See https://github.com/h5py/h5py/issues/585

I've made a workaround so that Argos at least opens the file, but some attributes cannot be read. See the screenshot below.

screen shot 2018-12-28 at 01 40 49

The workaround is in the development branch. You can try that if you like.
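
For readers curious what a workaround of this kind can look like, here is a minimal sketch. It is not the actual change in the development branch; the helper names `safe_get_attr` and `is_dimension_scale` and the `UNREADABLE` placeholder are hypothetical. The idea is simply to catch the `OSError` seen in the traceback when an attribute is read, so that one unreadable attribute does not stop the item from being shown.

```python
# Hypothetical placeholder shown instead of an attribute value that could not be read.
UNREADABLE = "<unreadable>"


def safe_get_attr(attrs, key, default=None):
    """Return attrs[key]; `default` if the key is missing; UNREADABLE if reading fails.

    `attrs` is any mapping-like object, for example the `.attrs` of an h5py dataset.
    """
    try:
        return attrs[key]
    except KeyError:
        # The attribute does not exist.
        return default
    except OSError:
        # The attribute exists but could not be converted when read,
        # which is the failure reported in the traceback of this issue.
        return UNREADABLE


def is_dimension_scale(attrs):
    """Mirror the 'CLASS' check from the traceback without letting OSError escape."""
    return safe_get_attr(attrs, 'CLASS') == b'DIMENSION_SCALE'
```

With a helper like this, an unreadable `CLASS` attribute is treated as "not a dimension scale", and a viewer could display the placeholder in place of the value.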

joneugster commented 5 years ago

Wonderful, thanks!

To be honest, I assume that most users don't need those attributes anyway; it is usually more important to see the values of the data.

Now I only have an import error coming from not finding pgcolorbar, but I can download that from your other repo.