svdhoog / FLAViz

FLAViz: Flexible Large-scale Agent Visualization Library
GNU General Public License v3.0

HDF5 pytables encoding #15

Closed svdhoog closed 3 years ago

svdhoog commented 6 years ago

1. Make sure that HDF5 files written with pytables use the same encoding for writing and for reading (does the current code use utf-8 or unicode by default?)

See: https://github.com/pandas-dev/pandas/issues/11126

2. Compatibility between HDF5 and xarray. See: http://xarray.pydata.org/en/stable/io.html

"Chunk based compression: zlib, complevel, fletcher32, contiguous and chunksizes can be used for enabling netCDF4/HDF5’s chunk based compression"

Current settings in the script merge_hdf_agentwise:

            pd.set_option('io.hdf.default_format', 'table')
...
            # store_out = pd.HDFStore(outFileName, 'w')  # store without compression
            store_out = pd.HDFStore(outFileName, 'w', chunksize=500, complevel=1, complib='bzip2', fletcher32=True)  # store with compression
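
One way to check point 1 is a small round trip that writes a table with an explicit encoding and reads it back. This is only a sketch, not the code from merge_hdf_agentwise: the frame contents, the temporary file and the key name are made up for illustration, and it assumes pandas with PyTables (built with bzip2 support) is installed. chunksize is left out because pandas documents it as an argument of HDFStore.append and HDFStore.select rather than of the constructor.

```python
import os
import tempfile

import pandas as pd  # HDF5 support additionally needs the PyTables package ("tables")

pd.set_option("io.hdf.default_format", "table")

# Hypothetical sample data with one non-ASCII string, to exercise the encoding.
frame = pd.DataFrame({"region": ["\u00c8-1", "B-2"], "value": [1.0, 2.0]})
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write with the same compression settings as in the issue and an explicit encoding.
with pd.HDFStore(path, mode="w", complevel=1, complib="bzip2", fletcher32=True) as store:
    store.put("Eurostat", frame, format="table", encoding="utf-8")

# Read back; pandas stores the encoding with the table and uses it when decoding.
with pd.HDFStore(path, mode="r") as store:
    restored = store["Eurostat"]

print(restored)
```

If the strings survive this round trip on the machine that produces the files and on the machine that plots from them, the encoding used by pandas itself is probably not the cause, and the problem is more likely in the file attributes that pytables reads when opening the file.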

Error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: unexpected end of data
store = pd.HDFStore(fname_tostore, 'w', chunksize = 500, complevel = 1, complib ='bzip2', fletcher32 = True)
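
The message itself can be reproduced without HDF5. Byte 0xc8 is the lead byte of a two-byte UTF-8 sequence, so a value that consists of that byte alone (or that was written in a single-byte encoding such as latin-1) cannot be decoded as utf-8. The sketch below uses only the standard library; the helper name try_decode is made up for illustration, and it does not show what is actually stored in the failing file.

```python
def try_decode(raw: bytes, encoding: str = "utf-8"):
    """Return (text, None) on success or (None, reason) if decoding fails."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc.reason


lone_lead_byte = b"\xc8"

# Decoding as utf-8 fails: the continuation byte is missing.
text, reason = try_decode(lone_lead_byte)
print(reason)  # unexpected end of data

# The same byte is a valid character in latin-1.
latin_text, _ = try_decode(lone_lead_byte, "latin-1")

# Encoding and decoding with the same codec round-trips cleanly.
roundtrip, _ = try_decode(latin_text.encode("utf-8"))
```

This suggests two things to check in the file: whether the attribute was written with a different encoding than the one used for reading, and whether it was truncated.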

Full details:

Traceback (most recent call last):
  File "/home/svdhoog/files-server/GIT/GitHub/FLAViz@0xfabi/ETACE/src/visualisation_scripts/main.py", line 141, in <module>
    agent_storelist[key] = pd.io.pytables.HDFStore(f_p)
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/pytables.py", line 464, in __init__
    self.open(mode=mode, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/pytables.py", line 628, in open
    raise e
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/pytables.py", line 603, in open
    self._handle = tables.open_file(self._path, self._mode, **kwargs)
  File "/usr/lib/python3/dist-packages/tables/file.py", line 318, in open_file
    return File(filename, mode, title, root_uep, filters, **kwargs)
  File "/usr/lib/python3/dist-packages/tables/file.py", line 826, in __init__
    root._g_post_init_hook()
  File "/usr/lib/python3/dist-packages/tables/group.py", line 265, in _g_post_init_hook
    if 'VERSION' in self._v_attrs._v_attrnamessys:
  File "/usr/lib/python3/dist-packages/tables/utils.py", line 244, in newfget
    mydict[name] = value = fget(self)
  File "/usr/lib/python3/dist-packages/tables/node.py", line 174, in _v_attrs
    return self._AttributeSet(self)
  File "/usr/lib/python3/dist-packages/tables/attributeset.py", line 243, in __init__
    self.__getattr__(attr)
  File "/usr/lib/python3/dist-packages/tables/attributeset.py", line 295, in __getattr__
    value = self._g_getattr(self._v_node, name)
  File "hdf5extension.pyx", line 755, in tables.hdf5extension.AttributeSet._g_getattr (tables/hdf5extension.c:7234)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: unexpected end of data
Closing remaining open files:/home/svdhoog/backup/Big_Archive_of_Everything/Estimation_and_Calibration/Data/calibration-mode-3/h5_agentwise/Eurostat.h5...done

svdhoog commented 6 years ago

To do: create a new full h5 file with merge_hdf_agentwise and check whether plotting from this 21 GB file now works.

svdhoog commented 3 years ago