Blubbaa closed 3 years ago
savemat is also broken with h5py 3.x. The following code stopped working:
hdf5storage.savemat(file, data, format='7.3', oned_as='row', store_python_metadata=True, matlab_compatible=True)
Reverting to h5py 2.10.0 lets it work, with the following warning:
\lib\site-packages\hdf5storage\__init__.py:1234: H5pyDeprecationWarning: The default file mode will change to 'r' (read-only) in h5py 3.0. To suppress this warning, pass the mode you need to h5py.File(), or set the global default h5.get_config().default_file_mode, or set the environment variable H5PY_DEFAULT_READONLY=1. Available modes are: 'r', 'r+', 'w', 'w-'/'x', 'a'. See the docs for details.
  f = h5py.File(filename)
Sorry I have taken so long to get around to this.
The problem appears to be either a backwards-incompatible change in h5py or a bug. Specifically, it comes up when reading the 'MATLAB_fields' Attribute, which has a quite unusual type. The Attribute can be written, but it can no longer be read in any way, except probably through h5py's low-level API, which is no longer documented.
The bug shows up if one does the following to make an Attribute with the same type:
>>> import numpy, h5py
>>> dt = h5py.vlen_dtype(numpy.dtype('S1'))
>>> a = numpy.empty((1, ), dtype=dt)
>>> a[0] = numpy.array([b'a', b'b'], dtype='S1')
>>> f = h5py.File('data.h5', mode='a')
>>> f.attrs.create('test', a)
>>> f.attrs['test']
The output from h5dump data.h5 is
HDF5 "data.h5" {
GROUP "/" {
   ATTRIBUTE "test" {
      DATATYPE H5T_VLEN { H5T_STRING {
         STRSIZE 1;
         STRPAD H5T_STR_NULLPAD;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }}
      DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
      DATA {
      (0): ("a", "b")
      }
   }
}
}
I am going to bring this up with h5py and see what can be done about it, including whether there is a good workaround using the low-level API (the more raw libhdf5 bindings).
Workarounds were added in commit 3008efs for the main branch and commit a63128b for the 0.1.x branch. The package should now work with h5py 3.0 and 3.1. I will be uploading version 0.1.16 to PyPI shortly.
Fixed for 32-bit little-endian systems in commit 9f021ee for the 0.1.x branch and commit c8a306e for the main branch. I still don't know if it works on big-endian systems.
There was a bug in the commits fixing the issues on 32-bit systems. Recent commits fix that.
Just released version 0.1.17 on PyPI which includes the fix.
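For anyone unsure whether their environment is affected, here is a minimal sketch of a version check. The helper names (parse_version, needs_upgrade, installed_version) are made up for illustration and are not part of hdf5storage or h5py. The thresholds are only what this thread states (h5py 3.x breaks older releases, and 0.1.17 includes the fix on the 0.1.x line), so treat them as assumptions rather than an official compatibility table.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.1.17' into (0, 1, 17); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def needs_upgrade(h5py_version, hdf5storage_version):
    """Assumption from this thread: h5py >= 3 needs hdf5storage >= 0.1.17."""
    return (parse_version(h5py_version) >= (3,)
            and parse_version(hdf5storage_version) < (0, 1, 17))


def installed_version(name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    h5 = installed_version("h5py")
    hs = installed_version("hdf5storage")
    if h5 is None or hs is None:
        print("h5py and/or hdf5storage is not installed")
    elif needs_upgrade(h5, hs):
        print(f"hdf5storage {hs} predates the fix for h5py {h5}; consider upgrading")
    else:
        print(f"hdf5storage {hs} with h5py {h5} looks fine according to this thread")
```

This only compares version numbers; it does not test whether reading 'MATLAB_fields' actually works, and it says nothing about the main branch or big-endian systems.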
I have recently upgraded to h5py 3.0.0, as I need some of the new features. As #101 also pointed out, hdf5storage is currently broken when using 3.0.0. However, for me, using the master branch with v2.0 does not fix it. I am adding an example here, as I frequently load v7.3 .mat files from Matlab.
The following code produces a ValueError, which is actually hidden if you supply a list of variable_names. After some debugging and reading the change list for 3.0, I still don't really understand exactly what's going wrong there. It seems to read an attribute named 'MATLAB_fields' from the file; that's where it fails.
Example
Output