Closed: bnlawrence closed this 4 months ago
Yeah, well, I got this wrong:
The relevant code inside the pyfive h5netcdf branch, as it currently stands, is this:
dataobjs = DataObjects(self.file._fh, link_target)
if dataobjs.is_dataset:
    if additional_obj != '.':
        raise KeyError('%s is a dataset, not a group' % (obj_name))
    return Dataset(obj_name, DatasetDataObject(self.file._fh, link_target), self)
return Group(obj_name, dataobjs, self)[additional_obj]
Which unfortunately means that when faced with an actual Datatype enum, it reports it as a group. That is definitely a bug.
At this point the logic is that everything that is not a dataset is a group, but we know that's not true. This is a portion of the h5dump of the offending content:
DATATYPE "enum_t" H5T_ENUM {
   H5T_STD_U8LE;
   "stratus" 1;
   "cumulus" 2;
   "nimbus" 3;
   "missing" 255;
};
DATASET "enum_var" {
   DATATYPE H5T_ENUM {
      H5T_STD_U8LE;
      "stratus" 1;
      "cumulus" 2;
      "nimbus" 3;
      "missing" 255;
   } ...
";
So we need to do something a wee bit different right at this point: warn the user that we are ignoring the not-implemented type rather than pretend it is a group, and then, if they try to access enum_var, they get a NotImplementedError.
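A minimal sketch of that three-way dispatch, using stand-in classes rather than pyfive's real ones. The names SketchObject, SketchDataset, resolve_link, is_datatype and dtype_class are all hypothetical and exist only for illustration; the real DataObjects API may expose this information differently.

```python
import warnings


class SketchObject:
    """Stand-in for pyfive's DataObjects (hypothetical attributes)."""

    def __init__(self, kind, dtype_class=None):
        self.kind = kind                # 'dataset', 'datatype' or 'group'
        self.dtype_class = dtype_class  # e.g. 'enum' for an enum dataset

    @property
    def is_dataset(self):
        return self.kind == 'dataset'

    @property
    def is_datatype(self):  # assumption: not necessarily a real pyfive property
        return self.kind == 'datatype'


class SketchDataset:
    """Stand-in dataset that refuses to read unsupported types."""

    def __init__(self, name, dataobjs):
        self.name = name
        self._dataobjs = dataobjs

    def read(self):
        if self._dataobjs.dtype_class == 'enum':
            raise NotImplementedError(
                '%s has an enum datatype, which is not supported' % self.name)
        return 'data for %s' % self.name


def resolve_link(obj_name, dataobjs):
    """Dispatch on object kind instead of assuming 'not a dataset' means group."""
    if dataobjs.is_dataset:
        return SketchDataset(obj_name, dataobjs)
    if dataobjs.is_datatype:
        # Warn and skip, rather than pretending the datatype is a group.
        warnings.warn('%s is a committed datatype; ignoring it' % obj_name)
        return None
    return ('group', obj_name)
```

With this shape, opening a file that contains enum_t only produces a warning, and the error is deferred until someone actually reads enum_var.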
OK, so now (https://github.com/bnlawrence/pyfive/commit/99945989d215cd9fbd893d62dd2f0d06939631ab) pyfive can at least read a file with this in it: it raises a warning when it finds the datatype message, and raises a NotImplementedError when one tries to read that particular data variable. That seems to be the right sort of behaviour until such time as we implement enum support (https://github.com/bnlawrence/pyfive/issues/9), which we may not.
Test support for "sensible things" is here: c12b5b3
h5py has a class Datatype. This is part of uber ticket #7!
Unfortunately, when creating NetCDF groups, h5netcdf does its own iteration over an HDF5 group, and looks for instances of Datatype:
In this, self._root.h5py is in fact pyfive after my backend substitution. That means we need to have pyfive support for Datatype and check_enum_dtype, or I need another strategy.
In pyfive, I think the h5py Datatype corresponds to DatatypeMessage. If it's that simple, there are a couple of routes to solving this: we modify pyfive to look more like h5py, or we create a new Datatype class which subclasses DatatypeMessage. I suspect I will go that route ...
The h5py datatype is defined here: https://github.com/h5py/h5py/blob/master/h5py/_hl/datatype.py
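The subclassing route could look roughly like the sketch below. DatatypeMessage here is a stand-in with an invented constructor, not pyfive's real class, and committed_types is a hypothetical helper that only mimics the kind of isinstance-based scan h5netcdf performs over group members.

```python
class DatatypeMessage:
    """Stand-in for pyfive's DatatypeMessage; the real constructor differs."""

    def __init__(self, dtype):
        self.dtype = dtype


class Datatype(DatatypeMessage):
    """Hypothetical h5py-like wrapper that adds a name to the parsed dtype."""

    def __init__(self, name, dtype):
        super().__init__(dtype)
        self.name = name


def committed_types(members):
    """Pick out named datatypes from a mapping of group members."""
    return {name: obj for name, obj in members.items()
            if isinstance(obj, Datatype)}


# Illustrative members only: one committed datatype and one other object.
members = {'enum_t': Datatype('enum_t', 'uint8 enum'), 'enum_var': object()}
found = committed_types(members)
```

Because Datatype is still a DatatypeMessage, existing pyfive code that expects the message class would keep working, while a caller that tests isinstance(obj, Datatype) gets something it can recognise.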