Closed tomeichlersmith closed 2 years ago
Using h5dump, I was able to deduce the H5 type that h5py uses for serializing bools:
DATATYPE H5T_ENUM {
    H5T_STD_I8LE;
    "FALSE" 0;
    "TRUE" 1;
}
In C++ land, this is
enum class BOOL : signed char {
    FALSE = 0,
    TRUE = 1
};
I was able to test this by writing out bools-py.h5 with write-bools.py (below) and bools-cpp.h5 with the executable compiled from write-bools.cxx (below), and then reading both of them with read-bools.py (below). Both H5 files, whether written by C++ or Python, were read in seamlessly by h5py and interpreted into Python bools.
Now I just need to figure out how to put this enum type into the type deduction tree that is currently in fire.
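One way the enum could be slotted into a compile-time type deduction is a small trait that maps the user-facing type to the type written to disk. The sketch below is only an illustration: fire's actual deduction code is not shown in this thread, and `storage_type` and `storage_type_t` are hypothetical names.

```cpp
#include <type_traits>

// Same layout as the enum h5py writes for bools (H5T_STD_I8LE base type).
enum class BOOL : signed char {
    FALSE = 0,
    TRUE = 1
};

// Hypothetical trait: by default a type is stored as itself...
template <typename T>
struct storage_type {
    using type = T;
};

// ...but bool is redirected to the h5py-compatible enum.
template <>
struct storage_type<bool> {
    using type = BOOL;
};

template <typename T>
using storage_type_t = typename storage_type<T>::type;

// Compile-time checks that the mapping behaves as intended.
static_assert(std::is_same<storage_type_t<bool>, BOOL>::value,
              "bool should be stored as BOOL");
static_assert(sizeof(BOOL) == 1, "BOOL should occupy a single byte");
```

Any code that picks an H5 datatype from a C++ type could then ask for `storage_type_t<T>` instead of `T`, leaving every type other than `bool` unchanged.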
import h5py
import numpy as np
with h5py.File('bools-py.h5','w') as f :
    dset = f.create_dataset('mybools',(10,),dtype=bool)
    dset[::] = np.full((10),True)
Compile with h5c++ to avoid extra linking parameters. This was done in the hdf5 container, so HighFive is installed in the system path.
#include <iostream>
#include <vector>
#include <highfive/H5File.hpp>
using namespace HighFive;
// mirrors the enum that h5py uses for bools
enum class BOOL : signed char {
    FALSE = 0,
    TRUE = 1
};
// H5 enum type with the same member names h5py writes
EnumType<BOOL> create_enum_bool() {
    return {{"FALSE", BOOL::FALSE},{"TRUE", BOOL::TRUE}};
}
HIGHFIVE_REGISTER_TYPE(BOOL, create_enum_bool)
int main() try {
    File f("bools-cpp.h5", File::ReadWrite | File::Create | File::Truncate);
    std::vector<BOOL> data = {BOOL::TRUE, BOOL::TRUE, BOOL::TRUE};
    std::cout << data.size() << std::endl;
    auto dset = f.createDataSet("mybools", DataSpace(data.size()), create_enum_bool());
    dset.write(data);
    f.flush();
    return 0;
} catch (const Exception& e) {
    std::cerr << " [H5 Error] : " << e.what() << std::endl;
    return 1;
}
import h5py
import sys
with h5py.File(sys.argv[1]) as f :
    print(f['mybools'][...])
Currently, we are just serializing bools into shorts. This is not a very satisfactory solution, especially given the data copying necessary to get around the std::vector<bool> specialization in C++.
The solution is to implement a bool<->enum mapping and serialize the enum. This has already been done by h5py and would mean that opening a boolean dataset in h5py would work 'out of the box'.
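A minimal sketch of what such a bool<->enum mapping might look like, using only the standard library. The helper names `to_enum` and `from_enum` are hypothetical and not taken from fire or HighFive. Note that `std::vector<bool>` is bit-packed, so the vector overloads still copy element by element; what changes is the type that ends up on disk.

```cpp
#include <vector>

// Same layout as the enum h5py writes for bools.
enum class BOOL : signed char {
    FALSE = 0,
    TRUE = 1
};

// Convert a single bool to its enum representation.
inline BOOL to_enum(bool b) {
    return b ? BOOL::TRUE : BOOL::FALSE;
}

// Convert a single enum value back to bool.
inline bool from_enum(BOOL e) {
    return e == BOOL::TRUE;
}

// Convert a (bit-packed) std::vector<bool> into a contiguous buffer of
// enum values that could be handed to an H5 write call.
inline std::vector<BOOL> to_enum(const std::vector<bool>& in) {
    std::vector<BOOL> out;
    out.reserve(in.size());
    for (bool b : in) {
        out.push_back(to_enum(b));
    }
    return out;
}

// Convert a buffer of enum values read from disk back into std::vector<bool>.
inline std::vector<bool> from_enum(const std::vector<BOOL>& in) {
    std::vector<bool> out;
    out.reserve(in.size());
    for (BOOL e : in) {
        out.push_back(from_enum(e));
    }
    return out;
}
```

With helpers like these, a `std::vector<bool>` could be passed through `to_enum` before writing and through `from_enum` after reading, so the user-facing type stays `bool` throughout.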