Closed rogerkuou closed 10 months ago
Hi @SarahAlidoost, this is the data-model linking problem we talked about this morning. Feel free to pick it up when you are available.
I found that a Keras model can be saved and loaded in HDF5 format; `h5py` is one of the dependencies of Keras (see the Keras doc and the TensorFlow doc). This means we can add metadata as attributes of the HDF5 file when saving a Keras model. In the `dnn.py` module we use `self.model.save(path_model)`, and the hyperparameters are saved next to it in separate pickle files. With the HDF5 format, however, both the metadata of the training datasets and the hyperparameters can be saved as attributes in the same file.
See the draft implementation: https://github.com/VegeWaterDynamics/motrainer/pull/113
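To illustrate the idea of keeping everything in one file, here is a minimal sketch using `h5py` directly. It is not the code from the draft PR. The function names (`write_metadata`, `read_metadata`) and the attribute keys (`hyperparameters`, `partition_coords`) are hypothetical, and the sketch assumes that `path_model` points to an HDF5 file already written by `self.model.save(path_model)` and that extra root-level attributes with these names do not clash with the ones Keras writes.

```python
import json

import h5py


def write_metadata(path_model, hyperparameters, coords):
    """Attach metadata to an existing HDF5 model file (hypothetical helper)."""
    # Open in append mode so the content already in the file is left untouched.
    with h5py.File(path_model, "a") as f:
        # HDF5 attributes suit small values; JSON strings keep nested dicts portable.
        f.attrs["hyperparameters"] = json.dumps(hyperparameters)
        f.attrs["partition_coords"] = json.dumps(coords)


def read_metadata(path_model):
    """Read the metadata back as plain Python dicts (hypothetical helper)."""
    with h5py.File(path_model, "r") as f:
        hyperparameters = json.loads(f.attrs["hyperparameters"])
        coords = json.loads(f.attrs["partition_coords"])
    return hyperparameters, coords
```

Usage would be something like calling `write_metadata(path_model, {"learning_rate": 0.01}, {"latitude": 52.0, "longitude": 4.4})` right after the model is saved. Encoding the values as JSON avoids a separate pickle file, at the cost of only supporting JSON-serializable hyperparameters.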
In the daskml example and the dnn example, we showed two cases of ML training on split data (one partition per grid cell). However, it is currently not easy to connect the trained ML models back to the data partitions they were trained on.
For now, the solution can be to save the spatio-temporal coordinates of the partition we used, and store this coordinate information as metadata along with the output model.
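As a rough sketch of what that coordinate metadata could look like, the snippet below summarizes one grid-cell partition as a small JSON-ready dictionary. The function name `partition_metadata` and the key names are hypothetical, and it assumes each partition is identified by a single latitude/longitude pair plus a list of time stamps.

```python
import json
from datetime import datetime


def partition_metadata(lat, lon, times):
    """Summarize the coordinates of one grid-cell partition (hypothetical helper)."""
    times = sorted(times)
    return {
        "latitude": float(lat),
        "longitude": float(lon),
        # ISO 8601 strings are JSON-serializable and easy to parse back
        "time_start": times[0].isoformat(),
        "time_end": times[-1].isoformat(),
        "n_timesteps": len(times),
    }


meta = partition_metadata(
    52.0,
    4.4,
    [datetime(2020, 1, 1), datetime(2020, 1, 3), datetime(2020, 1, 2)],
)
# Serialized form that can be stored next to, or inside, the exported model
encoded = json.dumps(meta)
```

Storing only the time range and the number of time steps keeps the metadata small; if the time axis is irregular, the full list of time stamps may need to be saved instead.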
Todos:
Update the model export part of the two example notebooks (daskml and dnn).
Update the usage page of daskml and dnn with code examples of writing this information.