Model saving issues in JSON file format

eahn7 commented 3 years ago

Hi,

I am trying to save a MEGNet model via the train method (defined in the GraphModel class), as illustrated in the sample usage code. If I understand correctly, the train method uses the custom ModelCheckpointMAE callback, which saves Keras hdf5 files in the callback directory by default.

I can see hdf5 files being generated as training proceeds, but I do not see any json files being created alongside them. It seems I need both the hdf5 and json files to load a trained MEGNet model via the from_file method, as shown in the sample notebook save_and_load_model.ipynb.

I stepped through some of the code, and it seems that ModelCheckpointMAE actually saves the model through the standard tf.keras.Model.save(): https://github.com/materialsvirtuallab/megnet/blob/6d5766d4134c04f9b1c41de0dab6a3c6234df7cb/megnet/callbacks.py#L136

Aren't we supposed to use the save_model method defined in the GraphModel class to save a model in json and hdf5 format? I tried to work around the issue by changing self.model.save() to self.model.save_model(), but got an AttributeError: 'Functional' object has no attribute 'save_model'.

I think tf.keras callbacks can only access the underlying Functional model, not the wrapper classes that encapsulate it, like MEGNet. Can someone help me with saving and loading a MEGNet model during training? Thank you.
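The AttributeError described above can be reproduced with a toy sketch that uses no Keras or MEGNet code at all. The classes InnerModel, Wrapper, and Checkpoint below are hypothetical stand-ins (for the Functional model, a GraphModel-like wrapper, and a checkpoint callback, respectively); the only point is that the callback is handed the inner object, so wrapper-only methods such as save_model are out of its reach.

```python
class InnerModel:
    """Hypothetical stand-in for the Keras Functional model: it only has save()."""

    def __init__(self):
        self.saved = []

    def save(self, path):
        self.saved.append(path)


class Wrapper:
    """Hypothetical stand-in for a GraphModel-like class that owns the inner model."""

    def __init__(self):
        self.model = InnerModel()

    def save_model(self, path):
        # Wrapper-level save: weights plus a configuration file next to them.
        self.model.save(path)
        return [path, path + ".json"]

    def fit(self, callback):
        # Mimics the training loop: the callback receives the INNER model,
        # never the wrapper itself.
        callback.model = self.model
        callback.on_epoch_end()


class Checkpoint:
    """Hypothetical callback that tries to call the wrapper-only method."""

    def __init__(self):
        self.model = None
        self.error = None

    def on_epoch_end(self):
        try:
            self.model.save_model("ckpt.hdf5")
        except AttributeError as exc:
            # Same failure mode as "'Functional' object has no attribute 'save_model'".
            self.error = str(exc)


wrapper = Wrapper()
ckpt = Checkpoint()
wrapper.fit(ckpt)
print(ckpt.error)
```

In this sketch the inner model never sees save_model, which is why swapping self.model.save() for self.model.save_model() inside the callback cannot work without also passing the wrapper to the callback.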

chc273 commented 3 years ago

@eahn7 The checkpoint only saves the weights. You can save the full model by calling model.save_model once you have a final version of the model. The json file stores the configuration and stays the same throughout a given training run. What I usually do is train the model to convergence first, then load the best weights, and finally save that model.
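The workflow described above (train, load the best weights, then save) could be sketched roughly as follows. This is a minimal sketch under several assumptions, not the library's own procedure: the helper names find_best_checkpoint and export_best_model are hypothetical, the checkpoint files are assumed to be named like "val_mae_00012_0.034500.hdf5" (monitor, epoch, metric value), and the model object is assumed to expose load_weights (forwarded to the underlying Keras model) in addition to the save_model method mentioned in this thread.

```python
import re
from pathlib import Path

# Assumed file name pattern: "<monitor>_<epoch>_<value>.hdf5".
# Adjust the regex if your checkpoint names differ.
_CKPT_RE = re.compile(r"_(\d+)_(\d+\.\d+)\.hdf5$")


def find_best_checkpoint(callback_dir, lower_is_better=True):
    """Return the checkpoint path whose file name encodes the best metric value."""
    candidates = []
    for path in sorted(Path(callback_dir).glob("*.hdf5")):
        match = _CKPT_RE.search(path.name)
        if match:
            epoch, value = int(match.group(1)), float(match.group(2))
            candidates.append((value, epoch, path))
    if not candidates:
        raise FileNotFoundError(f"no matching checkpoints in {callback_dir}")
    pick = min if lower_is_better else max
    return pick(candidates, key=lambda item: item[0])[2]


def export_best_model(model, callback_dir, out_file):
    """Load the best checkpoint weights into `model`, then save the full model."""
    best = find_best_checkpoint(callback_dir)
    model.load_weights(str(best))    # assumed to reach the underlying Keras model
    model.save_model(str(out_file))  # wrapper-level save: hdf5 plus json config
    return best


# Hypothetical usage after training has finished (requires megnet):
#   model.train(...)
#   export_best_model(model, "callback", "best_model.hdf5")
#   restored = MEGNetModel.from_file("best_model.hdf5")
```

Parsing the metric out of the file name avoids reloading every checkpoint just to compare them, at the cost of depending on the naming pattern, so it is worth checking the actual names in your callback directory first.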

eahn7 commented 3 years ago

@chc273 I see. Thank you for your quick response.

Model saving issues in JSON file format #259