Closed BenPoutine closed 5 years ago
You need to save the weights of the model as an HDF5 file. Try doing model.save_weights('my_weight.h5') instead.
@ymodak model.save_weights('my_weight.h5'), model.save_weights('my_model.h5', save_format='h5'), and model.save_weights('my_weight') (which should save in the TensorFlow checkpoint file format) do not work when MirroredStrategy is used. All of them work if MirroredStrategy is not used.
sorry about that
You need to save inside distribution.scope(), so this should work:
    with distribution.scope():
        model.save_weights('my_weight.h5')
Also, if you're then trying to load weights under the MirroredStrategy, I think it will only load onto the first tower (although maybe this is fixed?). Anyway, you can look here for an example of how to do it.
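For readers on a newer release, the suggestion above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thread's original code: it assumes TensorFlow 2.x, where the strategy is available as tf.distribute.MirroredStrategy; the tiny model, the temporary directory, and the build_model helper are hypothetical; and the file name uses the .weights.h5 suffix because recent Keras versions require it for save_weights.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model():
    # Hypothetical placeholder model; any tf.keras model would do here.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
    return model


# Assumption: TF 2.x API. With no GPUs present this falls back to one CPU replica.
strategy = tf.distribute.MirroredStrategy()

# Assumption: the ".weights.h5" suffix is used so newer Keras versions accept it.
weights_path = os.path.join(tempfile.mkdtemp(), "my_weight.weights.h5")

with strategy.scope():
    # Create the model and save its weights inside the strategy scope.
    model = build_model()
    model.save_weights(weights_path)

with strategy.scope():
    # Load the saved weights into a fresh model built under the same strategy.
    restored = build_model()
    restored.load_weights(weights_path)

# Check that every weight tensor survived the round trip.
for original, loaded in zip(model.get_weights(), restored.get_weights()):
    assert np.allclose(original, loaded)
```

Whether saving must happen inside the scope depends on the TensorFlow version; the sketch simply follows the advice given in this comment.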
Did you get a chance to try this?
I'm at version 1.12.0, and model.save_weights('my_weight.h5') works fine for a training model with MirroredStrategy.
I did run into the callback issue as well. The following seems to work for me:
ModelCheckpoint(filepath='...', save_weights_only=False)
# which is internally doing self.model.save(filepath, overwrite=True)
However, this doesn't work and raises AttributeError: 'NoneType' object has no attribute 'save_weights', like OP's issue:
ModelCheckpoint(filepath='...', save_weights_only=True)
# which is internally doing self.model.save_weights(filepath, overwrite=True)
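As a rough sketch of the workaround reported above (checkpointing the full model rather than weights only), something like the following might be tried. The assumptions are mine, not the commenter's: TensorFlow 2.x with tf.distribute.MirroredStrategy, a hypothetical toy model, random data, and a temporary output path standing in for the elided filepath.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Assumption: TF 2.x API. With no GPUs present this falls back to one CPU replica.
strategy = tf.distribute.MirroredStrategy()

# Hypothetical output location; the thread does not show the original filepath.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "my_model.h5")

with strategy.scope():
    # Hypothetical placeholder model, created and compiled under the strategy.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")

# save_weights_only=False makes the callback save the whole model each epoch.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    save_weights_only=False,
)

# Random data, only so that fit() has something to run on.
x = np.random.rand(16, 4).astype("float32")
y = np.random.rand(16, 1).astype("float32")
model.fit(x, y, epochs=1, batch_size=8, callbacks=[checkpoint], verbose=0)
```

On the versions discussed in this thread, the save_weights_only=True variant is the one reported to fail, so this sketch only covers the full-model path.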
This should be working at master, as we now have unit tests for saving and loading weights: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/distribute/python/keras_test.py#L1158
It might be working with the 1.13 rc as well, but I'm not 100% sure. Please try it out and let us know if it is still broken.
System information
Describe the current behavior
Cannot save a tf.keras model trained with MirroredStrategy, either by calling save_weights or through a tf.keras.callbacks.ModelCheckpoint callback. Saving works if MirroredStrategy is not used.
Code to reproduce the issue
And adding:
Error looks like:
Changing to:
This also does not work and ends up with: