Open relaxation82 opened 4 years ago
Hi @relaxation82, thanks so much for reporting this! Yes, this is an edge case we had not considered. Will post a fix soon!
Hi,
I dove a bit deeper into the issue, and I think it is currently not possible to get initial_epoch into a callback (the TensorFlow source suggests initial_epoch is used mainly to resume from previous training state). So this can't be fixed without either an update on the TF side to give callbacks access to initial_epoch (for example, model.fit could populate self.params["initial_epoch"] for callbacks), or asking the user to provide that parameter again in the callback. I think the first approach (ask TF to update on their side) is the correct one, so I will open an issue on the TF side to see what their opinion is. Thank you so much!
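The second workaround described above can be sketched without any TensorFlow dependency. This is a hypothetical, minimal callback-style class (the name `EpochProgress` and its methods are illustrative, not part of tensorflow-addons); it mirrors the Keras callback contract where `set_params` receives `{"epochs": ...}` but, since `initial_epoch` is not in that dict, the user must pass it in explicitly:

```python
# Hypothetical sketch: a progress-tracking callback that takes
# initial_epoch from the user, because model.fit does not expose it
# through the callback's params dict.

class EpochProgress:
    """Tracks overall progress of a (possibly resumed) training run.

    `initial_epoch` must be supplied by the user, mirroring the
    workaround discussed above.
    """

    def __init__(self, initial_epoch=0):
        self.initial_epoch = initial_epoch
        self.params = {}

    def set_params(self, params):
        # Keras calls this with e.g. {"epochs": total_epochs, ...};
        # note there is no "initial_epoch" key here.
        self.params = params

    def on_epoch_end(self, epoch, logs=None):
        total = self.params.get("epochs", 1)
        # `epoch` is the absolute epoch index, so on a resumed run the
        # work actually done is (epoch + 1 - initial_epoch) out of
        # (total - initial_epoch). Without initial_epoch, a bar sized
        # to `total` steps would never fill.
        done = epoch + 1 - self.initial_epoch
        span = total - self.initial_epoch
        return done / span  # fraction of the remaining run completed
```

With `initial_epoch=5` and `epochs=10`, the final call `on_epoch_end(9)` returns 1.0, i.e. the bar correctly reaches 100% even though only five epochs ran.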
I have raised an issue on TensorFlow's side, hopefully we will hear back soon :)
Describe the bug
The overall progress bar (covering all epochs of training) never reaches 100% when training is restarted, i.e. when the initial_epoch parameter is nonzero.
Code to reproduce the issue
Essentially
model.fit(x=X, y=y, class_weight=None, batch_size=batchSize, verbose=0, callbacks=[tfa.callbacks.TQDMProgressBar()], validation_split=0.2, shuffle=True, epochs=epochCount, initial_epoch=initialEpoch)
where initialEpoch > 0
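The arithmetic behind the bug can be illustrated without running TensorFlow at all. This is a hedged sketch (the helper `final_bar_fraction` is hypothetical, for illustration only): if the overall bar is sized to `epochs` total steps but a resumed run only executes `epochs - initial_epoch` epochs, the bar stalls short of 100%:

```python
# Minimal illustration of the bug: the overall bar is sized to `epochs`
# steps, but a run resumed at `initial_epoch` only performs
# (epochs - initial_epoch) epochs, so the bar never fills.

def final_bar_fraction(epochs, initial_epoch):
    updates = epochs - initial_epoch  # epochs actually executed
    return updates / epochs           # fraction the bar ends at

# e.g. resuming at epoch 20 of a 100-epoch run leaves the bar at 80%.
```

For `initialEpoch = 0` the fraction is 1.0 and the bar completes normally, which is why the bug only appears on restarted runs.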
Other info / logs
The issue should be rather clear with the provided info