Description
We may save the training state in the output directory. When a training script is executed, it may check the specified output directory for a prior training run and resume it automatically if the hyperparameters are compatible. If they are incompatible, an error should be raised.
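A minimal sketch of how this could work, not a proposed final design. The file name `training_state.json` and the helpers `save_training_state` and `check_resume` are hypothetical, and "compatible" is assumed here to mean that the hyperparameters are exactly equal; a real implementation might allow some values to differ.

```python
import json
from pathlib import Path

# Hypothetical file name, chosen for illustration only.
STATE_FILE = "training_state.json"


def save_training_state(output_dir, hyperparams, step):
    """Write the hyperparameters and the current step to the output directory."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    state = {"hyperparams": hyperparams, "step": step}
    # Write to a temporary file first, then rename it, so an interrupted
    # save does not leave a half-written state file behind.
    tmp = out / (STATE_FILE + ".tmp")
    tmp.write_text(json.dumps(state, sort_keys=True))
    tmp.replace(out / STATE_FILE)


def check_resume(output_dir, hyperparams):
    """Return the step to resume from, or 0 if no prior run is found.

    Raises ValueError if a prior run exists with different hyperparameters.
    Assumption: compatibility means exact equality of the two dicts.
    """
    path = Path(output_dir) / STATE_FILE
    if not path.exists():
        return 0
    state = json.loads(path.read_text())
    if state["hyperparams"] != hyperparams:
        raise ValueError(
            f"Output directory {output_dir!r} contains a training run with "
            "incompatible hyperparameters."
        )
    return state["step"]
```

A training script would call `check_resume` once at startup and `save_training_state` at each checkpoint. Because the state is stored as JSON, hyperparameter values should be JSON-serializable types (for example, a tuple would be read back as a list and fail the equality check).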
References
Sockeye implementation:
https://github.com/awslabs/sockeye/blob/29795b828593ca68cfe923d611b67e079bc0dca9/sockeye/train.py#L138-L152