This PR adds state_dict()/load_state_dict() methods to allow saving and later restoring the state of an EMA object. (This is useful, for example, when restarting training: especially at high decay, maintaining the shadow weights through a restart is important for avoiding artifacts in the validation loss as well as for ending up with the best final model.)

Based somewhat on state_dict()/load_state_dict() for torch.optim.Optimizer: https://pytorch.org/docs/stable/_modules/torch/optim/optimizer.html#Optimizer
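
For reviewers, here is a rough, dependency-free sketch of the save/restore round trip described above. It is not the code in this PR: the toy `EMA` class uses plain Python floats in place of tensors, and the attribute and key names (`decay`, `num_updates`, `shadow_params`) are assumptions chosen for illustration.

```python
import copy


class EMA:
    """Toy exponential moving average over a list of float parameters.

    Hypothetical stand-in for the real EMA object; plain floats replace tensors.
    """

    def __init__(self, params, decay):
        if not 0.0 <= decay <= 1.0:
            raise ValueError("decay must be between 0 and 1")
        self.decay = decay
        self.num_updates = 0
        # Shadow weights start as a copy of the current parameters.
        self.shadow_params = list(params)

    def update(self, params):
        """Move each shadow weight toward the current parameter value."""
        self.num_updates += 1
        d = self.decay
        self.shadow_params = [
            d * s + (1.0 - d) * p for s, p in zip(self.shadow_params, params)
        ]

    def state_dict(self):
        """Return a copy of everything needed to resume averaging."""
        return {
            "decay": self.decay,
            "num_updates": self.num_updates,
            "shadow_params": copy.deepcopy(self.shadow_params),
        }

    def load_state_dict(self, state_dict):
        """Restore state previously produced by state_dict()."""
        state = copy.deepcopy(state_dict)
        self.decay = state["decay"]
        self.num_updates = state["num_updates"]
        self.shadow_params = state["shadow_params"]


# Simulate a restart: save the state, build a fresh object, restore into it.
ema = EMA([0.0, 0.0], decay=0.5)
ema.update([1.0, 2.0])
saved = ema.state_dict()

resumed = EMA([0.0, 0.0], decay=0.5)
resumed.load_state_dict(saved)

# Both objects should now produce identical shadow weights on further updates.
ema.update([1.0, 2.0])
resumed.update([1.0, 2.0])
```

Returning copies from `state_dict()` and copying again in `load_state_dict()` is one possible design choice; it keeps a saved checkpoint from changing when the live object is updated afterwards.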