Open neworderofjamie opened 1 year ago
If you resume training on a freshly-loaded model using an optimiser with internal state, such as Adam, it's unlikely to work very well because that state isn't checkpointed alongside the model.
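To illustrate the problem, here is a minimal sketch in pure Python, not this project's API. The `Adam` class, its `state_dict`/`load_state_dict` methods, the toy loss and the JSON checkpoint layout are all hypothetical names made up for the example. It assumes the optimiser state consists of the first-moment estimates `m`, the second-moment estimates `v` and the step counter `t`, which is the state standard Adam keeps.

```python
import json
import math


class Adam:
    """Tiny list-of-floats Adam, for illustration only (hypothetical, not a library API)."""

    def __init__(self, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # first-moment estimates
        self.v = None  # second-moment estimates
        self.t = 0     # step counter used for bias correction

    def step(self, params, grads):
        if self.m is None:
            self.m = [0.0] * len(params)
            self.v = [0.0] * len(params)
        self.t += 1
        new_params = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            new_params.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return new_params

    def state_dict(self):
        # Everything needed to continue as if training had never stopped
        return {"m": self.m, "v": self.v, "t": self.t}

    def load_state_dict(self, state):
        self.m, self.v, self.t = state["m"], state["v"], state["t"]


def grad(params):
    # Gradient of the toy loss sum(p ** 2)
    return [2.0 * p for p in params]


def train(params, opt, steps):
    for _ in range(steps):
        params = opt.step(params, grad(params))
    return params


start = [1.0, -2.0]

# Reference: 20 uninterrupted steps
uninterrupted = train(start, Adam(), 20)

# 10 steps, then checkpoint both the weights and the optimiser state
opt = Adam()
halfway = train(start, opt, 10)
checkpoint = json.dumps({"weights": halfway, "optimiser": opt.state_dict()})

# Resume with the optimiser state restored
loaded = json.loads(checkpoint)
restored_opt = Adam()
restored_opt.load_state_dict(loaded["optimiser"])
resumed = train(loaded["weights"], restored_opt, 10)

# Resume with the weights only, so m, v and t start again from zero
weights_only = train(loaded["weights"], Adam(), 10)

print("resumed matches uninterrupted:     ", resumed == uninterrupted)
print("weights-only matches uninterrupted:", weights_only == uninterrupted)
```

In this sketch, restoring `m`, `v` and `t` reproduces the uninterrupted run, whereas the weights-only resume takes a different trajectory because the moment estimates and bias correction restart from scratch. A real fix would need to save whatever state variables the optimiser implementation actually holds.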