I want to store a checkpoint after every epoch and start training the next epoch from that stored checkpoint.
I want the training state to continue seamlessly across epochs. For example, when training epoch 2 from the checkpoint of epoch 1, the learning rate schedule, epoch number, and so on should be the same as if I had trained epochs 1 and 2 together in a single run (the vanilla training process).
My implementation uses the `--recover` argument. AllenNLP stores a checkpoint after every epoch, so for every epoch after the first I add `--recover` to the training command, hoping that the model's parameters and training state will be restored. However, this seems wrong: in my testing, training epoch 2 from the checkpoint of epoch 1 gives different results from training epochs 1 and 2 together.
I tried hard to read the AllenNLP documentation but found it difficult to figure out the problem. Does anyone have comments on my implementation, or another way to fulfill these requirements? Thanks a lot!
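For reference, the workflow I'm describing looks roughly like this (the config file name and serialization directory here are just placeholders):

```shell
# First run: train from scratch; checkpoints go into the serialization dir
allennlp train my_config.jsonnet -s ./output

# Subsequent runs: resume from the latest checkpoint in the same dir
allennlp train my_config.jsonnet -s ./output --recover
```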
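One possible culprit I've been wondering about (this is an assumption, not something I've confirmed in AllenNLP's source): if the random-number-generator state used for data shuffling and dropout is not saved in the checkpoint, the resumed run will draw a different random sequence than the continuous run, even when model and optimizer state are restored perfectly. A minimal pure-Python sketch of the effect, with a hypothetical `train_epochs` standing in for the trainer:

```python
import random

def train_epochs(n_epochs, rng_state=None):
    # Hypothetical "training": each epoch consumes one random draw,
    # standing in for data shuffling / dropout randomness.
    rng = random.Random()
    if rng_state is not None:
        rng.setstate(rng_state)  # resume the RNG where it left off
    else:
        rng.seed(0)              # fresh run: seed from scratch
    losses = [rng.random() for _ in range(n_epochs)]
    return losses, rng.getstate()

# Vanilla: two epochs in a single run.
vanilla, _ = train_epochs(2)

# Epoch 1, then epoch 2 resumed WITHOUT restoring RNG state:
# the second run reseeds from scratch and repeats epoch 1's draws.
e1, state = train_epochs(1)
e2_no_restore, _ = train_epochs(1)

# Epoch 2 resumed WITH the saved RNG state matches the vanilla run.
e2_restored, _ = train_epochs(1, rng_state=state)

assert e1 + e2_restored == vanilla
assert e1 + e2_no_restore != vanilla
```

If this is the cause, the two runs would agree only when the checkpoint also captures and restores every source of randomness the trainer touches.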