AntreasAntoniou / HowToTrainYourMAMLPytorch

The original code for the paper "How to train your MAML" along with a replication of the original "Model Agnostic Meta Learning" (MAML) paper in Pytorch.
https://arxiv.org/abs/1810.09502

Different performance on loading from saved parameters #9

Closed sholtodouglas closed 5 years ago

sholtodouglas commented 5 years ago

Hi,

Great code, and really informative write-up on the blog! I'm trying to adapt this for robotic imitation learning, and I'm getting great performance on past task demonstrations during training. However, after loading the saved parameters the loss is significantly worse until the model has had some iterations to train, which affects using the saved parameters in new simulations.

[Screenshot: loss after loading the saved parameters]

Do you have any thoughts on this? From what I can tell, it's not the per-step importance vectors, and it's not the task learning rates.

AntreasAntoniou commented 5 years ago

I'd have to actually look at your code repo to help with this. That being said, one of the most frequent culprits is batch normalization. Make sure the per-step summary statistics are loaded properly. If you find out what's wrong, please do let me know so I can rectify the issue in my repo.
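For reference, one quick way to sanity-check this is to diff the batch-norm running statistics between the live model and the saved checkpoint. This is only a sketch: the checkpoint path is a placeholder, and it assumes a plain `state_dict` was saved with the per-step statistics stored under the usual `running_mean` / `running_var` key names, which may not match this repo's exact checkpoint layout.

```python
# Hedged sketch: compare batch-norm running statistics in the live model against
# those in a saved checkpoint. Assumes the checkpoint is a plain state_dict; the
# path is illustrative.
import torch

def compare_bn_stats(model, checkpoint_path="checkpoint.pth"):
    saved_state = torch.load(checkpoint_path, map_location="cpu")
    live_state = model.state_dict()
    for key, live_tensor in live_state.items():
        if "running_mean" in key or "running_var" in key:
            if key not in saved_state:
                print(f"missing in checkpoint: {key}")
            elif not torch.allclose(live_tensor.cpu(), saved_state[key].cpu()):
                print(f"mismatch: {key}")
```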

AntreasAntoniou commented 5 years ago

I forgot to mention that I did test my system explicitly for performance consistency before/after state re-loading, and it passed the test with a perfect score. This leads me to believe that either one of my more recent commits broke that feature, or your framework has a bug somewhere. The former is less likely, because I am actually using a variant of this repo in my current research and my models show consistent behaviour before/after parameter re-loading.
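A minimal version of such a consistency check might look like the sketch below. Note this is not the repo's actual test: `evaluate(model, batch)` is a hypothetical helper standing in for a deterministic forward pass on a fixed batch.

```python
# Sketch of a before/after reload consistency test. `evaluate(model, batch)` is a
# hypothetical helper that runs a deterministic evaluation (model.eval(), fixed
# data, no dropout) and returns a scalar loss; it is not part of this repo.
import copy
import torch

def check_reload_consistency(model, batch, path="tmp_state.pth"):
    loss_before = evaluate(model, batch)

    # Save, then restore into a fresh copy of the same architecture.
    torch.save(model.state_dict(), path)
    reloaded = copy.deepcopy(model)
    reloaded.load_state_dict(torch.load(path, map_location="cpu"))

    loss_after = evaluate(reloaded, batch)
    assert abs(loss_before - loss_after) < 1e-6, (loss_before, loss_after)
```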

marcociccone commented 5 years ago

Hi, I'm playing with the code, and one reason for this behaviour might be that you are not saving the state of the outer-loop optimizer (Adam): when you restart the experiment, all of Adam's running gradient statistics are re-initialized, which disrupts training. This is only an issue when resuming training, not at test time.
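For example, a minimal sketch of saving and restoring both the model and the outer-loop Adam optimizer; the checkpoint layout, file name, and variable names here are illustrative, not the repo's exact API.

```python
# Sketch: checkpoint both the model and the outer-loop optimizer so that Adam's
# running gradient moments survive a restart. Names are assumptions, not the
# repo's actual save/load functions.
import torch

def save_checkpoint(model, optimizer, epoch, path="train_state.pth"):
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

def load_checkpoint(model, optimizer, path="train_state.pth"):
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model_state"])
    optimizer.load_state_dict(state["optimizer_state"])
    return state["epoch"]
```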