openai / supervised-reptile

Code for the paper "On First-Order Meta-Learning Algorithms"
https://arxiv.org/abs/1803.02999
MIT License

About miniimagenet model #6

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hi, I noticed that the model used for Mini-ImageNet is slightly different from the model in MAML. In MAML, each conv layer is followed by a batchnorm layer and then a maxpool layer, whereas in this code each conv layer is followed by a maxpool layer and then a batchnorm layer. Is there any reason why you swapped the order of batchnorm and maxpool?
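To make the difference concrete, here is a minimal NumPy sketch (not the repo's actual TensorFlow code) of the two orderings. The shapes, pooling size, and channel count are illustrative assumptions; the point is that the two orderings normalize different statistics, since pooling changes the activation distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def max_pool_2x2(x):
    # x: (N, H, W, C); non-overlapping 2x2 max pooling
    n, h, w, c = x.shape
    return x.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

def batch_norm(x, eps=1e-5):
    # Normalize per channel over the batch and spatial dims
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# Stand-in for a conv layer's output (hypothetical shape)
x = rng.normal(size=(8, 12, 12, 32))

# MAML ordering: batchnorm, then maxpool
maml_out = max_pool_2x2(batch_norm(x))
# Ordering described in this repo: maxpool, then batchnorm
repo_out = batch_norm(max_pool_2x2(x))

# Same output shape either way, but different statistics: normalizing
# after pooling yields zero-mean activations by construction, whereas
# pooling after normalization shifts the mean upward (max of zero-mean
# values is positive on average).
print(maml_out.shape, repo_out.shape)
print(round(float(repo_out.mean()), 6), round(float(maml_out.mean()), 3))
```

So the networks are not equivalent even though every layer is the same; the batchnorm layer sees pre-pool activations in one case and post-pool activations in the other.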

Also, does the Mini-ImageNet model in Reptile tend to overfit when the number of kernels in each conv layer is larger than 32?

unixpickle commented 6 years ago

Good find! I'm going to re-run the experiments now with the corrected model. Our Mini-ImageNet results were actually slightly better than MAML's, and now I suspect it's because of the architecture change.

I haven't tried different models for Mini-ImageNet; I just copied (or thought I copied) the MAML model. It would be fairly easy to test, though.

ghost commented 6 years ago

OK, thanks for the reply.