openai / supervised-reptile

Code for the paper "On First-Order Meta-Learning Algorithms"
https://arxiv.org/abs/1803.02999
MIT License

Training hyperparameters #26

Open lhfowl opened 4 years ago

lhfowl commented 4 years ago

Hello,

I'm hoping to confirm that the hyperparameters specified in your paper are correct. Specifically, for Mini-ImageNet, were 100k meta-steps taken during training? I ask because some of the default values in the code appear to be different.

unixpickle commented 4 years ago

See the README: https://github.com/openai/supervised-reptile/blob/master/README.md#reproducing-training-runs, which describes how to run the experiments with the correct arguments.

siavash-khodadadeh commented 4 years ago

In the README, --train-shots is 10; however, it seems this should be 5-way 1-shot. Am I missing something?

transductive 1-shot 5-way Omniglot.

python -u run_omniglot.py --shots 1 --inner-batch 10 --inner-iters 5 --meta-step 1 --meta-batch 5 --meta-iters 100000 --eval-batch 5 --eval-iters 50 --learning-rate 0.001 --meta-step-final 0 --train-shots 10 --checkpoint ckpt_o15t --transductive
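
For what it's worth, here is a minimal toy sketch (NumPy, not the repo's code) of how those two flags plausibly relate: meta-training tasks are built with train_shots examples per class, while evaluation fine-tunes on only shots examples per class. The synthetic data, model, and function names below are illustrative assumptions.

# Illustrative sketch only (not the repo's code): where --train-shots vs --shots
# plausibly enter. Meta-training tasks carry `train_shots` examples per class;
# evaluation tasks carry only `shots` examples per class for fine-tuning.
import numpy as np

rng = np.random.default_rng(0)

def sample_task(num_classes, shots, dim=16):
    """Toy N-way task: one Gaussian cluster per class, `shots` examples each."""
    means = rng.normal(size=(num_classes, dim))
    xs = np.concatenate([m + 0.1 * rng.normal(size=(shots, dim)) for m in means])
    ys = np.repeat(np.arange(num_classes), shots)
    return xs, ys

def inner_sgd(w, xs, ys, inner_iters, lr=0.1):
    """A few steps of softmax-regression SGD on one task (toy inner loop)."""
    w = w.copy()
    for _ in range(inner_iters):
        logits = xs @ w
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(ys)), ys] -= 1.0          # d(cross-entropy)/d(logits)
        w -= lr * xs.T @ p / len(ys)
    return w

def reptile_train(meta_iters, meta_step, num_classes=5, train_shots=10, dim=16):
    """Outer loop: meta-training uses train_shots, not the evaluation shots."""
    w = np.zeros((dim, num_classes))
    for i in range(meta_iters):
        step = meta_step * (1 - i / meta_iters)   # decay toward --meta-step-final 0
        xs, ys = sample_task(num_classes, train_shots, dim)
        adapted = inner_sgd(w, xs, ys, inner_iters=5)
        w += step * (adapted - w)                 # Reptile meta-update
    return w

def evaluate_one_shot(w, num_classes=5, shots=1, dim=16):
    """Evaluation fine-tunes on only `shots` examples per class."""
    xs, ys = sample_task(num_classes, shots, dim)
    return inner_sgd(w, xs, ys, inner_iters=50)

w = reptile_train(meta_iters=1000, meta_step=1.0)
adapted = evaluate_one_shot(w)   # 5-way 1-shot adaptation from the meta-learned init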

siavash-khodadadeh commented 4 years ago

Does this mean that during meta-training we train with 5-way 10-shot tasks, but at test time we evaluate 5-way 1-shot?

MrDavidG commented 4 years ago

@siavash-khodadadeh I have the same question. I also notice that the paper says:

If we are doing K-shot, N-way classification, then we sample tasks by selecting N classes from C and then selecting K + 1 examples for each class. We split these examples into a training set and a test set, where the test set contains a single example for each class. 
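
For reference, a minimal sketch of the quoted sampling procedure (illustrative only, not the repo's code; the dataset layout and function name are assumptions): pick N classes, draw K + 1 examples per class, and hold out exactly one example per class as the test set.

# Illustrative sketch of the quoted sampling procedure (not the repo's code).
# `dataset` is assumed to map each class label to a list of examples.
import random

def sample_k_shot_task(dataset, num_classes, num_shots, seed=None):
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), num_classes)
    train_set, test_set = [], []
    for label in classes:
        examples = rng.sample(dataset[label], num_shots + 1)   # K + 1 per class
        train_set += [(x, label) for x in examples[:num_shots]]
        test_set.append((examples[num_shots], label))          # one test example per class
    return train_set, test_set

With num_classes=5 and num_shots=1 this yields 5-way 1-shot tasks with a single query example per class, which is the setting the quote describes.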

This is a different evaluation setting from MAML: MAML uses k_qry=15 (that is, 15 query examples per class) to evaluate itself. Doesn't that make the comparison in the experiments unfair?