openai / supervised-reptile

Code for the paper "On First-Order Meta-Learning Algorithms"
https://arxiv.org/abs/1803.02999
MIT License

Difference between shots and train_shots #7

nattari closed this issue 6 years ago

nattari commented 6 years ago

I believe num_shots in the code is the number of examples for each class. In the train function it is initialized as "num_shots = train_shots or num_shots".

Now, for the given 1-shot 5-way example: python -u run_omniglot.py --shots 1 --inner-batch 25 --inner-iters 3 --meta-step 1 --meta-batch 10 --meta-iters 100000 --eval-batch 25 --eval-iters 5 --learning-rate 0.001 --meta-step-final 0 --train-shots 15 --checkpoint ckpt_o15t --transductive

In the above case, train-shots = 15 and shots = 1, so num_shots in the code would be 15, but I would expect it to be 1, since there is one example per class (1-shot, 5-way). Maybe I am missing something; can you please clarify?

nattari commented 6 years ago

For the Omniglot data, is there an initializer (initial weights) that you use to start training? From the code, it looks like random initialization. Is that right?

unixpickle commented 6 years ago

In response to your first question: shots is the number of examples per class during evaluation. However, we found that we got better performance on 1-shot and 5-shot classification if we trained the model using more examples per class (e.g. 15). Train shots is thus the number of examples per class used at training time.
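The fallback described above can be sketched roughly as follows. This is only an illustration: the helper name resolve_shots and its return shape are hypothetical and not part of the repository; the only piece taken from the code is the "train_shots or num_shots" expression quoted in the question.

```python
def resolve_shots(shots, train_shots=None):
    """Return (examples per class for training, examples per class for evaluation).

    Hypothetical helper for illustration only; it mirrors the
    `num_shots = train_shots or num_shots` fallback quoted in the question.
    """
    # Training uses --train-shots when it is given, otherwise it falls back to --shots.
    train_num_shots = train_shots or shots
    # Evaluation always uses --shots.
    eval_num_shots = shots
    return train_num_shots, eval_num_shots


# Flags from the command in the question: --shots 1 --train-shots 15
print(resolve_shots(shots=1, train_shots=15))  # (15, 1)

# No --train-shots given: training and evaluation use the same value.
print(resolve_shots(shots=5))  # (5, 5)
```

So with --shots 1 --train-shots 15, the value 15 applies only to training, and evaluation is still 1-shot.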

Regarding your second question: we use TensorFlow's default random initialization.

nattari commented 6 years ago

Thanks for the response.