openai / supervised-reptile

Code for the paper "On First-Order Meta-Learning Algorithms"
https://arxiv.org/abs/1803.02999
MIT License

Bump into some accuracy problem #8

Closed jaegerstar closed 6 years ago

jaegerstar commented 6 years ago

I ran the 5-shot 5-way Mini-ImageNet experiment using the command in the README. When it finished, I got this:

(screenshot of evaluation output)

Are the hyperparameters optimal? Or is there anything else I should pay attention to? Thanks.

unixpickle commented 6 years ago

Did you run the transductive or non-transductive version? The accuracy for transduction is slightly better. Also, note the error margins on these experiments.
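Regarding those error margins: the paper reports mean accuracy with a 95% confidence interval over many evaluation episodes, so a single run landing a bit below the headline number is expected. A minimal sketch of how such a margin is typically computed (this helper is illustrative, not code from this repo):

```python
import math

def mean_and_ci95(accuracies):
    """Mean episode accuracy and ~95% confidence-interval half-width,
    using the normal approximation 1.96 * stderr over episodes."""
    n = len(accuracies)
    mean = sum(accuracies) / n
    var = sum((a - mean) ** 2 for a in accuracies) / (n - 1)  # sample variance
    half_width = 1.96 * math.sqrt(var / n)
    return mean, half_width
```

With a few hundred episodes the half-width on these benchmarks is typically under a percentage point, which is why comparing single runs without it can mislead.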

jaegerstar commented 6 years ago

@unixpickle I reread the paper. My mistake. I thought the accuracy should be at least 90%. I am a newbie to the meta-learning area. I ran the non-transductive version; I will switch to Omniglot and try again. Thanks for your quick response!

jaegerstar commented 6 years ago

@unixpickle Two more questions.

  1. How did you determine the optimal value of meta_batch_size without any validation process within the outer loop?
  2. Why did you include the test set in your training process rather than the validation set? It seems a bit odd.
unixpickle commented 6 years ago

For 1: for Mini-ImageNet, hyperparameters were tuned to maximize validation performance. For Omniglot, they were tuned to maximize training performance, which tended to correlate well with test performance. The CMA hyperparameter-optimization code is not included in this repo since it is rather specific to our infrastructure.
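For anyone wanting to reproduce that kind of tuning without the CMA-ES infrastructure, a generic search loop over the hyperparameters (meta_batch_size and friends) against a validation objective works as a stand-in. This is a hypothetical random-search sketch, not the authors' tuner; the toy objective stands in for an actual validation run:

```python
import random

def random_search(objective, space, trials=20, seed=0):
    """Pick the config maximizing `objective` (e.g. validation accuracy,
    or training accuracy for Omniglot). Stand-in for CMA-ES tuning."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy stand-in objective; a real one would launch a training run.
space = {"meta_batch_size": [1, 5, 10], "inner_iters": [4, 8]}
toy = lambda cfg: -abs(cfg["meta_batch_size"] - 5) - abs(cfg["inner_iters"] - 8)
best, score = random_search(toy, space)
```

CMA-ES is better suited than random search for continuous hyperparameters (learning rates, epsilon), but the contract is the same: the objective only ever reads validation or training accuracy.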

For 2: the run scripts in this repo are mainly intended to reproduce our results. If you want to further optimize hyperparameters, make sure to only look at the outputs for the validation/training set.
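The discipline described above can be sketched as a tune-then-test protocol: the test set is read exactly once, after model selection is finished. The helper and the `eval_on(config, split)` callback here are hypothetical names for illustration:

```python
def select_and_test(configs, eval_on):
    """Choose a config using only validation accuracy, then report
    test accuracy for that single chosen config.
    eval_on(config, split) -> accuracy, with split "val" or "test"."""
    best = max(configs, key=lambda c: eval_on(c, "val"))
    return best, eval_on(best, "test")  # test set touched exactly once
```

Selecting on the test set instead would inflate the reported number, since the choice itself leaks test information.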