I ran the model, but it doesn't converge. Is there anything I need to pay attention to?

openai / supervised-reptile

Code for the paper "On First-Order Meta-Learning Algorithms"

https://arxiv.org/abs/1803.02999

MIT License


I ran the model, but it doesn't converge. Is there anything I need to pay attention to? #3

Open sccbhxc opened 6 years ago

sccbhxc commented 6 years ago

I ran the following command:

# transductive 1-shot 5-way Omniglot.
python -u run_omniglot.py --shots 1 --inner-batch 25 --inner-iters 3 --meta-step 1 --meta-batch 10 --meta-iters 100000 --eval-batch 25 --eval-iters 5 --learning-rate 0.001 --meta-step-final 0 --train-shots 15 --checkpoint ckpt_o15t --transductive

The output is "batch XXX: train=0.000000 test=0.000000". Is something wrong?

unixpickle commented 6 years ago

Could you show multiple lines of the output? Each line in the output corresponds to a single task evaluation, not an average, so there will be some zeros unless the model is perfect. Does every single line look like that?

By the way, a better way to see the results is to let the run finish, at which point a full evaluation is performed. With those arguments, this will be after 100K iterations. You can also use tensorboard to see smoothed learning curves during training.
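Because each output line reports a single task, averaging over many lines gives a more useful picture than any one line. Below is a minimal sketch of that idea, assuming the log lines have the form quoted earlier in this thread ("batch XXX: train=... test=..."). The function name `average_accuracy`, the `last_n` window, and the sample values are made up for illustration and are not part of the repository.

```python
import re

# Matches lines of the form quoted in this thread, e.g.
#   batch 12: train=0.000000 test=1.000000
LINE_RE = re.compile(r"batch (\d+): train=([0-9.]+) test=([0-9.]+)")


def average_accuracy(lines, last_n=100):
    """Average the train/test values over the last `last_n` matching lines.

    Returns a (train_mean, test_mean) tuple, or None if nothing matched.
    """
    rows = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            rows.append((float(match.group(2)), float(match.group(3))))
    rows = rows[-last_n:]
    if not rows:
        return None
    train_mean = sum(r[0] for r in rows) / len(rows)
    test_mean = sum(r[1] for r in rows) / len(rows)
    return train_mean, test_mean


# Illustrative values only, not real output from a run.
sample = [
    "batch 0: train=0.000000 test=0.000000",
    "batch 1: train=1.000000 test=0.000000",
    "batch 2: train=0.000000 test=1.000000",
    "batch 3: train=1.000000 test=1.000000",
]
print(average_accuracy(sample))
```

To apply this to a real run, the captured stdout could be passed in as a list of lines, for example `average_accuracy(open("train.log"))`, where `train.log` is a hypothetical file you redirected the output to.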

sccbhxc commented 6 years ago

@unixpickle I failed to upload the picture of the TensorBoard curves yesterday. The training accuracy curve I get is as follows. [image: 2018-03-16_170534]

unixpickle commented 6 years ago

TensorBoard has a smoothing option, which should make the curves easier to read.

Some things to check:

Did you definitely download all of Omniglot? There should be ~30K images.
If you let the script run for the full 100K iterations, what accuracy does it output at the end?
What version of TensorFlow/Python are you using?
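The first and last checks above can be done with a few lines of Python. This is only a sketch: `count_images` and `print_environment` are hypothetical helpers, and the assumption that the dataset is stored as `.png` files under a single directory should be adjusted to wherever you actually downloaded Omniglot.

```python
import os
import sys


def count_images(root, extensions=(".png",)):
    """Recursively count files under `root` whose names end in `extensions`."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.lower().endswith(extensions))
    return total


def print_environment(data_dir):
    """Print the Python version, TensorFlow version, and image count."""
    print("Python:", sys.version.split()[0])
    try:
        import tensorflow as tf  # only needed for the version string
        print("TensorFlow:", tf.__version__)
    except ImportError:
        print("TensorFlow: not installed")
    print("Images found:", count_images(data_dir))
```

Calling `print_environment("data/omniglot")` (the path is an assumption) should report a count in the neighborhood of the ~30K images mentioned above if the download is complete.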