lucfra / FAR-HO

Gradient based hyperparameter optimization & meta-learning package for TensorFlow
MIT License

Reproducing mini imagenet results #3

Closed: haamoon closed this issue 6 years ago

haamoon commented 6 years ago

Hi, I was wondering if you could share code that can reproduce the mini-ImageNet results in your workshop paper. I have tried a couple of different learning rates, and the best one-shot test accuracy I could get was around 43%. I used T=4, as mentioned in the paper.

Thanks, Haamoon

lucfra commented 6 years ago

Hi! The code that I used is essentially the one in the examples. Meta learning rate 0.001, decay rate 10^-5, and T=5 during meta-training, mapping h as in the function mini_imagenet_model, and a meta batch size of 4. After training h, I performed a second meta-validation over T, and the best value came out to be 4 in that case. By the way, it is quite important to perform early stopping during meta-training by looking at the meta-validation accuracy. It takes around 100k or more hyper-iterations to reach the reported performance. Also be sure to have the right images and split.
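
For concreteness, here is a minimal sketch of that meta-training schedule with early stopping. `meta_train_step` and `meta_validation_accuracy` are hypothetical placeholders standing in for the actual FAR-HO calls in the examples folder; only the hyperparameter values come from the description above, and the evaluation interval and patience are illustrative.

```python
import random

# Hyperparameters quoted above (the last three are illustrative assumptions).
META_LR = 0.001          # meta (outer) learning rate
DECAY_RATE = 1e-5        # learning-rate decay
T_TRAIN = 5              # inner optimization steps during meta-training
META_BATCH_SIZE = 4      # episodes per hyper-iteration
MAX_HYPER_ITERS = 100_000
EVAL_EVERY = 1_000       # evaluate meta-validation accuracy periodically
PATIENCE = 20            # early-stopping patience, measured in evaluations

def meta_train_step(meta_lr, decay, t, meta_batch_size):
    pass  # placeholder: one hyper-iteration of FAR-HO on a meta-batch

def meta_validation_accuracy():
    return random.random()  # placeholder: accuracy over meta-validation episodes

best_val, evals_since_best = 0.0, 0
for it in range(MAX_HYPER_ITERS):
    meta_train_step(META_LR, DECAY_RATE, T_TRAIN, META_BATCH_SIZE)
    if (it + 1) % EVAL_EVERY == 0:
        val_acc = meta_validation_accuracy()
        if val_acc > best_val:
            best_val, evals_since_best = val_acc, 0   # checkpoint h here
        else:
            evals_since_best += 1
            if evals_since_best >= PATIENCE:
                break  # early stopping on meta-validation accuracy
```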

Soonish I will release an updated version.

Let me know if it helps!

Luca

haamoon commented 6 years ago

Thanks Luca! We haven't tried 100k in our experiments. We'll run a new experiment and let you know how it goes.

haamoon commented 6 years ago

Hi Luca,

So far we have run the code for 40k hyper-iterations, but it doesn't seem like it is going to reach the reported performance. I have attached the learning curve (mini_imagenet_reproduction: mean accuracy on the test set).

Do you have any suggestions on that? I can also share with you the dataset creation details.

lucfra commented 6 years ago

Hi Haamoon, from the learning curve you attached I can see there is quite a big gap between what you call valid and test. In all my experiments the gap was no bigger than 1% or at most 2%. There might be an error with the meta-test set.

Sorry if I might sound repetitive, but just to be sure that we're talking about the same thing... In these experiments we are interested in the so-called meta-test accuracy, which is the average accuracy over the test sets of many episodes not previously seen by the meta-learner, or in short, over the episodes in a meta-test set.
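
As a purely illustrative sketch of that averaging, with dummy per-episode accuracies standing in for real evaluations:

```python
import numpy as np

# Meta-test accuracy = mean of per-episode test accuracies over many
# previously unseen episodes. The values below are dummy placeholders.
episode_accs = np.random.uniform(0.40, 0.60, size=600)  # one accuracy per episode
mean_acc = episode_accs.mean()
ci95 = 1.96 * episode_accs.std(ddof=1) / np.sqrt(len(episode_accs))
print(f"meta-test accuracy: {mean_acc:.3f} +/- {ci95:.3f}")
```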

How do you create your meta-datasets?

nhatch commented 6 years ago

Hi @lucfra, I'm one of the people working with @haamoon trying to reproduce the experiment.

That's a good point. Since the validation meta-dataset images aren't used during the training process except for early stopping, one would expect the performance to be very similar to meta-testing.

Truth be told, I had some trouble finding the copy of Mini-Imagenet used in the Meta-Learner LSTM paper (Ravi and Larochelle), so I got a copy of something that looked similar (100 classes from ImageNet, 600 examples each) from another student in my building. Perhaps, in that dataset, the testing classes are somehow more difficult than the validation classes. I've contacted Sachin Ravi about where to find a more standard copy of Mini-Imagenet; we'll try running this experiment again once we get that version of the dataset.
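
For reference, here is roughly how an N-way, K-shot episode gets sampled from a class split in our setup (a simplified sketch; `class_to_images`, the function name, and the default sizes are hypothetical):

```python
import random

# Simplified sketch of N-way, K-shot episode construction from a class split.
# class_to_images maps a class name to its list of image paths (hypothetical).
def sample_episode(class_to_images, n_way=5, k_shot=1, n_query=15):
    classes = random.sample(list(class_to_images), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        imgs = random.sample(class_to_images[cls], k_shot + n_query)
        support += [(img, label) for img in imgs[:k_shot]]
        query += [(img, label) for img in imgs[k_shot:]]
    return support, query  # adapt on the support set, report accuracy on the query set
```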

Another detail about creating the meta-datasets is that we used the proc_images.py script from the MAML code. I believe that was necessary in order to get the dimensions to line up correctly.
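
Roughly, that preprocessing amounts to resizing every image to 84x84 so the input dimensions match the model. A hedged sketch of that step (directory names are hypothetical, and this only approximates what proc_images.py does):

```python
import glob
import os
from PIL import Image

# Hedged sketch: resize all mini-ImageNet images to 84x84, keeping the
# per-class folder layout. SRC and DST are hypothetical directory names.
SRC, DST, SIZE = "mini_imagenet/raw", "mini_imagenet/resized", (84, 84)

for path in glob.glob(os.path.join(SRC, "*", "*.jpg")):
    out = os.path.join(DST, os.path.relpath(path, SRC))
    os.makedirs(os.path.dirname(out), exist_ok=True)
    Image.open(path).convert("RGB").resize(SIZE, Image.BILINEAR).save(out)
```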

Thanks for your help so far!

lucfra commented 6 years ago

I see. For data processing I've used something very similar to the script you mentioned.

Drop me an email at igor_mio@hotmail.it and I'll send you a link to the data. Anyway, it would be interesting (and also a little worrying) if the performance dropped so much on another set of classes from ImageNet...

Cheers, Luca