yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR2019)
https://lyy.mpi-inf.mpg.de/mtl/
MIT License

Unclear meta test phase #3

Closed a7b23 closed 5 years ago

a7b23 commented 5 years ago

Hi, in the META phase, you are running the test command for MAX_ITER = 20k steps. What is the need for doing this? How do you compute the k-shot accuracy then?

Should this not run for only 1 step?

yaoyao-liu commented 5 years ago

Thanks for your interest in our project.

This experiment contains three phases: pre-train, meta-train, and meta-test.

If you run a command like `python run_experiment.py META`, both the meta-train and meta-test phases will be processed. In the meta-train phase, the model is updated on the meta-train set for 20k iterations; in the meta-test phase, the k-shot accuracy is calculated.
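For reference, few-shot papers typically report meta-test k-shot accuracy as the mean over many sampled episodes, often with a 95% confidence interval. This sketch uses made-up episode scores (not values from this repo) just to illustrate the computation:

```python
import math

# Hypothetical per-episode accuracies collected over sampled meta-test tasks.
episode_accs = [0.55, 0.60, 0.58, 0.62, 0.57, 0.59]

n = len(episode_accs)
mean_acc = sum(episode_accs) / n
# Sample standard deviation and a 95% confidence interval (1.96 * std / sqrt(n)).
std = math.sqrt(sum((a - mean_acc) ** 2 for a in episode_accs) / (n - 1))
ci95 = 1.96 * std / math.sqrt(n)
print(f"{mean_acc * 100:.2f}% +/- {ci95 * 100:.2f}%")
```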

MAX_ITER indicates the total number of iterations for the meta-train phase. There is no meta update in the meta-test phase. If you only want to run meta-test, you can comment out the meta-train lines in run_experiment.py (lines 98 to 100) as follows:

    if PHASE=='META':
        #print('****** Start Meta-train Phase ******')
        #meta_train_command = base_command + ' --phase=meta' + ' --pretrain_iterations=' + str(PRE_ITER)
        #os.system(meta_train_command)

        print('****** Start Meta-test Phase ******')
        for idx in range(MAX_ITER):
            if idx % SAVE_STEP == 0:
                print('[*] Running meta-test, load model for ' + str(idx) + ' iterations')
                test_command = process_test_command(idx, base_command)
                os.system(test_command)
a7b23 commented 5 years ago

Hi, thanks for the quick reply; that clears up my confusion. Another question: I only got a 1-shot accuracy of 57-58% on the mini-ImageNet dataset. I ran the pre-training phase myself instead of downloading the weights from the Google Drive link.

Do I need to change the hyperparameters for the pre-training phase to get the results mentioned in the repo?

yaoyao-liu commented 5 years ago

I used exactly the same hyperparameters to pre-train the uploaded models. One operation, however, is not included in the repository: I use horizontal flipping to augment the data during the pre-train phase. You could try adding this. The detailed augmentation settings can be found in the related paper and its supplementary materials.
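If it helps, horizontal flipping itself is a simple operation; a minimal NumPy sketch (the repo's actual input pipeline is separate, and the function below is just an illustration) could look like:

```python
import random
import numpy as np

def random_horizontal_flip(img, p=0.5, rng=random):
    """Flip an H x W x C image left-right with probability p."""
    if rng.random() < p:
        return img[:, ::-1, :]
    return img

# Tiny 2x2 "RGB" example; flip deterministically for illustration.
img = np.arange(12).reshape(2, 2, 3)
flipped = img[:, ::-1, :]
```

Flipping twice recovers the original image, which makes the operation easy to sanity-check.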

In our experiments, we found that the quality of the pre-trained model has a significant influence on meta-test accuracy. You could run the pre-train phase several times and select the model with the best validation accuracy. You might also try changing the pre-train iteration count to 8k or 12k to find a better model.
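The selection step above can be sketched as: record validation accuracy for each candidate iteration budget and keep the best one. The numbers below are placeholders, not results from this repo:

```python
# Hypothetical validation accuracies keyed by pre-train iteration count.
val_acc_by_iters = {8000: 0.541, 10000: 0.553, 12000: 0.548}

# Pick the iteration budget whose checkpoint validates best.
best_iters = max(val_acc_by_iters, key=val_acc_by_iters.get)
print(f"Use the checkpoint pre-trained for {best_iters} iterations "
      f"(val acc {val_acc_by_iters[best_iters]:.3f})")
```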

I hope these suggestions help you reproduce the results. If you have any further questions, feel free to ask, and I'll try to reply as soon as possible.