mrkshllr / FewTURE

Official implementation of the paper "Rethinking Generalization in Few-Shot Classification" [NeurIPS 2022]

Unable to reproduce the performance reported in the paper. #5

Open caoql98 opened 1 year ago

caoql98 commented 1 year ago

Thanks for your outstanding work! I have used the pre-trained weights and the configs mentioned in the README (--epochs 100, --optim_steps_online 5) for meta fine-tuning and evaluating FewTURE. However, I only obtain Test Acc 66.8844 +- 0.8728 for 1-shot 5-way and Test Acc 81.9867 +- 0.5830 for 5-shot 5-way on miniImageNet. Did you use any other tricks? Thanks for your reply!

mrkshllr commented 1 year ago

Hi @caoql98, I'll take a look at whether there's been a mix-up with the uploaded checkpoint, since the results you're getting seem quite a bit off. I assume you've tried the ViT one?

mrkshllr commented 1 year ago

Additional question: what validation accuracy are you obtaining when fine-tuning?

caoql98 commented 1 year ago

Thanks for your suggestions. I have tried ViT-small on miniImageNet; the validation accuracy for 1-shot 5-way is 72.69 +- 0.9330 and for 5-shot 5-way it is 84.27 +- 0.5744.

caoql98 commented 1 year ago

By the way, by increasing --optim_steps_online from 5 to 15 there is an improvement: I obtain Test Acc 67.3822 +- 0.8750 for 1-shot 5-way on miniImageNet. Maybe the provided hyperparameters are wrong?

caoql98 commented 1 year ago

Additionally, according to train_metatrain_FewTURE.py, the --epochs argument in the README needs to be renamed to --num_epochs. Also, --chkpt_epoch 1599 should be --chkpt_epoch 1600, since the provided weights are named xxxx1600.pth.

mrkshllr commented 1 year ago

> By the way, by increasing --optim_steps_online from 5 to 15 there is an improvement: I obtain Test Acc 67.3822 +- 0.8750 for 1-shot 5-way on miniImageNet. Maybe the provided hyperparameters are wrong?

The results in the paper are obtained with 15 optimisation steps, and you might even be able to achieve higher accuracies if that number is increased (raw performance wasn't our primary goal, but rather the method ;) ). So the fact that more steps help in your experiments is already a good sign.

mrkshllr commented 1 year ago

> Additionally, according to train_metatrain_FewTURE.py, the --epochs argument in the README needs to be renamed to --num_epochs. Also, --chkpt_epoch 1599 should be --chkpt_epoch 1600, since the provided weights are named xxxx1600.pth.

Please excuse the somewhat unclear instructions in the README. The given command wasn't meant to reproduce the exact results of the paper, but rather to show how our code is used (for the provided case with 5 steps). The checkpoint epoch name stems from the checkpoints you obtain when using our self-supervised pretraining script, where we start counting epochs from '0', i.e. checkpoint 1599 corresponds to 1600 epochs of training. I simply renamed the uploaded checkpoints to '1600' to better reflect that the model has indeed been trained for 1600 epochs, and you are correct that the loading command has to be adapted accordingly.
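Putting the corrections above together, a hedged sketch of the meta-finetuning invocation would look like the following (only the flags discussed in this thread are shown; any other required arguments such as dataset paths are omitted and must be added):

```shell
# Hypothetical invocation reflecting the fixes discussed here:
# --epochs renamed to --num_epochs, --chkpt_epoch matching the uploaded
# xxxx1600.pth checkpoint, and 15 online optimisation steps as in the paper.
python train_metatrain_FewTURE.py \
    --num_epochs 100 \
    --optim_steps_online 15 \
    --chkpt_epoch 1600
```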

mrkshllr commented 1 year ago

Have you had a chance to repeat the finetuning for 5-way 5-shot with 15 or 20 steps?

caoql98 commented 1 year ago

Yes, I am attempting that; once I obtain the results, I will let you know! Thanks for your patient reply!

mrkshllr commented 1 year ago

No worries! I've updated the README instructions regarding the meta-finetuning; thanks for pointing these out (especially the 'epochs' vs 'num_epochs' argument)!

caoql98 commented 1 year ago

By setting --optim_steps_online to 20 and 25, I obtain Test Acc 67.88 +- 0.9698 and Test Acc 67.8978 +- 0.8755 for 1-shot 5-way on miniImageNet. Meanwhile, with 15 and 20 steps for 5-way 5-shot, the model respectively obtains Test Acc 82.4933 +- 0.5769 and Test Acc 82.6489 +- 0.5704. So I think 15 is still not the right setting for the model. Moreover, the 5-way 5-shot performance still shows an obvious gap. Could you help me figure this out?

mrkshllr commented 1 year ago

I'll take a closer look tomorrow and see if I can find the logs. In the meantime, you could try lowering the similarity temperature slightly (similarity_temp_init = 0.0421), or activate the meta-learning of it.

caoql98 commented 1 year ago

Thanks for your advice. I will further try lowering the similarity temperature slightly (similarity_temp_init = 0.0421). Yesterday, I also tried Swin-tiny with --optim_steps_online 20 on miniImageNet. However, I only obtain Test Acc 70.6867 +- 0.8323 for the 5-way 1-shot setting and 85.0622 +- 0.5439 for the 5-way 5-shot setting. There is still an obvious gap. By the way, what do you mean by activating the meta-learning of it? Do you mean lowering the similarity temperature slightly and activating meta-learning during training and testing?

mrkshllr commented 1 year ago

Hi @caoql98, I've run 2 more meta-finetuning runs with ViT and 15 optimisation steps on miniImageNet: one using the default temperature, and one with the lowered one I mentioned earlier; I also varied the number of meta-ft epochs a bit. I noticed one small bug in the code that must have occurred when finalising it for GitHub: the T_max of the lr_scheduler is epoch-independent, but should depend on the number of epochs (which basically affects the learning rate of the cosine schedule): T_max=50 * args.num_episodes_per_epoch should be T_max=args.epochs * args.num_episodes_per_epoch.
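For clarity, here is a minimal sketch of that T_max fix (the variable names mirror the thread; this is not the exact FewTURE code, and the model/optimizer are stand-ins):

```python
import argparse
import torch

# Hypothetical args mirroring the flags discussed in this thread
args = argparse.Namespace(epochs=100, num_episodes_per_epoch=200, lr=1e-4)

model = torch.nn.Linear(8, 5)  # stand-in for the meta-finetuned model
optimizer = torch.optim.AdamW(model.parameters(), lr=args.lr)

# Buggy version: T_max hard-coded to 50 epochs regardless of args.epochs
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
#     optimizer, T_max=50 * args.num_episodes_per_epoch)

# Fixed version: the cosine schedule spans the actual number of meta-ft epochs
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=args.epochs * args.num_episodes_per_epoch)
```

With the hard-coded 50, any run longer than 50 epochs would restart or mis-shape the cosine decay; tying T_max to args.epochs makes the learning rate anneal to its minimum exactly at the end of meta-finetuning.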

I ran the following settings to provide some insight, and got these results:

In the paper, we report 84.05 for the 15-step setting, so both runs are within the error margin, and the slightly lowered temperature even improved slightly upon that. Your results should thus end up somewhere around those values as well.

P.S.: Please re-download the ViT checkpoint, I've uploaded the local checkpoint I've been using for these runs! vit_checkpoint

I'll update the readme and code as soon as I can; Let me know if you have any further issues/queries, also feel free to drop me an email for a more in-depth discussion/analysis in case problems persist!

(Regarding your question of meta-learning the temperature: you can learn the scaling temperature for the inner-loop steps within the outer loop, and our code supports that; some analysis is provided in the supplementary material of the paper.)
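As a rough illustration of that idea (a hedged sketch only, with illustrative names rather than the actual FewTURE implementation): the temperature is registered as a learnable parameter in the outer-loop optimizer, so gradients flowing through the inner-loop adaptation also update it.

```python
import torch

# Learnable similarity temperature, initialised to the lowered value
# mentioned in this thread (0.0421); a hypothetical stand-in model.
temperature = torch.nn.Parameter(torch.tensor(0.0421))
model = torch.nn.Linear(16, 5)

# The outer-loop optimizer sees the temperature alongside the model
# weights, so meta-gradients update it as well.
outer_opt = torch.optim.AdamW(
    [{"params": model.parameters()},
     {"params": [temperature], "lr": 1e-3}]
)

support = torch.randn(10, 16)
logits = model(support) / temperature   # temperature-scaled similarity logits
loss = logits.logsumexp(dim=1).mean()   # placeholder loss for the sketch
loss.backward()
assert temperature.grad is not None     # the temperature receives gradients
```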

caoql98 commented 1 year ago

Thanks for your reply! I obtain very similar results with your settings: with the default temperature, Test Acc 83.7378 +- 0.5591 (meta-ft with 100 epochs) and 83.9978 +- 0.5322 (20 optimisation steps). However, in your paper the reported performance is 84.51 +- 0.53 for 5-shot on miniImageNet. Can that be reproduced? By the way, you provided another checkpoint for ViT; do we need other new checkpoints (for instance, for Swin) to reproduce the same results?

caoql98 commented 1 year ago

Hi @mrkshllr, after fixing the bug, I used the provided pretrained weights to run two more experiments with Swin on miniImageNet, and I got val acc 73.88 +- 0.8627, test acc 70.1711 +- 0.8495 for the 1-shot setting, and val acc 85.82 +- 0.5170, test acc 84.9178 +- 0.5394 for the 5-shot setting. Clearly, there is still a large performance gap. Are there any other bugs, or do we need to run the self-supervised pretraining ourselves?

WuJi1 commented 1 year ago

I also can't achieve the same results for Swin-tiny on miniImageNet. In my case, I get Test Acc 70.77 for the 5-way 1-shot setting. I also trained the self-supervised pretrained model myself but got no improvement. What should we do?