yaoyao-liu / meta-transfer-learning

TensorFlow and PyTorch implementation of "Meta-Transfer Learning for Few-Shot Learning" (CVPR2019)
https://lyy.mpi-inf.mpg.de/mtl/
MIT License

Question about the pytorch implementation #18

Closed bellos1203 closed 4 years ago

bellos1203 commented 4 years ago

Hello, thanks for your marvelous work and the released code for the community.

I have run your PyTorch-version code and got the following results on mini-ImageNet:

- 5-way 1-shot: 61.22 +- 0.0089
- 5-way 5-shot: 78.05 +- 0.0059

These are a bit higher than the results reported in the paper and in this repo. Is this caused by the random seed used for the pre-trained model (I have not tried a different seed yet), or is there a bug in the PyTorch version?

Thanks in advance.

yaoyao-liu commented 4 years ago

Thanks for your interest in our work.

The default backbone in the PyTorch implementation is a 25-layer ResNet, which is deeper than the ResNet-12 used in the original paper and in the TensorFlow implementation, so the performance is a little higher.

eezywu commented 4 years ago

@yaoyao-liu Hi, I used the ResNet-12 described in the paper as the backbone and got the following results on mini-ImageNet:

- 5-way 1-shot: 56.56 +- 0.87
- 5-way 5-shot: 72.95 +- 0.64

These are noticeably lower than the results reported in the paper. Would you also provide the ResNet-12 version of the code? Thanks!

yaoyao-liu commented 4 years ago

Hi @eezywu,

Thanks for your interest in our work. The ResNet-12 implementation is provided in the TensorFlow version (https://github.com/yaoyao-liu/meta-transfer-learning/tree/master/tensorflow).

If you'd like to run ResNet-12 with the PyTorch version, you may use the ResNet-12 provided in MetaOptNet (https://github.com/kjunelee/MetaOptNet).
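In case it helps, here is a rough PyTorch sketch of a ResNet-12 backbone of the kind commonly used in few-shot learning (four residual blocks of three 3x3 convolutions each, followed by max-pooling). It is only an illustration, not a drop-in replacement for the encoder in this repo, and the exact channel widths, activation, and pooling may differ from the MetaOptNet implementation:

```python
import torch.nn as nn


class ResBlock(nn.Module):
    """A 3-conv residual block with a 1x1 shortcut, followed by 2x2 max-pooling."""
    def __init__(self, in_planes, planes):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes)
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_planes, planes, 1, bias=False),
            nn.BatchNorm2d(planes),
        )
        self.relu = nn.LeakyReLU(0.1)
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out = self.relu(out + self.shortcut(x))
        return self.pool(out)


class ResNet12(nn.Module):
    """Four residual blocks; returns a flat feature vector per image."""
    def __init__(self, channels=(64, 128, 256, 512)):
        super().__init__()
        blocks, in_planes = [], 3
        for planes in channels:
            blocks.append(ResBlock(in_planes, planes))
            in_planes = planes
        self.encoder = nn.Sequential(*blocks)
        self.avgpool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        x = self.encoder(x)              # e.g. (N, 512, 5, 5) for 84x84 inputs
        return self.avgpool(x).flatten(1)  # (N, 512)
```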

If you have any further questions, feel free to email me, or leave additional comments on this issue.

bellos1203 commented 4 years ago

Sorry for my late reply. I didn't notice that the architectures were different! After re-implementing the ResNet-12 architecture, I got the following result: mini-ImageNet, 5-way 5-shot: 75.36 +- 0.61

I think the remaining gap comes from the difference between the frameworks (PyTorch vs. TensorFlow) or from some randomness. By the way, I think the inner loop (lines 166 to 169 in `trainer/meta.py`) should be modified, since the loss is accumulated sequentially across the task batch rather than in a parallel manner; see the sketch below for what I mean.
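To illustrate, here is a rough sketch of averaging the query losses over all tasks in a meta-batch and taking a single outer-loop step, instead of updating after each task. The function and variable names (`inner_loop`, `task_batch`, etc.) are only illustrative, not the actual ones in `trainer/meta.py`:

```python
import torch
import torch.nn.functional as F


def meta_train_step(model, optimizer, task_batch, inner_loop):
    """One outer-loop update on the loss averaged over a meta-batch of tasks."""
    optimizer.zero_grad()
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in task_batch:
        # Inner loop: adapt on the support set, then evaluate on the query set.
        adapted_params = inner_loop(model, support_x, support_y)
        logits = model(query_x, adapted_params)
        meta_loss = meta_loss + F.cross_entropy(logits, query_y)
    # Average across tasks so the outer step treats the meta-batch jointly,
    # rather than accumulating sequential per-task updates.
    meta_loss = meta_loss / len(task_batch)
    meta_loss.backward()
    optimizer.step()
    return meta_loss.item()
```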

Thank you for your response again.

yaoyao-liu commented 4 years ago

Hi @bellos1203,

Thanks for reporting your results.

The task-batch implementation has not been added to the PyTorch version yet. I'll fix this when I have time.

If you have any further questions on our work, feel free to add more comments, or create a new issue.