kjunelee / MetaOptNet

Meta-Learning with Differentiable Convex Optimization (CVPR 2019 Oral)
Apache License 2.0

ResNet-12 channels differ from TADAM #26

Closed: yinboc closed this issue 5 years ago

yinboc commented 5 years ago

Hello,

Thanks for your impressive work and sharing the code.

I have a question about the ResNet-12 structure. The paper says:

We use a ResNet-12 network following [20, 18] in our experiments

However, the TADAM paper [20] says:

The number of filters for the first ResNet block was set to 64 and it was doubled after each max-pool block

That is, TADAM's ResNet-12 uses 64, 128, 256, 512 channels, while this code uses 64, 160, 320, 640 channels per residual block.

Since I am planning to work on few-shot learning, I am confused about which backbone I should choose for a fair comparison. Is my understanding correct, and what led to this design choice?

Thank you!

kjunelee commented 5 years ago

Hi,

Thanks for your interest in our work and your ProtoNet code!

The architecture in our work mostly follows the design philosophy of TADAM, e.g., no pooling before the first residual block. However, there are minor differences, such as the insertion of Dropout, the activation function, and the number of filters.

The unusual channel counts are inspired by the design of Wide ResNet (WRN), which uses 160, 320, and 640 filters for its residual blocks. The WRN paper shows that increasing the number of filters allows high accuracy on image classification without resorting to very deep networks. I think the excellent result of LEO is partly due to its adoption of the 28-layer WRN architecture.
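For reference, here is a minimal PyTorch sketch (not the repository's exact code) of a ResNet-12 along these lines: four residual blocks, no pooling before the first block, and configurable channel widths so the TADAM-style (64, 128, 256, 512) and WRN-inspired (64, 160, 320, 640) variants can be compared directly. The exact block structure, Dropout placement, and LeakyReLU activation here are illustrative assumptions, not the canonical implementation.

```python
# Minimal ResNet-12 sketch for comparing channel widths (illustrative only).
import torch
import torch.nn as nn


def conv3x3(in_planes, out_planes):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, padding=1, bias=False)


class ResBlock(nn.Module):
    """Three 3x3 conv layers plus an identity shortcut, followed by 2x2 max-pool."""

    def __init__(self, in_planes, planes, drop_rate=0.1):
        super().__init__()
        self.conv1, self.bn1 = conv3x3(in_planes, planes), nn.BatchNorm2d(planes)
        self.conv2, self.bn2 = conv3x3(planes, planes), nn.BatchNorm2d(planes)
        self.conv3, self.bn3 = conv3x3(planes, planes), nn.BatchNorm2d(planes)
        self.relu = nn.LeakyReLU(0.1)
        # 1x1 conv on the shortcut so the channel counts match
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_planes, planes, kernel_size=1, bias=False),
            nn.BatchNorm2d(planes),
        )
        self.pool = nn.MaxPool2d(2)
        self.dropout = nn.Dropout(drop_rate)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out = self.relu(out + self.shortcut(x))
        return self.dropout(self.pool(out))


class ResNet12(nn.Module):
    """Four residual blocks; no stem or pooling before the first block."""

    def __init__(self, channels=(64, 160, 320, 640)):
        super().__init__()
        in_planes, blocks = 3, []
        for planes in channels:
            blocks.append(ResBlock(in_planes, planes))
            in_planes = planes
        self.blocks = nn.Sequential(*blocks)

    def forward(self, x):
        return self.blocks(x)


if __name__ == "__main__":
    wide = ResNet12((64, 160, 320, 640))   # WRN-inspired widths
    tadam = ResNet12((64, 128, 256, 512))  # TADAM-style widths
    feat = wide(torch.randn(2, 3, 84, 84))
    print(feat.shape)  # torch.Size([2, 640, 5, 5]) for 84x84 inputs
```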

In few-shot learning many papers try different architectures. I would say it is fair if you show an improvement over the baseline on some standard architectures, such as a 4-layer ConvNet with 64 filters or a 12-layer ResNet (as in Table 3 of our paper).

yinboc commented 5 years ago

Ok, that makes sense. Thanks for the reply!