Sha-Lab / FEAT

The code repository for "Few-Shot Learning via Embedding Adaptation with Set-to-Set Functions"

MIT License

pretrained Res12 and WRN performance #30

Closed by chmxu 4 years ago

chmxu commented 4 years ago

Hi, thanks for this work. I tried using the pretrained weights directly in a prototypical network on 1-shot miniImageNet. The accuracy is about 59% with Res12 and 55% with WRN. Is this reasonable, given that WRN is a deeper network than Res12?

chmxu commented 4 years ago

By the way, I am having trouble reproducing the miniImageNet result for WRN (~65%) using the released weights and the training script for Res12; I only get about 61%. Should I change any hyperparameters?

Han-Jia commented 4 years ago

In our experiments, the ResNet-12 (from MetaOptNet) indeed works better than WRN (and even a variant of ResNet-18).

To train a model with WRN, one trick is to set fix_BN to True (i.e., keep the BN layers in eval mode) during the meta-learning stage, which improves performance a lot. For ProtoNet, the temperature is also important and should be large, e.g., 64.
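
For readers unfamiliar with these two suggestions, here is a minimal PyTorch sketch. It is not taken from the FEAT code base: `freeze_bn`, `proto_logits`, and the tiny stand-in encoder are hypothetical names for illustration. It also assumes that the temperature divides the negative squared Euclidean distance before the softmax.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def freeze_bn(model: nn.Module) -> None:
    """Put every BatchNorm layer in eval mode so its running statistics stay fixed."""
    for module in model.modules():
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            module.eval()


def proto_logits(support, support_labels, query, n_way, temperature=64.0):
    """Negative squared Euclidean distance to class prototypes, scaled by temperature."""
    # A prototype is the mean embedding of one class's support examples.
    protos = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_way)]
    )
    # Pairwise squared distances, shape (n_query, n_way).
    dist = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(dim=-1)
    # Assumption: the temperature divides the logits.
    return -dist / temperature


# Hypothetical stand-in for the pretrained backbone (not WRN or Res12).
encoder = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

encoder.train()      # training mode for the model as a whole
freeze_bn(encoder)   # then switch only the BN layers back to eval mode

# One random 5-way 1-shot episode with 3 queries per class.
torch.manual_seed(0)
n_way, n_shot, n_query = 5, 1, 3
support_x = torch.randn(n_way * n_shot, 3, 16, 16)
support_y = torch.arange(n_way).repeat_interleave(n_shot)
query_x = torch.randn(n_way * n_query, 3, 16, 16)
query_y = torch.arange(n_way).repeat_interleave(n_query)

logits = proto_logits(encoder(support_x), support_y, encoder(query_x), n_way)
loss = F.cross_entropy(logits, query_y)
```

Note that calling `encoder.train()` again puts the BN layers back into training mode, so in a sketch like this `freeze_bn` would need to be re-applied after every such call.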

chmxu commented 4 years ago

I have tried the fix_BN trick, and it does improve the performance on WRN. Quite interesting. Thanks for the help!