Closed chmxu closed 4 years ago
By the way, it's hard to reproduce the miniImageNet result for WRN (~65%) using the released weights and the training script for Res12; I only get about 61%. Should I change any hyperparameters?
In our experiments, the ResNet-12 (from MetaOptNet) indeed works better than WRN (and even a variant of ResNet-18).
To train a model with WRN, one trick is to set fix_BN to True (i.e., put the BN layers in eval mode) during the meta-learning stage, which improves accuracy considerably. For ProtoNet, the temperature is also important and should be large, e.g., 64.
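For anyone following along, here is a rough sketch of why the temperature matters for ProtoNet. It assumes the logits are negative squared Euclidean distances to the class prototypes divided by the temperature, which is a common setup but may not match the repository's code exactly. `softmax` and `proto_logits` are hypothetical helper names, and the toy embeddings are made up for illustration.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proto_logits(query, prototypes, temperature):
    # Assumption: logit = -(squared Euclidean distance) / temperature.
    return [
        -sum((q - p) ** 2 for q, p in zip(query, proto)) / temperature
        for proto in prototypes
    ]


# Toy 2-D embeddings (illustrative only, not real features).
query = [1.0, 0.0]
prototypes = [[1.0, 1.0], [5.0, 5.0]]

probs_t1 = softmax(proto_logits(query, prototypes, temperature=1.0))
probs_t64 = softmax(proto_logits(query, prototypes, temperature=64.0))

print(probs_t1)
print(probs_t64)
```

With temperature 1 the softmax is almost one-hot on the nearest prototype, so the loss gives very little gradient. With temperature 64 the distribution is much softer under this convention, which is one plausible reason a large value helps when distances between embeddings are large.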
I have tried the fix_BN trick. It does improve the performance on WRN. Quite interesting. Thanks for the help!
Hi, thanks for this work. I have tried to use the pretrained weights directly in a prototypical network on 1-shot miniImageNet. The result is about 59% when using Res12 and 55% when using WRN. I wonder if this is reasonable since WRN is a deeper network compared to Res12.