Closed hgoyal5 closed 4 years ago
@hgoyal5 ,
Fixed this legacy issue. Both a=0.5 and a=1.0 (the distillation weight) should give similar performance.
Training recipe for the other datasets:
(1) CIFAR-FS and FC100: --epochs 90 --lr_decay_epochs 45,60,75
(2) tieredImageNet: --epochs 60 --lr_decay_epochs 30,40,50
Other parameters are the same as miniImageNet.
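As a rough sketch of how the recipe above might be invoked: the script name `train_supervised.py` and the `--dataset` flag are assumptions for illustration; only the `--epochs` and `--lr_decay_epochs` values come from the recipe itself.

```shell
# Hypothetical invocations; script name and --dataset flag are assumed,
# the epoch/decay values are the ones stated in the recipe above.

# CIFAR-FS and FC100:
python train_supervised.py --dataset CIFAR-FS --epochs 90 --lr_decay_epochs 45,60,75

# tieredImageNet:
python train_supervised.py --dataset tieredImageNet --epochs 60 --lr_decay_epochs 30,40,50
```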
According to the paper, the weights for the distillation and classification losses are 0.5 and 0.5, but in the provided script they are 1 and 0.5. Which one is better?
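For context on what these two weights control, here is a minimal, self-contained sketch of a weighted distillation objective: `a` scales a KL-divergence distillation term against softened teacher logits, and `b` scales the ordinary cross-entropy term. The function names and the temperature value are illustrative assumptions, not the repo's actual code.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [v / s for v in exps]

def cross_entropy(student_logits, label):
    # Standard classification loss on the student's hard prediction.
    return -math.log(softmax(student_logits)[label])

def kd_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between temperature-softened teacher and student
    # distributions, scaled by T^2 as in standard knowledge distillation.
    p = softmax([v / T for v in teacher_logits])
    q = softmax([v / T for v in student_logits])
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * T * T

def total_loss(student_logits, teacher_logits, label, a=0.5, b=0.5):
    # a weights the distillation term, b the classification term.
    # The question above is whether (a, b) = (0.5, 0.5) or (1, 0.5) works better.
    return a * kd_loss(student_logits, teacher_logits) + b * cross_entropy(student_logits, label)
```

Changing `a` from 0.5 to 1.0 simply doubles the contribution of the teacher-matching term relative to the label term.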
For the tieredImageNet, CIFAR-FS, and FC100 datasets, is the batch size kept the same at 64?
Could you provide the scripts for the datasets other than miniImageNet?