matheus-hertzog-deel closed this issue 5 years ago
Hi @Matheusih, have you figured out the ImageNet version of wide-resnet?
@xwuShirley Sadly, no. The author does provide the model object; however, I tried loading it and testing it on the tiny-imagenet dataset, but I couldn't get anything close to a desirable performance.
@Matheusih Have you tried an input size of 224x224? Would that work?
See https://github.com/szagoruyko/wide-residual-networks/tree/master/pretrained for the pretrained ImageNet WRN-50-2.
Also, support for WRN was added in torchvision via https://github.com/pytorch/vision/pull/822
To make BasicBlock WRN-18-2 do:
from torchvision.models.resnet import Bottleneck, ResNet, BasicBlock
model = ResNet(BasicBlock, [2,2,2,2], width_per_group=64 * 2)
To make Bottleneck WRN-50-2 do:
model = ResNet(Bottleneck, [3,4,6,3], width_per_group=64 * 2)
(this needs master torchvision)
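To see what width_per_group=64 * 2 changes, here is a minimal sketch of the channel arithmetic, assuming the width formula used in torchvision's Bottleneck block (width = int(planes * (base_width / 64.0)) * groups, with expansion 4). The helper name bottleneck_channels is hypothetical and only illustrates the arithmetic; it is not part of torchvision.

```python
# Sketch of the channel arithmetic behind width_per_group, assuming the
# formula used in torchvision's Bottleneck block:
#     width = int(planes * (base_width / 64.0)) * groups
# `bottleneck_channels` is a hypothetical helper, not a torchvision function.

EXPANSION = 4  # value of Bottleneck.expansion in torchvision


def bottleneck_channels(planes, width_per_group=64, groups=1):
    """Return (inner 3x3 conv width, block output channels) for one stage."""
    inner = int(planes * (width_per_group / 64.0)) * groups
    return inner, planes * EXPANSION


# Base planes of the four residual stages in an ImageNet ResNet.
STAGE_PLANES = [64, 128, 256, 512]

resnet50 = [bottleneck_channels(p) for p in STAGE_PLANES]
wrn50_2 = [bottleneck_channels(p, width_per_group=64 * 2) for p in STAGE_PLANES]

for (inner, out), (wide_inner, wide_out) in zip(resnet50, wrn50_2):
    print(
        f"ResNet-50: inner={inner:4d} out={out:4d} | "
        f"WRN-50-2: inner={wide_inner:4d} out={wide_out:4d}"
    )
```

Only the inner 3x3 convolution doubles in width; the block output stays at planes * 4, the same as in ResNet-50. This arithmetic covers the Bottleneck case only: as far as current torchvision releases go, BasicBlock raises a ValueError for any base_width other than 64, so the BasicBlock variant may not construct there.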
Also, I plan to add pretrained WRN-18-2, WRN-34-2 and WRN-50-2.
Update: removed WRNBottleneck with modified expansion
@szagoruyko Should the [2,2,2,2] in the WRN-50-2 model be [3, 4, 6, 3], the same as resnet50, as in the following?
model = ResNet(WRNBottleneck, [3, 4, 6, 3], width_per_group=64 * 2)
Similarly, would resnet-34-2 look like the following?
ResNet(BasicBlock, [3, 4, 6, 3], width_per_group=64 * 2)
I really appreciate your help.
@xwuShirley Yes, you're right, it is [3, 4, 6, 3]; I fixed it in my post. The default parameters for ResNet-50 from Lua fb.resnet.torch or NVIDIA/apex should do for training.
There was a fix in https://github.com/pytorch/vision/pull/852, so it is no longer necessary to create a new Bottleneck class with modified expansion.
Pretrained models are now in PyTorch Hub, see https://pytorch.org/hub/pytorch_vision_wide_resnet/
In your wide-resnet.lua, line 95 ("--one conv at the beginning (spatial size: 32x32)"), I can see that your model has an input size of 32x32 (CIFAR-10/100 size) and later performs avg_pooling on an 8x8 input. How do you train your models on datasets with larger inputs, such as COCO and ImageNet? Do you just modify the avg pooling layer and the first convolutional layer?
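The spatial sizes in question can be checked with a short calculation. The sketch below assumes the CIFAR model uses a 3x3 stride-1 first convolution followed by three groups with strides 1, 2, 2, and that the ImageNet variant follows the standard ResNet layout (7x7 stride-2 convolution, 3x3 stride-2 max pooling, four stages with strides 1, 2, 2, 2). The function names are hypothetical and the layouts are assumptions, not taken from wide-resnet.lua.

```python
def conv_out(size, kernel, stride, padding):
    """Output spatial size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def cifar_wrn_sizes(size=32):
    """Feature-map sizes after the first conv and after each of 3 groups."""
    sizes = [conv_out(size, kernel=3, stride=1, padding=1)]
    for stride in (1, 2, 2):
        sizes.append(conv_out(sizes[-1], kernel=3, stride=stride, padding=1))
    return sizes


def imagenet_resnet_sizes(size=224):
    """Feature-map sizes after the stem and after each of 4 stages."""
    size = conv_out(size, kernel=7, stride=2, padding=3)  # first convolution
    size = conv_out(size, kernel=3, stride=2, padding=1)  # max pooling
    sizes = [size]
    for stride in (1, 2, 2, 2):
        sizes.append(conv_out(sizes[-1], kernel=3, stride=stride, padding=1))
    return sizes


print("CIFAR 32x32:        ", cifar_wrn_sizes(32))
print("ImageNet 224x224:   ", imagenet_resnet_sizes(224))
print("Tiny-ImageNet 64x64:", imagenet_resnet_sizes(64))
```

Under these assumptions the CIFAR model ends at 8x8, matching the 8x8 average pooling, while a 224x224 input ends at 7x7. A 64x64 tiny-imagenet input pushed through the ImageNet stem ends at only 2x2, so a fixed-size pooling layer would not fit. torchvision's ResNet sidesteps this with adaptive average pooling.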