Open Lyken17 opened 5 years ago
The model is trained and tested using Gluon MXNet. The main difference is the data pipeline: they use OpenCV to load and preprocess images.
I think they apply the same processing for validation. In PyTorch, the transform is
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
datasets.ImageFolder(valdir, transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
]))
For GluonCV, the transform is
normalize = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
crop_ratio = opt.crop_ratio if opt.crop_ratio > 0 else 0.875
resize = int(math.ceil(input_size / crop_ratio))
transform_test = transforms.Compose([
    transforms.Resize(resize, keep_ratio=True),
    transforms.CenterCrop(input_size),
    transforms.ToTensor(),
    normalize
])
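As a quick sanity check on the sizes, here is a small sketch of the resize arithmetic from the GluonCV snippet. The function name resize_target is only for illustration, and I am assuming input_size is 224. With the default crop ratio the short side comes out at 256, so the sizes match the PyTorch pipeline and any remaining difference would have to come from elsewhere.

```python
import math


def resize_target(input_size, crop_ratio=0.875):
    """Short-side resize applied before the center crop.

    Mirrors the arithmetic in the GluonCV snippet; a non-positive
    crop_ratio falls back to 0.875, as in the original code.
    """
    if crop_ratio <= 0:
        crop_ratio = 0.875
    return int(math.ceil(input_size / crop_ratio))


# Assumed input size of 224 (not stated explicitly in the snippet).
print(resize_target(224))  # 256, same as transforms.Resize(256)
```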
When loading the same image and resizing it with bilinear interpolation, PIL and OpenCV do not give the same output.
When I load using
model = gcv.models.resnet50(pretrained=True)
and test a forward pass, it raises
RuntimeError: size mismatch, m1: [1 x 991232], m2: [2048 x 1000]
I think something is wrong with the stride / downsampling. After a quick look at the code, I thought the cause might be the dilation, so I turned it off using
model = gcv.models.resnet50(pretrained=True, dilated=False)
This time the model runs a forward pass without error, but it does not reach the accuracy that GluonCV claims.
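The numbers in the error message are consistent with the dilation guess. The sketch below is only shape arithmetic under assumptions I have not verified against the library code: a 224 x 224 input, a dilated network that keeps an output stride of 8 instead of 32, and a fixed 7 x 7 average pool with stride 1 in front of the 2048 x 1000 fully connected layer. The helper names are made up for illustration.

```python
def avg_pool_out(size, kernel=7, stride=1):
    """Spatial size after an unpadded average pool."""
    return (size - kernel) // stride + 1


def flattened_features(input_size, output_stride, channels=2048):
    """Number of features reaching the FC layer after pooling and flattening."""
    fmap = input_size // output_stride   # final feature map side length
    pooled = avg_pool_out(fmap)          # assumed fixed 7x7 pool, stride 1
    return channels * pooled * pooled


# Non-dilated: 224 / 32 = 7, pooled to 1x1, which matches the FC input of 2048.
print(flattened_features(224, 32))  # 2048

# Dilated (assumed output stride 8): 224 / 8 = 28, pooled to 22x22.
print(flattened_features(224, 8))   # 991232, the m1 size in the error
```

If these assumptions hold, the 991232 comes from 2048 * 22 * 22, so the mismatch is a pooling / feature map size issue and says nothing yet about the accuracy gap.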