The image transformation used for ResNet50 in the repository code needs to be corrected. The input should be normalized with the ImageNet mean and standard deviation; this could significantly improve performance and lead to convergence in fewer than 30 epochs. Also, flip and rotation augmentation should be applied only during training, not during evaluation. @Xrioen

    import random

    import torchvision.transforms.functional as TF
    from PIL import Image

    def transform(self, image):
        # Convert the input array to a PIL image
        image = Image.fromarray(image)
        if self.is_train:
            # Random flipping and rotations (training only, never for evaluation)
            if random.random() > 0.5:
                image = TF.hflip(image)
            if random.random() > 0.5:
                image = TF.vflip(image)
            angle = random.choice([180, 90, 0, -90])
            image = TF.rotate(image, angle)
        # Convert to tensor (scales pixel values to [0, 1])
        image = TF.to_tensor(image)
        # Normalize using ImageNet mean and std
        image = TF.normalize(image, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        return image

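To make the effect of the normalization step concrete, here is a minimal pure-Python sketch of the per-channel arithmetic, `(value - mean) / std`, applied to a single RGB pixel. The helper `normalize_pixel` and the sample pixel values are hypothetical and only for illustration; the repository itself relies on `TF.normalize`.

```python
# ImageNet statistics, as used in the transform above
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel to one RGB pixel scaled to [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]


# Hypothetical 8-bit pixel close to the ImageNet mean colour
pixel_uint8 = (124, 116, 104)
# Scale to [0, 1], which is what to_tensor does for 8-bit images
pixel = [c / 255.0 for c in pixel_uint8]
normalized = normalize_pixel(pixel)

# A pure white pixel ends up a little over two standard deviations above zero
white = normalize_pixel([1.0, 1.0, 1.0])
```

A pixel near the ImageNet mean maps to values close to zero, which is roughly the input range a ResNet50 pretrained on ImageNet was trained on; skipping this step feeds the network inputs on a different scale.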
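Here, accum stands for gradient accumulation: gradients from several consecutive mini-batches are accumulated before each optimizer step, which emulates a larger effective batch size.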
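As a rough illustration of the idea behind gradient accumulation, the following pure-Python sketch uses a hypothetical one-parameter linear model with a mean-squared-error loss. It is not the repository's training loop; the function names, the data, and the micro-batch size are all assumptions. It shows that summing micro-batch gradients, each scaled by `1 / accum_steps`, reproduces the gradient of the full batch when the micro-batches have equal size.

```python
def mse_grad(w, batch):
    """Gradient of mean((w * x - y) ** 2) over the batch with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Accumulate scaled gradients over equally sized micro-batches."""
    micro_batches = [
        data[i:i + micro_batch_size]
        for i in range(0, len(data), micro_batch_size)
    ]
    accum_steps = len(micro_batches)
    grad = 0.0
    for mb in micro_batches:
        # Scale by 1 / accum_steps so the running sum equals the full-batch mean
        grad += mse_grad(w, mb) / accum_steps
    return grad


# Hypothetical (x, y) samples
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = mse_grad(0.5, data)
accum = accumulated_grad(0.5, data, micro_batch_size=2)
```

In a framework such as PyTorch the same effect is usually obtained by calling `backward()` on the scaled loss for each micro-batch and stepping the optimizer only once every `accum_steps` iterations.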