Incorrect Image Normalization in Training and Augmentation in Evaluation

The image transform used for ResNet50 in the repository code needs two corrections. First, inputs should be normalized with the ImageNet mean and standard deviation; this could significantly improve performance and lead to convergence in fewer than 30 epochs. Second, flip and rotation augmentation should be applied only during training, not during evaluation. @Xrioen


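    import random

    from PIL import Image
    import torchvision.transforms.functional as TF

    # Method of the dataset class (excerpt)
    def transform(self, image):
        image = Image.fromarray(image)

        if self.is_train:
            # Random flips and rotations (training only; skipped at evaluation)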
            if random.random() > 0.5:
                image = TF.hflip(image)
            if random.random() > 0.5:
                image = TF.vflip(image)

            angle = random.choice([180, 90, 0, -90])
            image = TF.rotate(image, angle)

        # Convert to tensor
        image = TF.to_tensor(image)

        # Normalize using ImageNet mean and std
        image = TF.normalize(image, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

        return image
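
To make the arithmetic behind the snippet above concrete, here is a minimal NumPy-only sketch of the same idea. It is an illustration, not the repository's code: the function names `normalize`, `augment`, and `transform` and the `rng` parameter are hypothetical, and it assumes an HxWx3 uint8 RGB input. `normalize` mirrors what `TF.to_tensor` followed by `TF.normalize` computes (scale to [0, 1], then subtract the channel mean and divide by the channel std), and augmentation runs only when `is_train` is true.

```python
import numpy as np

# ImageNet channel statistics, as used in the snippet above (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize(image_uint8):
    """Scale an HxWx3 uint8 image to [0, 1], standardize per channel, return CxHxW."""
    scaled = image_uint8.astype(np.float32) / 255.0
    standardized = (scaled - IMAGENET_MEAN) / IMAGENET_STD  # broadcasts over H and W
    return np.transpose(standardized, (2, 0, 1))


def augment(image_uint8, rng):
    """Random flips and a random multiple-of-90-degree rotation (training only)."""
    if rng.random() > 0.5:
        image_uint8 = image_uint8[:, ::-1]  # horizontal flip
    if rng.random() > 0.5:
        image_uint8 = image_uint8[::-1, :]  # vertical flip
    k = int(rng.integers(0, 4))  # number of 90-degree turns
    return np.rot90(image_uint8, k=k, axes=(0, 1))


def transform(image_uint8, is_train, rng=None):
    """Hypothetical stand-in for the dataset's transform method."""
    if is_train:
        rng = rng if rng is not None else np.random.default_rng()
        image_uint8 = augment(image_uint8, rng)
    # Evaluation inputs skip augmentation, so the output is deterministic.
    return normalize(np.ascontiguousarray(image_uint8))
```

With this sketch, calling `transform(image, is_train=False)` twice on the same image returns identical arrays, and a pure white pixel (255) maps to about 2.249 in the red channel, i.e. (1 - 0.485) / 0.229.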

| Metric | BLEEP (bsz=128, accum=4) | BLEEP (bsz=128, accum=1) | BLEEP (bsz=128, accum=4) normalized |
|---|---|---|---|
| Mean correlation cells | 0.8025 | 0.7149 | 0.8051 |
| Max correlation | 0.6810 | 0.6282 | 0.6978 |
| Mean HEG | 0.1630 | 0.1096 | 0.2012 |
| Mean HVG | 0.1657 | 0.0988 | 0.1879 |
| Mean markers | 0.2280 | 0.1158 | 0.2314 |

`bsz` is the batch size and `accum` is the number of gradient accumulation steps.
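
For readers unfamiliar with gradient accumulation, the following is a small pure-Python sketch of the general technique, not BLEEP's training loop (which is not shown here). It assumes equal-sized micro-batches and a loss that is a plain mean over samples; under those assumptions, averaging the gradients of 4 micro-batches of 128 samples gives the same gradient as one batch of 512. For losses that couple samples within a batch, such as contrastive losses, this equivalence does not hold exactly. The helper names `grad_mse` and `accumulated_grad` and the toy data are hypothetical.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to the scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average micro-batch gradients, as done before a single optimizer step."""
    starts = range(0, len(xs), micro_batch_size)
    accum_steps = len(starts)
    total = 0.0
    for s in starts:
        g = grad_mse(w, xs[s:s + micro_batch_size], ys[s:s + micro_batch_size])
        total += g / accum_steps  # scale so the sum is a mean over micro-batches
    return total, accum_steps


# Toy data: 512 samples, matching bsz=128 with accum=4.
xs = [0.01 * i for i in range(512)]
ys = [3.0 * x + 1.0 for x in xs]
full = grad_mse(0.5, xs, ys)
acc, steps = accumulated_grad(0.5, xs, ys, micro_batch_size=128)
```

Here `steps` is 4 and `acc` matches `full` up to floating-point rounding, so bsz=128 with accum=4 corresponds to an effective batch of 128 × 4 = 512 samples per optimizer step under the stated assumptions.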

bowang-lab / BLEEP, issue #14