libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Regarding color jitter #112

Closed saarthak-kapse closed 2 years ago

saarthak-kapse commented 2 years ago

Hello,

Thanks a lot for releasing this amazing library, looking forward to using it in my pipelines.

I am able to make my model converge when I use random resized RGB crop, flip, random grayscale, and Gaussian blur from torchvision. But whenever I use color jitter, my model diverges. Is there some bug when applying color jitter to data encoded in beton format? When I used the PyTorch dataloader, the model was able to converge with color jitter.

GuillaumeLeclerc commented 2 years ago

Which implementation of ColorJitter are you using? Did you look at the images generated by FFCV and PyTorch respectively? Can you spot any difference?

saarthak-kapse commented 2 years ago

Hello,

I am using torchvision color jitter. Visually there is no difference as such, but when training, the model is unable to converge.

Currently I am trying this:

```python
data_loader = Loader(args.data_path,
                     batch_size=args.batch_size_per_gpu,
                     num_workers=args.num_workers,
                     os_cache=True,
                     distributed=True,
                     indices=list(np.arange(10000)),
                     batches_ahead=1,
                     order=OrderOption.RANDOM,
                     pipelines={
                         'image': [ffcv.fields.decoders.SimpleRGBImageDecoder()],
                         'image_teacher': [ffcv.fields.decoders.SimpleRGBImageDecoder()]
                     })
```
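(For context, not part of the original report: `SimpleRGBImageDecoder` hands the pipeline raw uint8 HWC arrays, so any later float conversion has to deal with the value range explicitly. A minimal NumPy sketch of the two possible conversions, with a hand-made array standing in for decoder output:)

```python
import numpy as np

# stand-in for what an RGB image decoder emits: uint8 values in 0-255
decoded = np.array([[0, 128, 255]], dtype=np.uint8)

as_float = decoded.astype(np.float32)        # still in the 0-255 range
scaled = decoded.astype(np.float32) / 255.0  # in 0-1, the usual float-image convention

assert as_float.max() == 255.0
assert scaled.max() == 1.0
```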

Followed by:

```python
def apply_along_batch(func, arr):
    res = map(func, arr)
    res = list(map(lambda x: x.unsqueeze(0), res))
    return torch.cat(res[:])


class DataAugmentationDINO_new(object):
    def __init__(self, input_image_size=224, magnification=4):
        color_jitter = transforms.ColorJitter(0.8, 0.8, 0.8, 0.2)

        self.global_transfo = albumentations.Compose(
            [
                albumentations.RandomResizedCrop(input_image_size, input_image_size),
                albumentations.HorizontalFlip(),
                albumentations.ColorJitter(0.8, 0.8, 0.8, 0.2, p=0.8),
                albumentations.ToGray(p=0.2),
                albumentations.GaussianBlur(blur_limit=int(0.06 * input_image_size),
                                            sigma_limit=(0.1, 2), p=0.5),
                # albumentations.Normalize(mean=(0, 0, 0), std=(1, 1, 1)),
                ToTensorV2()
            ],
        )

        print('album_transform loaded')

    def __call__(self, image):
        crops = []

        # scale uint8 images to [0, 1] before the albumentations pipeline
        crops.append(apply_along_batch(func=lambda x: self.global_transfo(image=x)['image'],
                                       arr=image.astype(np.float32) / 255.0))
        crops.append(apply_along_batch(func=lambda x: self.global_transfo(image=x)['image'],
                                       arr=image.astype(np.float32) / 255.0))

        return crops
```
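(Side note, not from the original post: `apply_along_batch` loops over the batch in Python, which is the bottleneck described next; it is equivalent to stacking the per-image results. A rough NumPy re-sketch, hypothetical and only to show the equivalence, since the original operates on torch tensors:)

```python
import numpy as np

def apply_along_batch(func, arr):
    # mirror of the helper above: map func over the batch,
    # add a leading dim to each result, then concatenate
    results = [func(x)[None, ...] for x in arr]
    return np.concatenate(results, axis=0)

batch = np.arange(12.0).reshape(4, 3)
out = apply_along_batch(lambda x: x * 2.0, batch)

# the per-sample loop gives the same result as one stacked operation
assert np.array_equal(out, np.stack([x * 2.0 for x in batch]))
```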

With this approach I am able to make the model converge, but the data processing is slow because of the apply_along_batch function. It would be wonderful if over time you introduced more augmentations similar to albumentations'. I look forward to adopting them once they are released. For now the library is not very useful to me because of this torchvision color jitter bug.

Thanks a lot!

GuillaumeLeclerc commented 2 years ago

Can you show the version that doesn't work?

GuillaumeLeclerc commented 2 years ago

@saarthak02 My guess is that you are applying ColorJitter on the 0-255 value range instead of 0-1. You should simply scale your data (or, better, write your own fast FFCV ColorJitter that works directly on 0-255).
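(Illustration of this diagnosis, using a toy model rather than torchvision's actual code: float-image adjustments assume values in [0, 1] and clamp there afterwards, so jittering a float image left in the 0-255 range saturates every pixel:)

```python
import numpy as np

def adjust_brightness(img, factor):
    # toy model of the float-image convention: scale,
    # then clamp to the assumed [0, 1] range
    return np.clip(img * factor, 0.0, 1.0)

img_255 = np.array([10.0, 128.0, 250.0])   # float image mistakenly left in 0-255
img_01 = img_255 / 255.0                   # correctly rescaled to 0-1

out_255 = adjust_brightness(img_255, 0.5)  # 5, 64, 125 all clamp to 1.0
out_01 = adjust_brightness(img_01, 0.5)    # detail preserved

assert np.array_equal(out_255, np.ones(3))   # all image detail destroyed
assert not np.array_equal(out_01, np.ones(3))
```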

(Feel free to re-open if it wasn't the issue)