libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Dataloader crashes when using torchvision style custom transform (random brightness) #149

Closed Jasonlee1995 closed 2 years ago

Jasonlee1995 commented 2 years ago
[screenshot: custom random-brightness transform code]

Following the FFCV docs, I wrote a custom random-brightness augmentation in the torchvision transform style, operating on PIL Images and referencing the Keras and PyTorch augmentation implementations (keras, pytorch).

[screenshot: pipeline definition and loader call]

I build a pipeline with the FFCV transforms plus the augmentation above, then call the dataloader.

[screenshot: error traceback]

I get the error above and have tried a lot of things, but I can't figure out how to fix it. (I checked that the ImageDecoder output is a numpy array.)

[screenshot]

I'm currently using ffcv version 0.0.3.

andrewilyas commented 2 years ago

Hi @Jasonlee1995 ! Torchvision augmentations operate on Tensors, not numpy arrays, so you should move the augmentations after the "ToTensor" operation in your pipelines.

Jasonlee1995 commented 2 years ago

@andrewilyas

[screenshots: updated augmentation code and reordered pipeline]

Thanks to your advice, I changed my augmentation to operate on tensors instead of numpy arrays and reordered the pipeline (referencing the PyTorch code).

But now I get an error about a wrong shape, so I checked the shape.

[screenshot: shape error and printed shape]

I know that PyTorch transforms work on individual images, not batches, which means the printed shape should be (C, H, W) or (H, W, C). (I added a print statement inside the augmentation code.)

But what I actually got is (batch size, H, W, C), which seems strange.

If I write my own torchvision-style augmentation and want it to work in an FFCV pipeline, should I assume that batches of images are the input?

GuillaumeLeclerc commented 2 years ago

I think you are missing a ToTorchImage before you start applying any PyTorch augmentations (this will put the channels in the order PyTorch expects).
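For reference, a minimal sketch of that ordering (the dataset path, batch size, and the RandomBrightness transform module are placeholders, not from this thread). Note that FFCV transforms receive whole batches, so the custom transform sees an NCHW tensor after ToTorchImage:

```python
from ffcv.loader import Loader, OrderOption
from ffcv.fields.decoders import SimpleRGBImageDecoder, IntDecoder
from ffcv.transforms import ToTensor, ToTorchImage

# Hypothetical torchvision-style transform; after ToTorchImage it receives
# a batched NCHW torch tensor rather than a numpy array.
from my_transforms import RandomBrightness

image_pipeline = [
    SimpleRGBImageDecoder(),   # numpy array, NHWC, uint8
    ToTensor(),                # torch tensor, still NHWC
    ToTorchImage(),            # reorder channels to NCHW as PyTorch expects
    RandomBrightness(0.2),     # tensor-based custom augmentation goes last
]
label_pipeline = [IntDecoder(), ToTensor()]

loader = Loader('dataset.beton', batch_size=64, num_workers=8,
                order=OrderOption.RANDOM,
                pipelines={'image': image_pipeline, 'label': label_pipeline})
```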

PS: torch augmentations are much slower than FFCV ones, and keeping this one will likely ruin your performance. I strongly recommend writing it as an FFCV augmentation instead. Brightness/contrast is really simple: it is just new_image = clip(a * old_image + b, 0, 1) for some a and b. It doesn't cost much to implement but will radically improve your performance.
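A rough sketch of what such an FFCV augmentation could look like, following the custom-transform pattern from the FFCV docs rather than any code in this thread (the class name, magnitude parameter, and the assumption of uint8 images in [0, 255] placed before ToTensor are all illustrative):

```python
from dataclasses import replace

import numpy as np

from ffcv.pipeline.allocation_query import AllocationQuery
from ffcv.pipeline.compiler import Compiler
from ffcv.pipeline.operation import Operation
from ffcv.pipeline.state import State


class RandomBrightness(Operation):
    """Scale pixel intensities by a random factor, clipping to [0, 255].

    Assumes it runs on the raw uint8 NHWC numpy batch, i.e. before ToTensor.
    """

    def __init__(self, magnitude: float = 0.2):
        super().__init__()
        self.magnitude = magnitude

    def generate_code(self):
        parallel_range = Compiler.get_iterator()
        magnitude = self.magnitude

        def brightness(images, dst):
            # Loop over the batch; FFCV transforms always see batched inputs.
            for i in parallel_range(images.shape[0]):
                factor = np.random.uniform(1 - magnitude, 1 + magnitude)
                scaled = images[i] * factor
                dst[i] = np.minimum(np.maximum(scaled, 0.), 255.).astype(np.uint8)
            return dst

        brightness.is_parallel = True
        return brightness

    def declare_state_and_memory(self, previous_state: State):
        # Output has the same shape/dtype as the input; request scratch memory for it.
        return (replace(previous_state, jit_mode=True),
                AllocationQuery(previous_state.shape, previous_state.dtype))
```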

Jasonlee1995 commented 2 years ago

Thanks! I found a way to augment the images (brightness, contrast, saturation) using numpy instead of PIL Image. As soon as I test my FFCV transformations, I'll send a pull request!