libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Add ColorJitter #202

Open bordesf opened 2 years ago

bordesf commented 2 years ago

Hi everyone, I just wanted to share my implementation of ColorJitter, which is very close to the one used in torchvision. As with ColorJitter in torchvision.transforms, you can specify either a float or a (min, max) pair from which to sample the ratios for brightness/contrast/saturation and hue.

I ran simple tests to visualize the transformations from the PyTorch and FFCV versions: ColorJitter (hue only) from torchvision.transforms (attached image: color_jitter_pytorch4) vs. ColorJitter (hue only) from this pull request (attached image: color_jitter_fcv4).

And ColorJitter with different values for brightness/contrast/saturation and hue (attached image: FFCV ColorJitter (2)).

The code used for brightness/contrast/saturation is identical to the one used in torchvision.transforms. The hue code, however, is adapted from https://sanje2v.wordpress.com/2021/01/11/accelerating-data-transforms/ and https://stackoverflow.com/questions/8507885
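For context on the brightness/contrast/saturation part, torchvision implements all three as a blend between the image and a reference image (zeros for brightness, the grayscale version for saturation). A minimal NumPy sketch of that idea (not the PR's actual Numba code) might look like:

```python
import numpy as np

def blend(img1, img2, ratio):
    # torchvision-style blend: ratio * img1 + (1 - ratio) * img2, clipped to uint8 range
    return np.clip(ratio * img1 + (1.0 - ratio) * img2, 0, 255).astype(np.uint8)

def adjust_brightness(img, factor):
    # factor 0 -> black image, 1 -> unchanged, >1 -> brighter
    return blend(img.astype(np.float32), np.zeros_like(img, dtype=np.float32), factor)

def adjust_saturation(img, factor):
    # grayscale reference via the usual ITU-R 601 luma weights
    gray = (img @ np.array([0.299, 0.587, 0.114])).astype(np.float32)[..., None]
    return blend(img.astype(np.float32), gray, factor)

img = np.full((2, 2, 3), 100, dtype=np.uint8)
assert adjust_brightness(img, 0.5).max() == 50   # halved intensity
assert (adjust_saturation(img, 1.0) == img).all()  # factor 1 is identity
```

The hue channel cannot be expressed as such a blend, which is why it needs the separate adaptation linked above.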

lengstrom commented 2 years ago

ColorJitter from PyTorch supports both grayscale and RGB images. Could you add this feature before we merge it in?

bordesf commented 2 years ago

I would love to! However, it's not obvious how to check the number of channels inside the compiled function. I tried several things but still get the following error when checking the shape: `No implementation of function Function(<function runtime_broadcast_assert_shapes at 0x7f6657fbc670>)`.

I am not super familiar with Numba, but ideally we should compile one function for RGB and another for grayscale; the problem is that the shape is not known until the compiled function is called. I looked at the other transforms in this repo, and none of them uses an if condition over the shape.

The simplest option would be to add an argument to the class `__init__` letting the user switch between RGB and grayscale mode, but that's not a great solution. Maybe the best approach would be to have access to the decoder attributes inside the transformations, but again, I am not sure that is a good idea.
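To make the `__init__`-argument workaround concrete, here is a hypothetical pure-Python sketch of the dispatch pattern (the class and kernel names are invented; in the real transform each kernel would be a separately Numba-compiled function, so no shape check is needed inside the compiled code):

```python
class ColorJitterSketch:
    """Hypothetical sketch of the __init__-flag workaround: the RGB or
    grayscale kernel is selected once at construction time, so each code
    path could be compiled separately (e.g. by Numba) without any
    channel-count branch inside the compiled function."""

    def __init__(self, grayscale=False):
        # bind the kernel once, at init time
        self.kernel = self._gray_kernel if grayscale else self._rgb_kernel

    @staticmethod
    def _rgb_kernel(img):
        # placeholder for the 3-channel jitter kernel
        return ("rgb", img)

    @staticmethod
    def _gray_kernel(img):
        # placeholder for the 1-channel jitter kernel
        return ("gray", img)

    def __call__(self, img):
        return self.kernel(img)

assert ColorJitterSketch()(0)[0] == "rgb"
assert ColorJitterSketch(grayscale=True)(0)[0] == "gray"
```

The downside, as noted above, is that the user has to declare the mode up front instead of the transform inferring it from the decoder.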

netw0rkf10w commented 1 year ago

Hi @bordesf. Could you explain how your PR compares to this one: https://github.com/libffcv/ffcv/pull/162?