libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

CenterCropRGBImageDecoder vs torchvision.transforms.CenterCrop #341

Open meghbhalerao opened 11 months ago

meghbhalerao commented 11 months ago

Dear Authors,

The docs at https://docs.ffcv.io/working_with_images.html#decoding-options state that CenterCropRGBImageDecoder mimics torchvision.transforms.CenterCrop. However, the torchvision docs (https://pytorch.org/vision/stable/generated/torchvision.transforms.CenterCrop.html) show that torchvision.transforms.CenterCrop has no ratio argument, whereas CenterCropRGBImageDecoder does, and from the descriptions of the two transforms I don't think they are exactly the same.

FFCV seems to take a center crop whose size is a ratio of the image's smallest side and then resize it to a fixed output size. The cropped region therefore differs from image to image, especially in ImageNet, where image dimensions are not fixed. torchvision.transforms.CenterCrop instead takes a center crop of a fixed absolute size; according to the torch docs, if the crop size is larger than the image along some dimension, the image is zero-padded first and then cropped.

I would be more than happy to implement a center crop that exactly matches PyTorch's and submit it as a pull request; please let me know if this is possible. Thank you for your time, and please let me know if I am missing anything.
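
To make the difference concrete, here is a minimal pure-Python sketch of the two crop geometries as I understand them from the docs. Both function names are hypothetical (neither is an FFCV or torchvision API), and the ratio-based logic is my reading of the behavior described above, not a copy of the library code. The example ratio of 0.875 (224/256) is just an illustrative choice.

```python
def ratio_center_crop_box(height, width, ratio):
    """Ratio-based center crop (assumed behavior, as described in this issue).

    Returns (top, left, crop_height, crop_width) of a square region whose
    side is `ratio` times the shorter image side. The region would then be
    resized to the fixed output size.
    """
    side = int(ratio * min(height, width))
    top = (height - side) // 2
    left = (width - side) // 2
    return top, left, side, side


def absolute_center_crop_extent(height, width, crop_h, crop_w):
    """Absolute center crop with zero padding (assumed behavior).

    Returns (kept_h, kept_w, pad_h, pad_w): how many source pixels survive
    along each dimension, and how many zero pixels must be added in total
    along each dimension to reach the requested crop size.
    """
    kept_h = min(height, crop_h)
    kept_w = min(width, crop_w)
    pad_h = max(crop_h - height, 0)
    pad_w = max(crop_w - width, 0)
    return kept_h, kept_w, pad_h, pad_w


if __name__ == "__main__":
    # Two image sizes (height, width): one larger and one smaller than 224.
    for height, width in [(375, 500), (150, 200)]:
        print((height, width),
              ratio_center_crop_box(height, width, 0.875),
              absolute_center_crop_extent(height, width, 224, 224))
```

For a 375x500 image, the ratio-based version covers a 328x328 source region (later resized to the output size), while the absolute version keeps only the central 224x224 pixels. For a 150x200 image, the ratio-based version covers 131x131 pixels, while the absolute version keeps the whole image and pads it with zeros.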

andrewilyas commented 10 months ago

Hi @meghbhalerao! You're right that they do seem to be slightly different. We could try adding an AbsoluteCenterCrop (or some other name) that mimics the torchvision one more closely. Feel free to submit a pull request if you have time to write it!
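
As a possible starting point, the absolute crop could look roughly like the sketch below. It is written in plain Python over a 2D list of pixel values so it stays self-contained; `absolute_center_crop` is a hypothetical name, not an existing FFCV or torchvision function, and the handling of odd offsets is a simplification that may not match torchvision's rounding exactly.

```python
def absolute_center_crop(image, crop_h, crop_w):
    """Center-crop a 2D list of pixels to (crop_h, crop_w).

    If the image is smaller than the crop along a dimension, it is
    zero-padded (split as evenly as possible) before cropping.
    """
    height, width = len(image), len(image[0])

    # Total padding needed along each dimension, split between the two sides.
    pad_top = max(crop_h - height, 0) // 2
    pad_bottom = max(crop_h - height, 0) - pad_top
    pad_left = max(crop_w - width, 0) // 2
    pad_right = max(crop_w - width, 0) - pad_left

    padded_width = width + pad_left + pad_right
    padded = [[0] * padded_width for _ in range(pad_top)]
    padded += [[0] * pad_left + list(row) + [0] * pad_right for row in image]
    padded += [[0] * padded_width for _ in range(pad_bottom)]

    # Take the centered window of the requested size.
    top = (len(padded) - crop_h) // 2
    left = (padded_width - crop_w) // 2
    return [row[left:left + crop_w] for row in padded[top:top + crop_h]]


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
    # Taller than the image (needs padding), narrower than the image (crops).
    print(absolute_center_crop(image, 4, 2))
```

An actual FFCV decoder would need to do the same thing on the decoded image buffer; this sketch only illustrates the pad-then-crop logic.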