mlfoundations / open_clip

An open source implementation of CLIP.

Redundancy check: Unnecessary centercrop in default validation preprocessing #782

Closed jere357 closed 6 months ago

jere357 commented 6 months ago

Code from readme.md:

model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32', pretrained='laion2b_s34b_b79k')

This preprocess object contains 5 augmentations.

Compose(
    Resize(size=224, interpolation=bicubic, max_size=None, antialias=warn)
    CenterCrop(size=(224, 224))
    <function _convert_to_rgb at 0x7f0cc7605800>
    ToTensor()
    Normalize(mean=(0.48145466, 0.4578275, 0.40821073), std=(0.26862954, 0.26130258, 0.27577711))
)

I think the CenterCrop call is redundant: the image is already resized to 224 in the previous step, so why apply CenterCrop(size=(224, 224)) to an image that is already 224x224?

The preprocess object is constructed in the image_transform() function in transform.py (around line 372); the default is resize_mode='shortest'.

Discussion welcome; I'd be happy to open a PR fixing this if you decide it's worth fixing.

rwightman commented 6 months ago

It's not redundant. When the size passed to Resize is a scalar, it resizes the smallest side to that value while preserving the aspect ratio, so the output is generally not square. The crop is needed.

jere357 commented 5 months ago

Yeah, I badly misread the torchvision Resize docs, and since I was only working with square images the code behaved exactly as I expected, which reinforced the mistaken idea. Keep up the excellent work!