rom1504 / img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
MIT License

Add support for resizing with fixed aspect ratio while fixing the largest image dimension #240

Closed gabrielilharco closed 1 year ago

rom1504 commented 1 year ago

Are you sure you want to use that?

If the resolution is 512, an image like 768x512 will be resized to 512x341, making it impossible to crop to 512x512 afterwards.

The current keep_ratio mode, by contrast, would keep it at 768x512.
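To make the arithmetic in this comment concrete, here is a minimal sketch of the two behaviours. It is not img2dataset's actual resizer code; `scaled_size` and its `fix_largest` parameter are hypothetical names used only for illustration, and rounding to the nearest pixel is an assumption.

```python
def scaled_size(width, height, resolution, fix_largest):
    """Return (new_width, new_height), preserving the aspect ratio.

    Hypothetical helper for illustration only, not the library's API.
    fix_largest=True  -> the largest side becomes `resolution`
    fix_largest=False -> the smallest side becomes `resolution`
    """
    reference = max(width, height) if fix_largest else min(width, height)
    scale = resolution / reference
    # Rounding to the nearest pixel is an assumption of this sketch.
    return round(width * scale), round(height * scale)


# Fixing the largest side: too short to crop to 512x512 afterwards.
print(scaled_size(768, 512, 512, fix_largest=True))   # (512, 341)
# Fixing the smallest side: the image keeps its size and can still be cropped.
print(scaled_size(768, 512, 512, fix_largest=False))  # (768, 512)
```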

gabrielilharco commented 1 year ago

I'm thinking this might be useful if we want to ensure our images are smaller than a certain file size. With resolution 512 and the new flag (along with --resize_only_if_bigger), we guarantee that all images are at most 512x512, whereas before you could still get something larger than that.
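A rough sketch of that guarantee, assuming "only if bigger" means the resize is skipped when the reference side is already at or below the target resolution. `resize_if_bigger` is a hypothetical helper for illustration, not the library's implementation.

```python
def resize_if_bigger(width, height, resolution, fix_largest):
    """Downscale only when the reference side exceeds `resolution`.

    Hypothetical helper for illustration only. The reference side is the
    largest side when fix_largest is True, otherwise the smallest side.
    """
    reference = max(width, height) if fix_largest else min(width, height)
    if reference <= resolution:
        return width, height  # already small enough, left untouched
    scale = resolution / reference
    return round(width * scale), round(height * scale)


sizes = [(2048, 512), (768, 512), (300, 200), (1000, 4000)]
for w, h in sizes:
    largest = resize_if_bigger(w, h, 512, fix_largest=True)
    smallest = resize_if_bigger(w, h, 512, fix_largest=False)
    print((w, h), "largest side fixed:", largest, "smallest side fixed:", smallest)
```

Under these assumptions, fixing the largest side always yields an image that fits inside a 512x512 box, while fixing the smallest side can leave something like 2048x512 untouched. Small inputs such as 300x200 pass through unchanged in both cases.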

Re. smaller images, that's still the case in other settings, right? E.g. with --resize_mode=keep_ratio and --resize_only_if_bigger there can still be images smaller than 512x512. Do the transforms in open_clip not handle this well?

rom1504 commented 1 year ago

Could you add a test there? https://github.com/rom1504/img2dataset/blob/main/tests/test_resizer.py#L16