garyfeng / google-images-download

Python Script to download hundreds of images from 'Google Images'. It is a ready-to-run code!
MIT License

New feature: drop images that are too narrow or too tall #11

Closed garyfeng closed 2 years ago

garyfeng commented 2 years ago

As a user, I want an option that limits the download to images that are neither too wide nor too tall. The option --aspect_ratio_threshold takes a float between 0 and 1. For example, --aspect_ratio_threshold 0.5 means that we only keep images where min(h,w)/max(h,w) >= 0.5, where h and w are the height and width of the image. In this case, a 240x500 image (ratio 0.48) or an 800x300 image (ratio 0.375) would be too wide/tall to keep, but a 250x500 image (ratio exactly 0.5) would be kept.
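A minimal sketch of the check described above, for illustration only. The function name `keep_image` and its signature are hypothetical and not taken from the script; the snippet only shows the min/max ratio test on a known width and height.

```python
def keep_image(width: int, height: int, threshold: float = 0.5) -> bool:
    """Return True when min(h, w) / max(h, w) >= threshold.

    Hypothetical helper: the name and signature are assumptions made
    for this example, not part of google-images-download.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # The ratio is symmetric, so portrait and landscape images
    # are treated the same way.
    return min(width, height) / max(width, height) >= threshold


# The three examples from the description, with threshold 0.5.
for w, h in [(240, 500), (800, 300), (250, 500)]:
    print(f"{w}x{h}: keep={keep_image(w, h, 0.5)}")
```

Because the comparison is `>=`, an image whose ratio equals the threshold exactly (250x500) is kept.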

Rationale: a downstream step raises an out-of-memory (OOM) error when an image is too narrow or too tall. The threshold for that step happens to be 0.5. Rather than adding an extra step to filter out these images, we can make this a download requirement.

garyfeng commented 2 years ago

This does it:

googleimagesdownload -k "swimming free style" -oc "male swimmer" -l 50 --extract_metadata --coco_metadata --type photo --size ">640*480" --format jpg --aspect_ratio_threshold 0.5 --output_directory "E:\swimpose" --image_directory "square"

and the output will contain

Deleting Image, aspect ratio 0.56 ====> 20.n2kia8vhwhbeiuyyzb0u.jpg
Deleting Image, aspect ratio 0.56 ====> 22.m1m2018c016.jpg