saftle / getty_images_thumbnail_scraper

Gettyimages Scraper perfect for AI Training
MIT License

Wrong images scraped, Getty Images probably doing it intentionally #1

Open lolxdmainkaisemaanlu opened 1 year ago

lolxdmainkaisemaanlu commented 1 year ago

I just searched for 'a woman with glasses' and it scrapes random, unrelated pictures: vases, literal glasses of water, etc.

But when I manually go to their website and search for the same phrase, I get appropriate results. I think they may be doing this intentionally to prevent scraping, but I don't know.

Janca commented 1 year ago

It may be because the URL used to index the images has a query parameter, numberofpeople=none, which asks for images with no people in them. Perhaps removing that would work for you:

f"https://www.gettyimages.com/photos/{search_term}?assettype=image&license=rf&alloweduse=availableforalluses&family=creative&phrase={search_term}&sort=mostpopular&numberofpeople=none&page={page_number}"
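As a rough sketch of that suggestion, the URL could be built from a parameter dictionary so that numberofpeople is left out unless asked for. The function name build_search_url and the include_people_filter flag are hypothetical and not part of the scraper; the parameter names and values are copied from the URL above. URL-encoding the search term is an extra assumption (the original f-string inserts it raw), and whether Getty Images returns better results without the filter is untested.

```python
from urllib.parse import quote, urlencode


def build_search_url(search_term, page_number, include_people_filter=False):
    """Build a search URL like the one quoted above (hypothetical helper).

    The numberofpeople=none filter is omitted by default, since it appears
    to exclude images that contain people.
    """
    term = search_term.strip()
    params = {
        "assettype": "image",
        "license": "rf",
        "alloweduse": "availableforalluses",
        "family": "creative",
        "phrase": term,
        "sort": "mostpopular",
    }
    if include_people_filter:
        # Original behaviour: only return images without people.
        params["numberofpeople"] = "none"
    params["page"] = page_number

    # Assumption: spaces are percent-encoded in the path and
    # plus-encoded in the query string by urlencode.
    return f"https://www.gettyimages.com/photos/{quote(term)}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("a woman with glasses", 1))
```

Passing include_people_filter=True reproduces the original query, which makes it easy to compare the two result sets for the same search term.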