hardikvasa / google-images-download

Python Script to download hundreds of images from 'Google Images'. It is a ready-to-run code!
MIT License

Google limit? #243

Open manubider opened 5 years ago

manubider commented 5 years ago

Hey! I'm getting a limit of 675 photos when downloading with different keywords. What could be causing this? Does Google have a limit around that number? I want to download at least 10,000 images to train a neural network.

YueLiao commented 5 years ago

I ran into the same problem.

ablacklama commented 5 years ago

Until there's a fix, I'd suggest trying the `offset` argument. You could even run a loop that calls the download function many times and increases the offset by an additional 675 each time. I haven't tried this, but I'd be interested to see if it works.
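A rough sketch of that loop, in case it helps anyone try it. It only builds one argument dict per batch and hands each one to whatever download callable you pass in. The helper names `offset_batches` and `run_batches` are made up for illustration, and it assumes `limit` is counted from the first search result rather than from `offset`, which I have not verified (nor the exact off-by-one behavior of `offset`). As said above, this is untested against Google itself.

```python
from typing import Callable, Dict, List


def offset_batches(keyword: str, total: int, batch_size: int = 675) -> List[Dict[str, object]]:
    """Build one argument dict per batch, each starting where the previous one ended."""
    batches = []
    for start in range(0, total, batch_size):
        end = min(start + batch_size, total)
        batches.append({
            "keywords": keyword,
            "offset": start,  # skip the results covered by earlier batches
            "limit": end,     # ASSUMPTION: limit counts from the first result, not from offset
        })
    return batches


def run_batches(download: Callable[[Dict[str, object]], object],
                batches: List[Dict[str, object]]) -> None:
    """Call the supplied download function once per batch, in order."""
    for arguments in batches:
        download(arguments)


# Possible usage with this library (not run here):
# from google_images_download import google_images_download
# downloader = google_images_download.googleimagesdownload()
# run_batches(downloader.download, offset_batches("dog husky", 10000))
```

If `limit` turns out to be relative to `offset`, the `"limit"` value would need to be `end - start` instead.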

AmineHosni commented 5 years ago

@ablacklama Unfortunately all 5000 could not be downloaded because some images were not downloadable. 814 is all we got for this search filter! It couldn't give me a single additional photo.

refandhika commented 5 years ago

I was wondering why I only got 397 photos with a limit of 500 for the keyword "dog husky", so I checked how the Google image search page behaves. It actually has a button that needs to be clicked once the page reaches 397 images. Maybe the latest script does not handle this behavior, so the crawler stops before the real image limit.

Also, since Google Images itself is not an unlimited repository of images, be mindful that you could reach the last image for those keywords before reaching your defined limit.
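To illustrate the button behavior, here is a minimal sketch of a scroll loop that clicks a "show more" button whenever one is visible and stops when the page no longer grows. It is not the script's actual code. `load_all_results` is a hypothetical helper, `driver` is assumed to be a Selenium WebDriver, and `button_css` is a selector you would have to look up on the live page yourself, since I don't know the current one.

```python
import time


def load_all_results(driver, button_css, max_rounds=50, pause=1.0):
    """Scroll to the bottom repeatedly, clicking the 'show more' button when visible.

    Returns the number of times the button was clicked.
    """
    clicks = 0
    last_height = 0
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load the next batch
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR
        buttons = [b for b in driver.find_elements("css selector", button_css)
                   if b.is_displayed()]
        if buttons:
            buttons[0].click()
            clicks += 1
            continue
        height = driver.execute_script("return document.body.scrollHeight;")
        if height == last_height:
            break  # no button and the page stopped growing: end of results
        last_height = height
    return clicks
```

Even with the button handled, the loop ends once the page stops growing, which matches the point above that the results for a keyword are finite.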

ozcelikkale commented 4 years ago

Can you solve the problem?

glenn-jocher commented 4 years ago

I believe that the search simply reaches the end of the allowable scrolling range for a Google/Bing image search. I found this typically occurs after around 500-700 images. Since the Selenium functionality of the scraper simply mimics a human using Chrome to conduct an image search manually, I don't believe a workaround is possible, other than possibly searching for slightly different search terms and then removing duplicates in post-processing.
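For the duplicate-removal step, a minimal sketch could hash each downloaded file and delete any file whose bytes match one already seen. `remove_exact_duplicates` is a name chosen for this example, and it only catches byte-identical copies; the same picture saved at a different size or compression level would need a perceptual hash instead.

```python
import hashlib
from pathlib import Path


def remove_exact_duplicates(folder):
    """Delete files under folder whose bytes match an earlier file; return removed paths."""
    seen = set()
    removed = []
    # sorted() lists every path up front, so deleting files does not disturb the walk
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            path.unlink()  # same content as a file kept earlier
            removed.append(path)
        else:
            seen.add(digest)
    return removed
```

Pointing it at the parent directory that holds the per-keyword subfolders removes duplicates across the different search terms in one pass.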
