piotrantosz / google-arts-crawler

Google Arts & Culture high quality image downloader
GNU General Public License v3.0

Not downloading images #2

Open zazencodes opened 5 years ago

zazencodes commented 5 years ago

I cannot get the script working. Here's the output I get:

=====================================
=== Google Arts & Culture crawler ===
=====================================
Provide image URL
sample url: https://artsandculture.google.com/asset/madame-moitessier/hQFUe-elM1npbw
> URL: https://artsandculture.google.com/asset/madame-moitessier/hQFUe-elM1npbw
=====================================
Provide image maximum SIZE
sample size: 12000 (recommended)
> SIZE: 12000
=====================================
> Opening website
> Downloading partial images..
> Downloaded 0 partial images
> Saving partial images as final image
FAILED
integer division or modulo by zero

As you can see, no images are downloaded. Looking at the chromedriver window, I don't see any images on the screen. Is that expected?

What version of chromedriver are you using and can you confirm this script still works for you?

zazencodes commented 5 years ago

Sometimes I have a bit more luck and instead get an error message like this:

> Downloading partial images..
FAILED
cannot identify image file 'blobs/17.jpg'

I have yet to get it working.

piotrantosz commented 5 years ago

Hey Alexander,

That's really strange. It looks like the image is not being loaded. I will try to fix it soon. We probably need one more check that the number of downloaded images is greater than 0 before saving.
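
The check described above could look roughly like the sketch below. The script's real stitching code is not shown in this thread, so `grid_dimensions`, `NoTilesDownloaded`, and the near-square grid layout are all hypothetical; the only point is that a count of 0 is rejected before any division happens.

```python
import math
from typing import Tuple


class NoTilesDownloaded(RuntimeError):
    """Raised when the download step produced no partial images (hypothetical name)."""


def grid_dimensions(tile_count: int) -> Tuple[int, int]:
    """Return (columns, rows) for a near-square grid holding tile_count tiles.

    The grid layout is an assumption made for illustration only.
    """
    if tile_count <= 0:
        # Fail early with a readable message instead of a ZeroDivisionError later.
        raise NoTilesDownloaded(
            "Downloaded 0 partial images; nothing to stitch. "
            "The page may not have finished loading."
        )
    columns = math.ceil(math.sqrt(tile_count))
    rows = math.ceil(tile_count / columns)
    return columns, rows
```

With a guard like this, the run in the report would stop with a clear message right after "Downloaded 0 partial images" rather than failing during the save step.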


piotrantosz commented 5 years ago

The 'cannot identify image file' error sometimes occurs on slow networks. I will take a closer look at it.

zazencodes commented 5 years ago

Thanks for taking a look. When I got the "cannot identify image" error, I opened the blob file (e.g. 17.jpg), and it turns out it's HTML with a 404 error:

404. That’s an error.

The requested URL was not found on this server.

It seems that some of the blobs are not loading properly.
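
One way to spot such blobs on disk is to look at the first bytes of each file. This is a minimal sketch that assumes the blobs are JPEG files, as the `.jpg` names suggest; `looks_like_jpeg` and `find_bad_blobs` are hypothetical helpers, not part of the script.

```python
from pathlib import Path

# JPEG files start with the bytes FF D8 FF; an HTML error page does not.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(path) -> bool:
    """Return True if the file starts with the JPEG signature."""
    with open(path, "rb") as handle:
        return handle.read(len(JPEG_MAGIC)) == JPEG_MAGIC


def find_bad_blobs(directory) -> list:
    """Return the sorted names of *.jpg files in directory that are not JPEG data."""
    return sorted(
        blob.name
        for blob in Path(directory).glob("*.jpg")
        if not looks_like_jpeg(blob)
    )
```

Calling `find_bad_blobs("blobs")` after a failed run would list files such as `17.jpg` that contain an error page instead of image data. A signature check only catches non-image responses; it would not detect a truncated JPEG.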

zazencodes commented 5 years ago

I got it working today :)

Maybe the issue above was due to a slow network (either on my end or on the Google Arts servers).

piotrantosz commented 5 years ago

Yes, it looks like slow network speed is breaking the blobs. We can't really check whether an image was loaded, since that is hard to parse. I guess the only option is to catch the exception and retry the download.

Thank you for your contribution.
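
The catch-and-retry idea can be sketched as a small generic helper. Nothing here is taken from the script: `retry` and its parameters are hypothetical, and the attempt count and delay are arbitrary defaults.

```python
import time


def retry(operation, attempts=3, delay_seconds=2.0, retry_on=(Exception,)):
    """Call operation() until it succeeds or attempts are exhausted.

    Re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                # Give a slow network a moment before trying again.
                time.sleep(delay_seconds)
    raise last_error
```

A download step could then be wrapped as `retry(lambda: download_blob(url, path), retry_on=(OSError,))`, where `download_blob` stands in for whatever function fetches and opens one blob. Catching `OSError` is a guess that fits Pillow's "cannot identify image file" error, which is raised as an `OSError` subclass in recent versions.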