lovasoa / dezoomify

Dezoomify is a web application to download zoomable images from museum websites, image galleries, and map viewers. Many different zoomable image technologies are supported.
https://dezoomify.ophir.dev
GNU General Public License v2.0

Issues with bulk download from Biblioteca Medicea Laurenziana #732

Open Svetlana-Yatsyk opened 1 year ago

Svetlana-Yatsyk commented 1 year ago

Hello everyone,

I am trying to download this manuscript from the Biblioteca Medicea Laurenziana. I am using tile addresses (such as http://mss.bmlonline.it/fcgi-bin/iipsrv.fcgi?Zoomify=\\TECA-NAS\Teca\Plutei\Lotto06_SecondaParte\.\P003550_28sin.09\pyr\Plut._28__sin._09_0013.tif/TileGroup0/4-0-2.jpg).

I wrote the following script to download all the pages of this manuscript:

import requests
import shutil

image_urls = [
    "http://mss.bmlonline.it/fcgi-bin/iipsrv.fcgi?Zoomify=\\TECA-NAS\Teca\Plutei\Lotto06_SecondaParte\.\P003550_28sin.09\pyr\Plut._28__sin._09_0001.tif/TileGroup0/4-0-2.jpg",
    "http://mss.bmlonline.it/fcgi-bin/iipsrv.fcgi?Zoomify=\\TECA-NAS\Teca\Plutei\Lotto06_SecondaParte\.\P003550_28sin.09\pyr\Plut._28__sin._09_0002.tif/TileGroup0/4-0-2.jpg",
    "http://mss.bmlonline.it/fcgi-bin/iipsrv.fcgi?Zoomify=\\TECA-NAS\Teca\Plutei\Lotto06_SecondaParte\.\P003550_28sin.09\pyr\Plut._28__sin._09_0003.tif/TileGroup0/4-0-2.jpg",
    # etc
]

output_folder = "/content/images"

for i, image_url in enumerate(image_urls):
    try:
        response = requests.get("https://dezoomify.ophir.dev/", params={"url": image_url})
        response_json = response.json()
        if "image" in response_json:
            download_url = response_json["image"]
            image_response = requests.get(download_url, stream=True)
            image_path = f"{output_folder}image_{i+1}.jpg"
            with open(image_path, "wb") as f:
                shutil.copyfileobj(image_response.raw, f)
            print(f"Image {i+1} downloaded successfully.")
        else:
            print(f"Error processing image {i+1}: {response_json['error']}")
    except Exception as e:
        print(f"Error downloading image {i+1}: {str(e)}")

Unfortunately, it does not work: I get the error "Error downloading image 1: [Errno Expecting value]". I cannot work out what is causing it. Could you please explain it to me?

I have very limited experience with dezoomify (as well as with Python, heh), so any comments will be highly appreciated.

lovasoa commented 1 year ago

If you want to script dezoomify, you should use dezoomify-rs.
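
For context, the "Expecting value" error in the script above most likely arises because https://dezoomify.ophir.dev/ serves an HTML page for a browser-side application rather than a JSON API, so response.json() has nothing to parse. A minimal sketch of what scripting dezoomify-rs from Python might look like is given below. It rests on several assumptions that are not confirmed in this thread: that the dezoomify-rs binary is installed and on the PATH, that the server exposes the standard Zoomify metadata file ImageProperties.xml next to the TileGroup0 folder, and that page numbers are zero-padded to four digits as in the tile URLs above. The helper names (page_url, build_command, download_pages) and the output file names are made up for illustration.

```python
import subprocess
from pathlib import Path

# Tile URL from the question, with the page number turned into a placeholder
# and "/TileGroup0/4-0-2.jpg" swapped for "/ImageProperties.xml".
# Assumption: the server exposes the standard Zoomify metadata file there.
# Raw strings keep every backslash exactly as it appears in the question.
URL_TEMPLATE = (
    r"http://mss.bmlonline.it/fcgi-bin/iipsrv.fcgi?Zoomify="
    r"\\TECA-NAS\Teca\Plutei\Lotto06_SecondaParte\.\P003550_28sin.09\pyr"
    r"\Plut._28__sin._09_{page:04d}.tif/ImageProperties.xml"
)


def page_url(page: int) -> str:
    """Return the assumed ImageProperties.xml URL for one page (hypothetical helper)."""
    return URL_TEMPLATE.format(page=page)


def build_command(url: str, outfile: Path) -> list[str]:
    """Build the dezoomify-rs invocation: pick the largest zoom level, then URL and output file."""
    return ["dezoomify-rs", "--largest", url, str(outfile)]


def download_pages(pages: range, output_folder: str) -> dict[int, int]:
    """Run dezoomify-rs once per page and return each page's exit code."""
    folder = Path(output_folder)
    folder.mkdir(parents=True, exist_ok=True)
    exit_codes = {}
    for page in pages:
        # Path joining adds the separator that a plain f-string would omit.
        outfile = folder / f"page_{page:04d}.jpg"
        result = subprocess.run(build_command(page_url(page), outfile))
        exit_codes[page] = result.returncode
        if result.returncode != 0:
            print(f"Page {page}: dezoomify-rs exited with code {result.returncode}")
    return exit_codes
```

Calling download_pages(range(1, 4), "/content/images") would attempt pages 1 to 3. If the assumed ImageProperties.xml URL is wrong for this server, dezoomify-rs should exit with a non-zero code, which the function records instead of raising.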