Open jsbien opened 4 years ago
Hello. Dezoomify, by design, downloads only a single file at a time. If you want to automate the download of a large number of images, have a look at dezoomify-rs. It is a command-line tool (also developed by me) that you can combine with other tools to build more complex workflows. For instance, you can solve your problem with the following command line:
curl "https://polona.pl/iiif/item/MTI2MzI0NjU/manifest.json" | jq -r ".items[].id" | xargs -n 1 dezoomify-rs -l
It uses curl to download the manifest, jq to extract the list of zoomable image URLs, and xargs to launch multiple instances of dezoomify-rs, each one downloading a single image.
This command line can be run in a bash shell. If you are using Windows or macOS, just run it in a terminal. On Windows, you can use WSL.
Hi, can IIIF manifests from the British Library Endangered Archives Programme, such as https://eap.bl.uk/archive-file/EAP790-14-1/manifest, be retrieved with the command line above, or do they have to be converted into a urls.txt file for batch/bash scripts, as you have mentioned here?
Hi, yes. As above, you can extract the list of URLs and then launch dezoomify-rs on each one. You just have to adapt the path inside the jq command to your case:
curl "https://eap.bl.uk/archive-file/EAP790-14-1/manifest" | jq -r '.sequences[].canvases[].images[].resource.service."@id" + "/info.json"' | xargs -n 1 dezoomify-rs -l
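If you would rather inspect the list before downloading anything, the extraction step can be separated from the download step. The following is only a sketch: the function name `list_info_urls` is made up for illustration, it assumes jq is installed, and it assumes the manifest follows the same layout as the jq path above (`sequences`, `canvases`, `images`). Other manifests may need a different path.

```shell
# Sketch: read an IIIF manifest on stdin and print one info.json URL per image.
# The function name is hypothetical; the jq path is the one used in the command above.
list_info_urls() {
    jq -r '.sequences[].canvases[].images[].resource.service."@id" + "/info.json"'
}

# Possible usage (writes the list to urls.txt instead of starting the downloads):
#   curl -s "https://eap.bl.uk/archive-file/EAP790-14-1/manifest" | list_info_urls > urls.txt
```

Saving the list to a file first makes it easy to check how many pages were found and to restart from a given line if something fails halfway.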
And if you want to avoid overwhelming their server with too many requests, you can add the following parameters:
dezoomify-rs -l --parallelism 1 --timeout 60s --retry-delay 10s
This will make the download slower, but more reliable.
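For a very large set of pages, one possible way to stay gentle on the server is to go through the URLs strictly one after another, with a short pause between images. The sketch below assumes the URLs are already stored one per line in a text file. The function name `download_politely` and the `DEZOOMIFY` and `PAUSE` variables are illustrative and not part of dezoomify-rs; the flags passed to the tool are the ones listed above.

```shell
# Sketch: download every URL listed in a text file, one image at a time.
# DEZOOMIFY and PAUSE are hypothetical knobs introduced for this example.
download_politely() {
    local list="$1"
    local tool="${DEZOOMIFY:-dezoomify-rs}"   # set DEZOOMIFY=echo for a dry run
    local pause="${PAUSE:-2}"                 # seconds to wait between images
    local failed=0 url

    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue             # skip blank lines
        # stdin is detached so the tool cannot consume the rest of the list
        if ! "$tool" -l --parallelism 1 --timeout 60s --retry-delay 10s "$url" < /dev/null; then
            echo "failed: $url" >&2
            failed=$((failed + 1))
        fi
        sleep "$pause"
    done < "$list"

    # Succeed only if every download succeeded
    [ "$failed" -eq 0 ]
}

# Possible usage:
#   download_politely urls.txt 2> failed.log
```

URLs that could not be downloaded are reported on standard error, so redirecting it to a file gives a list of pages to retry later.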
Thanks a lot for the code.
I finally tried your suggestions, thank you very much! I will make some comments on the dezoomify-rs site.
Site name and description
Polish national digital library: https://polona.pl/
Example URLs
A sample manifest: https://polona.pl/iiif/item/MTI2MzI0NjU/manifest.json
Current error message
There is no problem with downloading a single page, but my goal is to download a multivolume dictionary of about 5 thousand pages. I had a look at https://github.com/intranda/goobi-iiif-downloader but have no idea how to use it.