lovasoa / dezoomify

Dezoomify is a web application to download zoomable images from museum websites, image galleries, and map viewers. Many different zoomable image technologies are supported.
https://dezoomify.ophir.dev
GNU General Public License v2.0
675 stars 75 forks

support for zoomify images without a metadata file (philageohistory.org) #321

Open mikeczabator opened 5 years ago

mikeczabator commented 5 years ago

Site name and description

philageohistory.org

The picture is here: https://www.philageohistory.org/rdic-images/view-image.cfm/ARW1873.ChesterCounty.051.Plate47_UpperUwchlanAndUwchlan

Example URLs

The picture is here. I cannot find any underlying metadata (.xml) file for it in the Network tab in Chrome.
https://www.philageohistory.org/rdic-images/view-image.cfm/ARW1873.ChesterCounty.051.Plate47_UpperUwchlanAndUwchlan

The first image in the series is here: https://www.philageohistory.org/rdic-images/common/get-tile.cfm/ARW1873.ChesterCounty.004.Index/TileGroup0/5-0-0.jpg , but I cannot get it to work.

mikeczabator commented 5 years ago

Note: I tried this with both the web UI and the Python tool, with no luck with either. The Python tool is my preferred method.

lovasoa commented 5 years ago

It uses a Zoomify viewer without a metadata file (the metadata is included directly in the viewer webpage). Since this image is small enough, you can use the generic dezoomer for it, using the following URL:

https://www.philageohistory.org/rdic-images/common/get-tile.cfm/ARW1873.ChesterCounty.051.Plate47_UpperUwchlanAndUwchlan/TileGroup2/5-{{X}}-{{Y}}.jpg
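To illustrate what the template above stands for, here is a minimal Python sketch that substitutes tile coordinates for `{{X}}` and `{{Y}}`. It is not dezoomify's own code: the function name `expand_template` and the 2x2 grid are made up for the example, and the real number of columns and rows has to come from elsewhere.

```python
def expand_template(template, cols, rows):
    """Return one URL per tile, row by row, replacing {{X}} and {{Y}}."""
    return [
        template.replace("{{X}}", str(x)).replace("{{Y}}", str(y))
        for y in range(rows)
        for x in range(cols)
    ]


TEMPLATE = (
    "https://www.philageohistory.org/rdic-images/common/get-tile.cfm/"
    "ARW1873.ChesterCounty.051.Plate47_UpperUwchlanAndUwchlan/"
    "TileGroup2/5-{{X}}-{{Y}}.jpg"
)

# cols=2, rows=2 is a placeholder grid, not the real size of this image.
urls = expand_template(TEMPLATE, cols=2, rows=2)
for u in urls:
    print(u)
```

The first URL printed ends in `5-0-0.jpg` and the last in `5-1-1.jpg`; X is the column and Y is the row.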

If you prefer command-line tools, there is also a generic dezoomer in dezoomify-rs.


However, this is not guaranteed to work for all images from this site, because of how Zoomify organizes tiles: all tiles for a given zoom level are not necessarily in the same TileGroup. So I am going to leave this issue open.
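For reference, here is a rough Python sketch of how the TileGroup of a tile is usually derived in the standard Zoomify layout, as far as I understand it: tiles are numbered from the smallest zoom level upward, row by row, and every 256 consecutive tiles share one TileGroup folder. The 256-pixel tile size, the round-up halving rule, and the 5000x4000 image are assumptions for the example, and whether this site's server follows the layout strictly has not been checked.

```python
import math

TILE_SIZE = 256        # assumed tile edge in pixels
TILES_PER_GROUP = 256  # tiles per TileGroup folder in the standard layout


def level_grids(width, height, tile_size=TILE_SIZE):
    """Return [(cols, rows), ...] ordered from the smallest zoom level up."""
    grids = []
    w, h = width, height
    while True:
        grids.append((math.ceil(w / tile_size), math.ceil(h / tile_size)))
        if w <= tile_size and h <= tile_size:
            break
        # Assumption: each lower level halves the size, rounding up.
        w, h = math.ceil(w / 2), math.ceil(h / 2)
    grids.reverse()
    return grids


def tile_group(width, height, level, x, y):
    """Return the TileGroup number for tile (x, y) at the given zoom level."""
    grids = level_grids(width, height)
    cols, _ = grids[level]
    # Tiles of all smaller levels come first, then this level row by row.
    index = sum(c * r for c, r in grids[:level]) + y * cols + x
    return index // TILES_PER_GROUP


# Hypothetical 5000x4000 image: six levels, the largest is level 5.
print(tile_group(5000, 4000, 5, 0, 0))    # 0
print(tile_group(5000, 4000, 5, 19, 15))  # 1
```

In this example the tiles of level 5 are split between TileGroup0 and TileGroup1, which is why a single URL template with a fixed TileGroup cannot be relied on in general.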

mikeczabator commented 5 years ago

thank you! just curious, how does a site like this work without the *.xml file ?

lovasoa commented 5 years ago

They include the metadata directly in the source code of the page. If you look at the page source, you'll find:

var imgWidth = 6212;
var imgHeight = 6248;
var url = '/rdic-images/common/get-tile.cfm/ARW1873.ChesterCounty.051.Plate47_UpperUwchlanAndUwchlan/';

They then use OpenLayers to render the tiles.
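If you want to read those values from Python, something along these lines could work. It is only a sketch: the regular expressions assume the variables are written exactly as in the three lines quoted above, `parse_viewer_vars` is a name made up for the example, and the 256-pixel tile size is an assumption about this viewer.

```python
import math
import re

# The three lines quoted above, as they appear in the page source.
PAGE_SNIPPET = """
var imgWidth = 6212;
var imgHeight = 6248;
var url = '/rdic-images/common/get-tile.cfm/ARW1873.ChesterCounty.051.Plate47_UpperUwchlanAndUwchlan/';
"""


def parse_viewer_vars(html):
    """Extract (width, height, tile base path) from the viewer page source.

    Raises AttributeError if one of the variables is missing.
    """
    width = int(re.search(r"var\s+imgWidth\s*=\s*(\d+)", html).group(1))
    height = int(re.search(r"var\s+imgHeight\s*=\s*(\d+)", html).group(1))
    url = re.search(r"var\s+url\s*=\s*'([^']*)'", html).group(1)
    return width, height, url


width, height, base = parse_viewer_vars(PAGE_SNIPPET)

# Assuming 256-pixel tiles, this is the grid at the largest zoom level.
cols = math.ceil(width / 256)
rows = math.ceil(height / 256)
print(width, height, cols, rows)  # 6212 6248 25 25
print(base)
```

In practice you would pass the downloaded HTML of the view-image page to `parse_viewer_vars` instead of the hard-coded snippet.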

mikeczabator commented 5 years ago

got it. thanks!