internetarchive / iiif

The official Internet Archive IIIF service
GNU General Public License v3.0

Update v2 code to use Cantaloupe and BookReader #16

Open mprove opened 11 months ago

mprove commented 11 months ago

For the item https://archive.org/details/opencontext-41-nippur-excavation-units

Same for https://archive.org/details/dr_sheet-1-2-bildplanskizze-der-stadt-kiew-120-000-10839003
ok: https://iiif.archive.org/iiif/dr_sheet-1-2-bildplanskizze-der-stadt-kiew-120-000-10839003/manifest.json
not found: https://iiif.archive.org/iiif/2/dr_sheet-1-2-bildplanskizze-der-stadt-kiew-120-000-10839003/manifest.json

glenrobson commented 11 months ago

Thanks @mprove. We will look into this, but it's possibly to do with our routing in the new version.

digitaldogsbody commented 10 months ago

The first of these examples loads fine for me when running the code locally, but not on the deployed main branch code on ux-fnf-misc, so I'm at a bit of a loss. We might need the error logs from that service to work out what is going on.

The second one is erroring because the image is too large and PIL is raising an error about a potential decompression bomb. It seems like the old v2 code still has a call that ends up downloading and parsing the image in order to get the dimensions: https://github.com/internetarchive/iiif/blob/main/iiify/resolver.py#L208

This eventually leads to PIL.Image.DecompressionBombError: Image size (596826944 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
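For context on the numbers in that message, here is a minimal sketch of the size check, assuming Pillow's default limit is unchanged. It mirrors the default `MAX_IMAGE_PIXELS` arithmetic and the warn/raise thresholds without importing Pillow. The `classify` helper is hypothetical and only illustrates the comparison.

```python
# Pillow's default pixel limit (Image.MAX_IMAGE_PIXELS), reproduced here
# so the sketch runs without Pillow installed.
MAX_IMAGE_PIXELS = int(1024 * 1024 * 1024 // 4 // 3)


def classify(pixels, limit=MAX_IMAGE_PIXELS):
    """Hypothetical helper mirroring Pillow's decompression bomb check.

    Above twice the limit Pillow raises DecompressionBombError;
    above the limit it only emits DecompressionBombWarning.
    """
    if pixels > 2 * limit:
        return "error"
    if pixels > limit:
        return "warning"
    return "ok"


# The image from the traceback is well past the hard threshold.
print(2 * MAX_IMAGE_PIXELS)
print(classify(596826944))
```

The hard threshold is twice the default limit, which is where the 178956970 figure in the message comes from.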

I think it's probably out of scope for us to fix this right now as our work is focused on the v3 APIs, but if we have some time later we could look at replacing that web.info() call either with a call to Cantaloupe, or with using the dimensions from the BookReader API instead.
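A rough sketch of the Cantaloupe option, not the project's actual code: an IIIF Image API `info.json` response carries `width` and `height`, so the dimensions can be read from that small JSON document instead of decoding the image with PIL. The base URL and the function names below are assumptions made for illustration; the real endpoint and identifier scheme would need to come from the deployment.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical Cantaloupe base URL; not taken from the thread.
CANTALOUPE_BASE = "https://cantaloupe.example.org/iiif/2"


def info_url(identifier, base=CANTALOUPE_BASE):
    """Build the IIIF Image API info.json URL for an identifier."""
    # IIIF identifiers must be URL-encoded, including any slashes.
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


def dimensions_from_info(info):
    """Extract (width, height) from a parsed info.json document."""
    return int(info["width"]), int(info["height"])


def fetch_dimensions(identifier, base=CANTALOUPE_BASE, timeout=10):
    """Fetch info.json and return (width, height) without decoding pixels."""
    with urlopen(info_url(identifier, base), timeout=timeout) as resp:
        return dimensions_from_info(json.load(resp))
```

Keeping the parsing separate from the network call means `dimensions_from_info` can be exercised with a canned response, and the same shape would work if the dimensions came from the BookReader API instead.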

mprove commented 10 months ago

ok. Prio & severity should be lowest. Won't fix is fine with me as well.

glenrobson commented 6 months ago

Call Cantaloupe or BookReader to get dimensions rather than downloading the images.