internetarchive / iiif

The official Internet Archive IIIF service
GNU General Public License v3.0

invalid IA pages are part of the IIIF (v2) manifests #41

Closed: mprove closed this issue 1 year ago

mprove commented 1 year ago

IIIF viewers display pages that are skipped by IA's viewer, e.g. the first empty page before the book cover and the error scans with the hands in the following case:

compare https://archive.org/details/dieentstehungder00wege/mode/thumb with https://mprove.de/chrono?d=1&s=1&iiif-content=https://iiif.archive.org/iiif/2/dieentstehungder00wege/manifest.json

RFE: Either mark invalid pages as such in the manifest, omit them from the manifest entirely, or add real page numbers to the manifest pages.


hadro commented 1 year ago

Hi @mprove, thanks for reporting. We'll take a look, but realistically this is a case where we highly recommend using the version 3 manifests rather than version 2. The logic for version 3 uses a different feed from the IA book reader, which should take care of the issues you mention. E.g., if you look at the v3 manifest in Mirador, it mirrors the display in IA: https://projectmirador.org/embed/?iiif-content=https%3A%2F%2Fiiif.archive.org%2Fiiif%2F2%2Fdieentstehungder00wege%2Fmanifest.json

digitaldogsbody commented 1 year ago

Updating the old v2 code was out of scope for this initial partnership. You can see here that the code for generating the items uses, as Josh said, very simple logic that just iterates based on the number of pages reported: https://github.com/internetarchive/iiif/blob/main/iiify/resolver.py#L279-L287

This also has the knock-on effect that if several "incorrect" pages are excluded in the bookreader, the same number of pages will be missing from the end of the v2 manifest: it simply contains the first X images in the item, where X is the page count reported by the bookreader API (which correctly excludes the blank/error pages).
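
To illustrate the effect described above, here is a minimal sketch. It is not the code from `resolver.py`; the function names, the leaf labels, and the `skipped` set are all hypothetical, chosen only to show how a count-based loop differs from filtering out the skipped leaves.

```python
from typing import List, Set


def count_based_pages(all_leaves: List[str], reported_page_count: int) -> List[str]:
    """Hypothetical v2-style logic: take the first N leaves, where N is the
    page count reported by the reader (which already excludes skipped leaves)."""
    return [all_leaves[i] for i in range(reported_page_count)]


def filtered_pages(all_leaves: List[str], skipped: Set[int]) -> List[str]:
    """Hypothetical alternative: drop the leaves the reader skips, keep the rest."""
    return [leaf for i, leaf in enumerate(all_leaves) if i not in skipped]


# Made-up item: a blank leaf before the cover and one bad scan with hands.
leaves = ["blank", "cover", "hand-scan", "page-1", "page-2", "page-3"]
skipped = {0, 2}                        # indices the reader would not display
reported = len(leaves) - len(skipped)   # 4 pages reported

naive = count_based_pages(leaves, reported)
filtered = filtered_pages(leaves, skipped)

print(naive)     # includes the invalid leaves, loses the last two real pages
print(filtered)  # matches what the reader would display
```

In this sketch both lists have the same length, so the number of canvases alone would not reveal the problem: the count-based list contains the two invalid leaves and drops the last two real pages.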

mprove commented 1 year ago

I've learned something. Thx. "Won't fix" is fine with me.

hadro commented 1 year ago

Thanks @mprove, will do

mprove commented 1 year ago

Confirmed: v3 does not show invalid scans. See https://mprove.de/chrono?d=1&s=1&iiif-content=https://iiif.archive.org/iiif/dieentstehungder00wege/manifest.json