internetarchive / iiif

The official Internet Archive IIIF service
GNU General Public License v3.0

Complex books that can contain other books #12

Open glenrobson opened 10 months ago

glenrobson commented 10 months ago

As noted on the IIIF call this currently isn't working:

https://archive.org/details/BollettiniEcn/

Seems to only show the first item:

https://iiif.archive.org/iiif/BollettiniEcn/manifest.json

digitaldogsbody commented 10 months ago

Complex item that also has things in subdirectories: https://iiif.archive.org/iiif/mareful/manifest.json

Mike to write the subdirectory thing up in more detail here

digitaldogsbody commented 8 months ago

So for the item above, there are multiple jp2 zipfiles both at the top level, and also in a subdirectory:

Root level: https://archive.org/download/mareful/Maariful-Quran-01%28Almodina.com%29_jp2.zip/

Subdir (new): https://archive.org/download/mareful/new/mareful-quran-01-new-edition_jp2.zip/

Cantaloupe is able to provide images (and info.json etc.) for all of these; you just need to encode the directory path after the item name:

Root level: https://iiif.archive.org/image/iiif/3/mareful%2fMaariful-Quran-03(Almodina.com)_jp2.zip%2FMaariful-Quran-03(Almodina.com)_jp2%2fMaariful-Quran-03(Almodina.com)_0013.jp2/info.json

Subdir: https://iiif.archive.org/image/iiif/3/mareful%2fnew%2Fmareful-quran-08-new-edition_jp2.zip%2Fmareful-quran-08-new-edition_jp2%2Fmareful-quran-08-new-edition_0921.jp2/info.json
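As a rough illustration of the encoding described above, here is a minimal Python sketch. The function names are hypothetical and not taken from the service's code; the base URL is the one used in the links above. Parentheses are left unescaped only to match those links, and the hex digits come out uppercase (`%2F`), which is equivalent to the lowercase `%2f` seen in some of them.

```python
from urllib.parse import quote

# Base of the image service, as seen in the example links in this thread.
IMAGE_BASE = "https://iiif.archive.org/image/iiif/3"


def image_identifier(item, file_path):
    """Percent-encode "<item>/<path inside the item>" into one identifier.

    Every "/" becomes %2F so the whole path is a single URL segment.
    Parentheses are kept as-is to mirror the example links (an assumption
    about what the server accepts, not a documented requirement).
    """
    return quote(f"{item}/{file_path}", safe="()")


def image_info_url(item, file_path):
    """Build the info.json URL for one JP2 inside a zip file."""
    return f"{IMAGE_BASE}/{image_identifier(item, file_path)}/info.json"
```

For the subdirectory example, `image_info_url("mareful", "new/mareful-quran-08-new-edition_jp2.zip/mareful-quran-08-new-edition_jp2/mareful-quran-08-new-edition_0921.jp2")` should yield the same link as above, apart from the case of the hex digits.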

So in theory it's not a problem for us to get these items into a manifest, but probably a question of approach. My initial thinking is that the logic for triggering is something like: if mediatype == 'texts' and count_of_originals > 1, and that we could consider returning a Collection instead of a manifest, with one manifest for each item. However, this means we will need to adapt the manifest generation code to handle multiple manifests per item, which might be awkward.

The other option could be to have the same triggering logic, and instead produce a single manifest with every page (for this item it will be monstrously big!!) and add Range objects for each individual item within.

Thoughts very much welcome!

glenrobson commented 5 months ago

This seems like a collection. We will need to sort something out, as the URL will be /manifest.json; maybe we could forward to collection.json.

glenrobson commented 5 months ago

Maybe add it to isCollection code.

glenrobson commented 4 months ago

Discussed this today, and Sara and Ben convinced Glen that it should be a single manifest with a structure, because there is no metadata for each individual item.

The test for this use case is whether the item has multiple image zip files. If it does, include all the images in the manifest, with a range per zip file.
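A minimal sketch of that test in Python, assuming the item's files are available as a plain list of paths and that image zip files can be recognised by the `_jp2.zip` suffix seen in the mareful example. Both are assumptions; the function names are hypothetical and the real file metadata may need a different check.

```python
def image_zip_files(file_names):
    """Return the paths that look like JP2 image bundles.

    Matching on the "_jp2.zip" suffix is an assumption based on the
    example item in this thread.
    """
    return [name for name in file_names if name.endswith("_jp2.zip")]


def has_multiple_volumes(mediatype, file_names):
    """True when a texts item has more than one image zip file."""
    return mediatype == "texts" and len(image_zip_files(file_names)) > 1
```

Files in subdirectories (for example `new/mareful-quran-01-new-edition_jp2.zip`) are counted as well, since only the suffix is checked.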

digitaldogsbody commented 4 months ago

I think we can duplicate the metadata. The example item I am using (mareful) has 14,611 images across the various JP2 zipfiles. Do we really want to make a manifest with that many canvases? Will it work in any viewers without exploding?

glenrobson commented 4 months ago

Due to the scale, we are back to collections again. Copy all the metadata to each manifest and use the filename as the label. Work on the seeAlso and rendering so that they only link to the relevant files.
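A rough sketch of what that could look like, written as plain Python dicts in IIIF Presentation 3 shape. The per-volume manifest URL pattern, the `collection.json` location and the function names are all hypothetical, the metadata is treated as an opaque list copied from the item, and canvases, seeAlso and rendering are left out.

```python
import copy
from urllib.parse import quote

# Base of the presentation service, as seen in the manifest links in this thread.
BASE = "https://iiif.archive.org/iiif"


def volume_manifest(identifier, zip_name, item_metadata):
    """Stub manifest for one image zip file (canvases omitted).

    The URL pattern is hypothetical; the zip path is percent-encoded so
    that files in subdirectories stay within a single URL segment.
    """
    return {
        "id": f"{BASE}/{identifier}/{quote(zip_name, safe='')}/manifest.json",
        "type": "Manifest",
        "label": {"none": [zip_name]},  # filename used as the label
        "metadata": copy.deepcopy(item_metadata),  # item metadata copied to each volume
    }


def build_collection(identifier, volumes):
    """Collection that references each volume manifest by id, type and label."""
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{BASE}/{identifier}/collection.json",  # hypothetical location
        "type": "Collection",
        "label": {"none": [identifier]},
        "items": [
            {"id": v["id"], "type": v["type"], "label": v["label"]}
            for v in volumes
        ],
    }
```

Building one `volume_manifest` per image zip file and passing the list to `build_collection` gives one entry per file, which keeps each manifest at the size of a single volume rather than all 14,611 images.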

glenrobson commented 4 months ago

Multi Volume: https://iiif.io/api/cookbook/recipe/0030-multi-volume/

saracarl commented 3 months ago

We should also test this with the following item (of multiple pdfs and jpgs): https://archive.org/details/st-anthony-relics-01/