IUBLibTech / newton_chymistry

New version of 'The Chymistry of Isaac Newton', using XProc pipelines to generate a website based on TEI XML encodings of Newton's alchemical manuscripts, and Apache Solr as a search engine.

Display page images #19

Closed: Conal-Tuohy closed this issue 5 years ago

Conal-Tuohy commented 5 years ago

Diplomatic and Normalized mss need to include a link to the page images (which launches the METS Navigator application). If possible, suppress the link to "Page Images" in Introductions, since those are born-digital.

Conal-Tuohy commented 5 years ago

I think we need to consider some options, here.

The METS viewer is one way to bring the images into play, but as it stands it's not particularly responsive or mobile-friendly (especially due to the popup), and it's missing branding and contextual links to the Chymistry site. However, something similar (using similar Javascript and similar interactivity) could be reimplemented.

Another option would be to load the page images into a side panel, like in this example (which actually has two such panels because the TEI transcript is based on a typescript of a manuscript, both of which are imaged). http://bates.org.au/text/52-061T.html

Another option could be to deploy an IIIF viewer.

Conal-Tuohy commented 5 years ago

At yesterday's meeting there seemed to be general agreement that adopting the IIIF framework, and being able to use an IIIF viewer (specifically UniversalViewer), would be a good solution, if it could be made to work.

The major issue is that the available IIIF viewers are very image-centric: though they are great for navigating and viewing sequences of page images, and even textual annotations on those images, they are weak when it comes to also accommodating a rich textual representation of the source texts, such as we have with the Chymistry TEI. NB the problem is not the IIIF conceptual model or the IIIF Presentation API itself (which does allow images to be associated with "otherContent" such as transcriptions), but the current state of the art in the viewers themselves. This is unfortunate, since it makes it unlikely that we could simply implement IIIF back end services and deploy an "out of the box" IIIF viewer on top of them; instead we would be forced to do some modification work on an IIIF viewer, or some integration work at the UI level.

Reading around on the web yesterday I found a great deal of interest in the idea of presenting transcriptions in the context of IIIF, but little in the way of actual UIs which offered such a feature.

I will make some notes here about what I found and what I think might be the best path forward.

Conal-Tuohy commented 5 years ago

The UniversalViewer code repository has this open issue https://github.com/UniversalViewer/user-stories/issues/26

Conal-Tuohy commented 5 years ago

The 4Science project has a "DSpace-GLAM" product, which is a fork of UniversalViewer modified to add an "OCR" panel to the right-hand side of the window.

https://dspace-glam.4science.it/explore?bitstream_id=1877&handle=1234/11&provider=iiif-image#?c=0&m=0&s=0&cv=4&xywh=-3299%2C-392%2C16846%2C7798

The OCR-generated text is synchronized with the page image navigation; selecting a different image causes the panel to display the textual content of that image.

At the IIIF API level, the OCR text is provided by the IIIF server in the form of a multitude of small (word-level) snippets which are each "annotations" of a particular region of the page. Typically an IIIF viewer would display those annotations as rectangular highlights on the image, and allow them to be selected and viewed individually, but here the "OCR" panel aggregates those snippets and displays them as a single block of text. I don't think this is an appropriate way to model the Chymistry texts, though, since it would not allow the text to be viewed as a unitary (scrolling) sequence, which I think is desirable.
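For concreteness, a single word-level annotation in such a list typically looks something like the following (the canvas URI, coordinates and text here are invented for illustration): the "on" property ties the snippet of text to a rectangular region of the canvas, and the "OCR" panel simply concatenates the "chars" values of all the annotations on the current canvas.

{
  "@type": "oa:Annotation",
  "motivation": "sc:painting",
  "resource": {
    "@type": "cnt:ContentAsText",
    "format": "text/plain",
    "chars": "mercurius"
  },
  "on": "https://example.org/iiif/canvas/p1#xywh=100,200,150,40"
}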

A positive feature of 4Science's approach is that their UI is integrated with the viewer, with a consistent user experience; and being part of the viewer, it remains visible even when the viewer is switched into "full-screen" mode.

The codebase is here: https://github.com/4Science/universalviewer/tree/ocr

Conal-Tuohy commented 5 years ago

The Wellcome Library has an example of UniversalViewer driving the display of a textual transcription:

https://wellcomelibrary.org/moh/report/b18250464/4#?c=0&m=0&s=0&cv=4

In this example, the URL of the page includes (within the fragment identifier) a page number, and the web page itself includes the text of that page. That text is inserted into the page on the server side, possibly by aggregating word-level annotations retrieved from an IIIF server, as in the DSpace-GLAM viewer.

To enable the transcript text to update without having to refresh the entire page in the browser, a separate Javascript module is embedded in the page along with the UV, and the module installs an event listener which listens for uv.onCanvasIndexChanged events emitted by the UV. In response to that event, it updates the fragment identifier of the page's URL to reflect the new page number, and then loads that HTML page again, extracting the transcript portion and inserting it into the current page.
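A minimal sketch of such a listener, as I understand the description (the event registration, URL pattern and element id below are my assumptions, not Wellcome's actual code):

// Sketch only: assumes the embedded UV exposes an on() method for the
// canvasIndexChanged event, and that the transcript lives in a #transcript element.
uv.on('canvasIndexChanged', function (canvasIndex) {
  // reflect the newly selected image in the page's fragment identifier
  window.location.hash = '#?c=0&m=0&s=0&cv=' + canvasIndex;
  // request the HTML page for the new page number and extract just the transcript
  fetch('/moh/report/b18250464/' + canvasIndex)
    .then(function (response) { return response.text(); })
    .then(function (html) {
      var remote = new DOMParser().parseFromString(html, 'text/html');
      var transcript = remote.querySelector('#transcript');
      document.querySelector('#transcript').innerHTML = transcript.innerHTML;
    });
});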

The use of an event listener means they haven't had to modify UV at all, which is a plus.

The roundabout procedure of extracting the text from an HTML page could be replaced with a procedure for reading the transcript from the IIIF manifest.
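That might look roughly like this, assuming each Canvas carried an otherContent reference to an annotation list whose single annotation points at the transcription (see below); none of this reflects existing code:

// Sketch: resolve the current canvas in a IIIF Presentation 2 manifest and
// follow its otherContent reference to fetch the transcription directly.
function loadTranscript(manifestUrl, canvasIndex) {
  return fetch(manifestUrl)
    .then(function (response) { return response.json(); })
    .then(function (manifest) {
      var canvas = manifest.sequences[0].canvases[canvasIndex];
      var annotationList = canvas.otherContent[0]; // assumed to be present
      return fetch(annotationList['@id']);
    })
    .then(function (response) { return response.json(); })
    .then(function (list) {
      // assume a single annotation whose resource is the transcription
      return fetch(list.resources[0].resource['@id']);
    })
    .then(function (response) { return response.text(); });
}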

Conal-Tuohy commented 5 years ago

So one distinct possibility would be to have an HTML page which embeds a UV and the transcription, side by side. An event listener would scroll the transcription in response to UV image selection. Potentially another listener listening to scrolling events could also trigger the UV to load the appropriate image. Switching the UV to full screen would hide the transcription.
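In outline, the wiring for such a page might look like the sketch below. This is speculative: the event name, the setCanvasIndex() call, and the per-page element ids are all assumptions rather than actual UV API, and a real implementation would need to guard against the two listeners triggering each other.

// Image -> text: when the UV selects a new canvas, scroll the transcription
// so that the corresponding page division is in view.
uv.on('canvasIndexChanged', function (canvasIndex) {
  var pageDiv = document.getElementById('page-' + canvasIndex); // hypothetical ids
  if (pageDiv) {
    pageDiv.scrollIntoView({ behavior: 'smooth' });
  }
});

// Text -> image: when the reader scrolls the transcription, ask the UV to
// display the image for whichever page division is currently at the top.
var transcription = document.getElementById('transcription');
transcription.addEventListener('scroll', function () {
  var pages = transcription.querySelectorAll('[id^="page-"]');
  for (var i = 0; i < pages.length; i++) {
    if (pages[i].getBoundingClientRect().bottom > transcription.getBoundingClientRect().top) {
      uv.setCanvasIndex(i); // hypothetical method
      break;
    }
  }
});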

Conal-Tuohy commented 5 years ago

Via the mailing list, Michelle (with Randall's agreement) says:

For now, I think we should proceed by rendering the TEI as HTML and render the mss images using the Universal Viewer (alongside the TEI/HTML).

So I think we are all on the same page.

Conal-Tuohy commented 5 years ago

Michelle also reports another example of a prototype viewer, from Raffaele Viglianti:

http://umd-mith.github.io/sga-lab/e11/#6

I like this, though I have a couple of quibbles:

Firstly, the page navigation controls are hidden when you put the IIIF viewer into its full-screen mode. If we could use the page navigator in the IIIF viewer itself, we would avoid this UX issue.

Secondly, I don't think the reference to the TEI transcription is modelled quite correctly in the IIIF manifest. For example, here is one IIIF Canvas from their manifest at https://raw.githubusercontent.com/umd-mith/sga-lab/gh-pages/e11/manifests/ox-ms_shelley_adds_e11_vs.json:

{
  "@id": "https://raw.githubusercontent.com/vscrimer/sga/master/data/tei/ox/ox-ms_shelley_adds_e11/ox-ms_shelley_adds_e11-0008.xml",
  "@type": "sc:Canvas",
  "label": "6",
  "height": 7110,
  "width": 5418,
  "thumbnail": "http://s3.amazonaws.com/sga-tiles/ox/ms_shelley_adds_e11/ms_shelley_adds_e11-0008/full/159,/0/default.jpg",
  "images": [
    {
      "@type": "oa:Annotation",
      "motivation": "sc:painting",
      "resource": {
        "@id": "http://s3.amazonaws.com/sga-tiles/ox/ms_shelley_adds_e11/ms_shelley_adds_e11-0008",
        "@type": "dctypes:Image",
        "format": "image/jpg",
        "height": 7110,
        "width": 5418,
        "service": {
          "@id": "http://s3.amazonaws.com/sga-tiles/ox/ms_shelley_adds_e11/ms_shelley_adds_e11-0008",
          "@context": "http://iiif.io/api/image/2/context.json",
          "profile": "http://iiif.io/api/image/2/profiles/level2.json"
        }
      }
    }
  ],
  "otherContent": []
}

They have used the identifier (the @id property) of the Canvas as a place to record the URL of the TEI itself: https://raw.githubusercontent.com/vscrimer/sga/master/data/tei/ox/ox-ms_shelley_adds_e11/ox-ms_shelley_adds_e11-0008.xml. However, that @id should be the URI of the Canvas resource, and resolving it should return the JSON given above rather than the (semantically related) TEI surface.

I think the link to the transcription should instead be encoded as an "other content" resource, within the otherContent property at the end of the Canvas JSON (whose value is an empty array in the existing prototype). There's an example in the IIIF Presentation API spec which actually points to a TEI XML resource, and I think we should follow it.
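Concretely, the Canvas would keep its own URI as its @id, and the otherContent property would reference an annotation list; the list would then associate the TEI with the Canvas, along these lines (the canvas and list URIs are invented for illustration, but the pattern of a painting annotation whose resource is the TEI file follows the spec's example):

"otherContent": [
  {
    "@id": "https://example.org/e11/list/ox-ms_shelley_adds_e11-0008",
    "@type": "sc:AnnotationList"
  }
]

with the referenced annotation list being something like:

{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "https://example.org/e11/list/ox-ms_shelley_adds_e11-0008",
  "@type": "sc:AnnotationList",
  "resources": [
    {
      "@type": "oa:Annotation",
      "motivation": "sc:painting",
      "resource": {
        "@id": "https://raw.githubusercontent.com/vscrimer/sga/master/data/tei/ox/ox-ms_shelley_adds_e11/ox-ms_shelley_adds_e11-0008.xml",
        "@type": "dctypes:Text",
        "format": "application/tei+xml"
      },
      "on": "https://example.org/e11/canvas/ox-ms_shelley_adds_e11-0008"
    }
  ]
}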

Conal-Tuohy commented 5 years ago

The UV widget is now embedded.

There'll need to be a minor revision once the new image back end becomes available, but that is another issue.