Added transcription helpers for extracting text from a canvas

stephenwf commented 4 months ago

Adds a transcription helper.

It will find the following kinds of transcription:

VTT as rendering on canvas
Embedded Annotation page
External Annotation page
ALTO annotations (FUTURE)

Cookbook:

Plaintext rendering on canvas:

"rendering": [
  {
    "id": "https://fixtures.iiif.io/video/indiana/volleyball/volleyball.txt",
    "type": "Text",
    "label": {
      "en": [
        "Transcript"
      ]
    },
    "format": "text/plain"
  }
]

VTT annotation body on AV canvases:

"annotations": [
  {
    "id": "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas/page2",
    "type": "AnnotationPage",
    "items": [
      {
        "id": "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas/page2/a1",
        "type": "Annotation",
        "motivation": "supplementing",
        "body": {
          "id": "https://fixtures.iiif.io/video/indiana/lunchroom_manners/lunchroom_manners.vtt",
          "type": "Text",
          "format": "text/vtt",
          "label": {
            "en": [
              "Captions in WebVTT format"
            ]
          },
          "language": "en"
        },
        "target": "https://iiif.io/api/cookbook/recipe/0219-using-caption-file/canvas"
      }
    ]
  }
]

OCR annotations, which have:

a motivation of supplementing,
the URI of the OCR file in the id property of the Annotation body, and
the target set to the applicable Canvas.

{
  "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-anno_p1.json-1",
  "type": "Annotation",
  "motivation": "supplementing",
  "body": {
    "type": "TextualBody",
    "format": "text/plain",
    "language": "de",
    "value": "I. 54. Jahrgang"
  },
  "target": {
    "type": "SpecificResource",
    "source": {
      "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/canvas/p1",
      "type": "Canvas",
      "partOf": [
        {
          "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-manifest.json",
          "type": "Manifest"
        }
      ]
    },
    "selector": {
      "type": "FragmentSelector",
      "conformsTo": "http://www.w3.org/TR/media-frags/",
      "value": "xywh=0,376,399,53"
    }
  }
}

Or linking directly to an ALTO file (future, not implemented):

"rendering": [
  {
    "id": "https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-alto_p2.xml",
    "type": "Text",
    "format": "application/xml",
    "profile": "http://www.loc.gov/standards/alto/",
    "label": {
      "en": [
        "ALTO XML"
      ]
    }
  }
],
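
The FragmentSelector value in the OCR annotation example ("xywh=0,376,399,53") is the kind of thing that has to be turned into a spatial region. A minimal sketch of that step, assuming plain pixel fragments only; the BoxRegion type and parseXywh function are hypothetical names for illustration, not the helper's actual ParsedSelector API:

```typescript
// Hypothetical shape for illustration; not the library's ParsedSelector type.
interface BoxRegion {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Parses a media-fragment value such as "xywh=0,376,399,53".
// Returns null when the value is not a plain pixel xywh fragment.
function parseXywh(value: string): BoxRegion | null {
  const match = /^#?xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$/.exec(value.trim());
  if (!match) {
    return null;
  }
  return {
    x: Number(match[1]),
    y: Number(match[2]),
    width: Number(match[3]),
    height: Number(match[4]),
  };
}

// Usage with the selector value from the cookbook annotation.
const region = parseXywh("xywh=0,376,399,53");
```

Percent-based fragments ("xywh=percent:...") are deliberately left out of this sketch and would return null.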

It will produce a standard format for both temporal transcriptions and plaintext (with or without positions), including selectors.

interface Transcription {
  id: string;
  source: any;
  plaintext: string;
  segments: Array<{
    text: string;
    textRaw: string;
    granularity?: 'word' | 'line' | 'paragraph' | 'block' | 'page';
    language?: string;
    selector?: ParsedSelector;
    startRaw?: string;
    endRaw?: string;
  }>;
}

ParsedSelector includes spatial and temporal information, taken either from an annotation or from VTT. VTT parsing is very simple at the moment, because the external libraries for it are heavy. If there is only plaintext by itself, then there are no segments.
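
To give a sense of what "very simple" VTT parsing can look like, here is a rough sketch. It is not the parser in this PR: the VttCue type and both function names are hypothetical, and it ignores cue settings, NOTE/STYLE blocks and inline tags.

```typescript
// Hypothetical cue shape for illustration, loosely mirroring the
// startRaw/endRaw/text fields of a Transcription segment.
interface VttCue {
  text: string;
  startRaw: string;
  endRaw: string;
  start: number; // seconds
  end: number; // seconds
}

// Converts "mm:ss.ttt" or "hh:mm:ss.ttt" to seconds.
function vttTimeToSeconds(raw: string): number {
  let seconds = 0;
  for (const part of raw.split(":")) {
    seconds = seconds * 60 + Number(part);
  }
  return seconds;
}

// Splits the file into blank-line separated blocks and keeps only the
// blocks that contain a "start --> end" timing line.
function parseSimpleVtt(vtt: string): VttCue[] {
  const cues: VttCue[] = [];
  const blocks = vtt.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n").filter((line) => line.trim() !== "");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) {
      continue; // "WEBVTT" header, NOTE or STYLE block
    }
    const timing = /^(\S+)\s+-->\s+(\S+)/.exec(lines[timingIndex].trim());
    if (!timing) {
      continue;
    }
    cues.push({
      text: lines.slice(timingIndex + 1).join(" "),
      startRaw: timing[1],
      endRaw: timing[2],
      start: vttTimeToSeconds(timing[1]),
      end: vttTimeToSeconds(timing[2]),
    });
  }
  return cues;
}
```

Each cue maps naturally onto one segment, with the start and end times feeding a temporal selector.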

A viewer could start with just showing the plaintext, and then implement optional segments later.
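
A sketch of that progressive approach, using cut-down local copies of the types (TranscriptionLike and SegmentLike are stand-ins for illustration, and toDisplayLines is a hypothetical viewer function, not part of the helpers):

```typescript
// Cut-down stand-ins for the Transcription interface above.
interface SegmentLike {
  text: string;
  startRaw?: string;
  endRaw?: string;
}

interface TranscriptionLike {
  id: string;
  plaintext: string;
  segments: SegmentLike[];
}

// Returns the lines a viewer could render. With no segments it falls back
// to the plaintext; with segments it prefixes the raw times when present.
function toDisplayLines(transcription: TranscriptionLike): string[] {
  if (transcription.segments.length === 0) {
    return transcription.plaintext
      .split(/\r?\n/)
      .filter((line) => line.trim() !== "");
  }
  return transcription.segments.map((segment) =>
    segment.startRaw && segment.endRaw
      ? `[${segment.startRaw} - ${segment.endRaw}] ${segment.text}`
      : segment.text
  );
}
```

A viewer that only ever reads plaintext keeps working when segment support is added later, since both paths return the same kind of value.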

Some new helpers too:

canvasHasTranscriptionSync() - checks if there is a transcription on a canvas without making any network requests
canvasLoadExternalAnnotationPages() - loads and waits for external Annotation Pages
annotationPageToTranscription() - the actual code for fetching the transcription. It will also fetch all annotation pages, so it is recommended to use it with Vault (to avoid multiple requests).
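
For illustration only, a check that avoids network requests could look roughly like the sketch below. It works on plain Presentation 3 JSON and is not the implementation of canvasHasTranscriptionSync(); the types, the transcriptionHint name and the three-way "yes" / "maybe" / "no" result are all assumptions made for this example.

```typescript
// Loose, hypothetical shapes for the parts of a canvas this sketch reads.
interface ResourceLike {
  id?: string;
  type?: string;
  format?: string;
  motivation?: string | string[];
  items?: ResourceLike[];
}

interface CanvasLike {
  id: string;
  rendering?: ResourceLike[];
  annotations?: ResourceLike[];
}

type TranscriptionHint = "yes" | "maybe" | "no";

const TEXT_FORMATS = ["text/plain", "text/vtt"];

function hasSupplementingMotivation(annotation: ResourceLike): boolean {
  const motivation = annotation.motivation;
  if (motivation === undefined) {
    return false;
  }
  return Array.isArray(motivation)
    ? motivation.includes("supplementing")
    : motivation === "supplementing";
}

// Inspects only what is already in memory:
// - a text rendering or an embedded supplementing annotation gives "yes"
// - an annotation page without items (external, not yet loaded) gives "maybe"
// - anything else gives "no"
function transcriptionHint(canvas: CanvasLike): TranscriptionHint {
  const renderings = canvas.rendering ?? [];
  if (renderings.some((r) => r.format !== undefined && TEXT_FORMATS.includes(r.format))) {
    return "yes";
  }
  let sawExternalPage = false;
  for (const page of canvas.annotations ?? []) {
    if (page.items === undefined) {
      sawExternalPage = true;
      continue;
    }
    if (page.items.some(hasSupplementingMotivation)) {
      return "yes";
    }
  }
  return sawExternalPage ? "maybe" : "no";
}
```

The "maybe" case is where a client would go on to load the external Annotation Pages before deciding.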

stephenwf commented 4 months ago

At the moment, we are losing track of the Annotation target when parsing. It will very likely be the Canvas, but it could be any of:

Canvas ID
Media ID (complex timeline)
Choice ID (indicating it works with all choices)

Clients that provide navigation using the selector might need to check that it has the right target.
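
If the target were kept, a client-side check might look something like this sketch. The AnnotationTarget type and both function names are hypothetical, and it only covers string targets and SpecificResource-style targets with a source:

```typescript
// Hypothetical, simplified target shapes: a URI string (optionally with a
// fragment) or an object with an id and/or a source.
type AnnotationTarget =
  | string
  | {
      id?: string;
      type?: string;
      source?: string | { id: string; type?: string };
    };

// Resolves a target to the ID of the resource it points at, dropping any
// "#fragment" part from string targets.
function getTargetId(target: AnnotationTarget): string | undefined {
  if (typeof target === "string") {
    return target.split("#")[0];
  }
  if (target.source !== undefined) {
    return typeof target.source === "string"
      ? target.source.split("#")[0]
      : target.source.id;
  }
  return target.id;
}

// True when the annotation target resolves to the expected resource,
// for example the Canvas, media or Choice the client is navigating.
function targetsResource(target: AnnotationTarget, expectedId: string): boolean {
  return getTargetId(target) === expectedId;
}
```

A viewer could run this before seeking or panning, and skip segments whose target is some other resource.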

stephenwf commented 4 months ago

Also need to pass in a language, so that the transcription can check for choices structured like this: https://iiif.io/api/cookbook/recipe/0074-multiple-language-captions/
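
A rough sketch of that language check, assuming the annotation body is either a single text resource or a Choice whose items each carry a language. The types and pickBodyForLanguage are hypothetical, and the fallback order (exact match, then primary subtag, then first item) is a design choice for this example only:

```typescript
// Hypothetical, simplified body shapes.
interface TextBody {
  id: string;
  type: string;
  format?: string;
  language?: string;
}

interface ChoiceBody {
  type: "Choice";
  items: TextBody[];
}

// Picks the body that best matches the requested language.
function pickBodyForLanguage(
  body: TextBody | ChoiceBody,
  language: string
): TextBody | undefined {
  if (!("items" in body)) {
    return body; // single body, nothing to choose between
  }
  const exact = body.items.find((item) => item.language === language);
  if (exact) {
    return exact;
  }
  // Compare primary subtags so that "en-GB" can match "en".
  const primary = language.split("-")[0];
  const partial = body.items.find(
    (item) => item.language !== undefined && item.language.split("-")[0] === primary
  );
  // Fall back to the first item (undefined if the Choice is empty).
  return partial ?? body.items[0];
}
```

The example.org-style IDs a caller passes in are up to the manifest; only the language property is inspected here.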

stephenwf commented 3 months ago

This still needs more testing, so I will leave it open.

IIIF-Commons / iiif-helpers

Added transcription helpers for extracting text from a canvas #15