dbmdz / mirador-textoverlay

Text Overlay plugin for Mirador 3
https://mirador-textoverlay.netlify.com/
MIT License

mirador-textoverlay

A Mirador 3 plugin to display a selectable text overlay based on OCR or transcriptions.

Demo on https://mirador-textoverlay.netlify.com (try selecting some text)

Requirements for supported IIIF manifests

For a list of example manifests that are supported, refer to the catalog entry in the demo instance configuration. If you need support for your particular flavor of attaching text to a IIIF canvas, open an issue :-)

Installation

Currently the plugin can only be used if you build your own Mirador JavaScript bundle. To include it in your Mirador installation, install it from npm with npm install mirador-textoverlay, import it into your project, and pass it to Mirador when you instantiate the viewer:

import Mirador from 'mirador/dist/es/src/index';
import textOverlayPlugin from 'mirador-textoverlay/es';

const miradorConfig = {
  // Your Mirador configuration
};
// Pass the same config object that was defined above
Mirador.viewer(miradorConfig, [...textOverlayPlugin]);

Configuration

You can configure the plugin globally for all windows and/or individually for every window.

Add the textOverlay entry to the top-level window configuration (applies to all windows) or to an individual window object in windows (applies to that window only):

const miradorConfig = {
  window: {
    // ....
    textOverlay: {
      // Global options for all windows, see available settings below
    },
  },
  windows: [{
    // ....
    textOverlay: {
      // Options for an individual window, see available settings below
    },
  }, // ...
  ],
};

You can view an example configuration in demo/src/index.js.
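As a rough illustration of combining both levels, the sketch below sets a global default and changes it for one window. Only the option names enabled and fontFamily come from this README; their values, the manifest URL, and the assumption that a per-window entry takes precedence over the global one are illustrative guesses, not confirmed behavior.

```javascript
// Illustrative sketch only: the values and the manifest URL are placeholders.
const miradorConfig = {
  id: 'mirador', // id of the DOM element that hosts the viewer
  window: {
    textOverlay: {
      enabled: true,       // assumed boolean: turn the overlay on for all windows
      fontFamily: 'serif', // assumed CSS font-family string
    },
  },
  windows: [
    {
      manifestId: 'https://example.org/iiif/manifest.json', // placeholder URL
      textOverlay: {
        enabled: false, // assumed to take precedence over the global setting
      },
    },
  ],
};
```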

The available configuration options (all of which define defaults that can be changed through the UI, except for enabled and fontFamily) are:

The plugin also supports theming for a few things. These can be set under the textOverlay section for the light and/or dark theme (see Mirador 3 Theming on how to set these values):

How it works

The OCR or annotation boxes are rendered page-by-page and word-by-word into SVG images that have the same dimensions as the page they annotate. The position of these page SVGs is then synchronized to the Mirador viewport with dynamic CSS transformations. The implementation of the rendering itself is pretty straightforward and can probably be adapted to most "deep zoom" viewers without a lot of additional effort. If you need the OCR parsing code as a separate package that you can base an implementation for your favorite viewer on, please open an issue :-)
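The approach described above can be sketched in a few lines. This is not the plugin's actual code: the function names renderPageSvg and pageTransform, the shape of the page and viewport objects, and the choice to stretch each word with the SVG textLength attribute are all assumptions made for illustration.

```javascript
// Hypothetical page shape: { width, height, words: [{ x, y, width, height, text }] }
// All coordinates are in image pixels.

// Escape characters that are not allowed in SVG text content.
function escapeXml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Render one page as an SVG string with the same dimensions as the page image.
function renderPageSvg(page) {
  const words = page.words.map((w) =>
    // Put the baseline at the bottom of the word box and stretch the
    // text to the box width so it lines up with the printed word.
    `<text x="${w.x}" y="${w.y + w.height}" font-size="${w.height}" ` +
    `textLength="${w.width}">${escapeXml(w.text)}</text>`
  );
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${page.width}" ` +
    `height="${page.height}" viewBox="0 0 ${page.width} ${page.height}">` +
    `${words.join('')}</svg>`;
}

// Compute the CSS transform that keeps the page SVG aligned with the viewer.
// pagePos: top-left corner of the page in world coordinates.
// viewport: { x, y, scale }, where x/y is the world coordinate shown at the
// top-left of the screen and scale is screen pixels per image pixel.
function pageTransform(pagePos, viewport) {
  const tx = (pagePos.x - viewport.x) * viewport.scale;
  const ty = (pagePos.y - viewport.y) * viewport.scale;
  return `translate(${tx}px, ${ty}px) scale(${viewport.scale})`;
}
```

In a viewer, the string returned by pageTransform would be assigned to the overlay element's style.transform on every pan or zoom event, with transform-origin set to 0 0 so that scaling happens around the page's top-left corner.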

Contributing

Found a bug? Is the plugin not working with your manifest? Want a new feature? Create an issue or, if you want to take a shot at fixing it yourself, fork the repository and create a pull request. We're always open to contributions :-)

For larger changes/features, it's usually wise to open an issue before starting the work, so we can discuss if it's a fit.