The base class from which to create a CWRC-Writer XML editor.
GNU General Public License v2.0

CWRC-Writer-Base
================

The Canadian Writing Research Collaboratory (CWRC) has developed an in-browser text markup editor (CWRC-Writer) for use by individual scholars and collaborative scholarly editing projects that require a lightweight online editing environment. This package is the base code, built on the TinyMCE editor, and is meant to be bundled together with two other packages that provide document storage and entity lookup. A default version of the CWRC-Writer that uses GitHub for storage is available for anyone's use at https://cwrc-writer.cwrc.ca/.

Table of Contents

  1. Overview
  2. Storage and Entity Lookup
  3. API
  4. Managers
  5. Modules
  6. Development

Overview

CWRC-Writer is a WYSIWYG text editor for in-browser XML editing and stand-off RDF annotation.
It is built around a heavily customized version of the TinyMCE editor, and includes a CWRC-hosted XML validation service.

A CWRC-Writer installation is a bundling of the main CWRC-WriterBase (the code in this repository) with
a few other NPM packages that handle interaction with server-side services for document storage and named entity lookup.

The default implementation of the CWRC-Writer is the CWRC-GitWriter. It uses GitHub to store documents via the cwrc-git-dialogs package. Entity lookups for VIAF, Wikidata, DBpedia, Getty and GeoNames are provided via CWRC-PublicEntityDialogs and related lookup packages.

Storage and Entity Lookup

If you choose not to use either the default GitHub storage or named entity lookups, then most of the work in setting up CWRC-Writer for your project will be in implementing the dialogs to interact with your backend storage and/or named entity lookups. We have split these pieces off into their own packages in large part to make it easier to substitute your own dialogs and supporting services.

A good example to follow when creating a new CWRC-Writer project is our public implementation CWRC-GitWriter. You might also choose to use either the GitHub storage dialogs or the named entity lookups, both of which are used by the CWRC-GitWriter, and replace just one of the two. To help understand how we've developed the CWRC-Writer, you could also look at our development docs.

To replace either of the storage and entity dialogs, you'll need to create modules with the following APIs:

Storage Object API

To see the methods that need to be provided by your own storage implementation, you can view the cwrc-git-dialogs API.

Note that because the load(writer) and save(writer) methods are passed an instance of the CWRC-WriterBase, all of the methods defined below in the API are available, in order to allow getting and setting of XML in the editor.
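As a rough illustration of the point above, here is a minimal sketch of a storage object that keeps a single document in memory. Only load(writer), save(writer), loadDocumentXML and getDocument(asString) come from this README; createMemoryStorage, peek and createFakeWriter are hypothetical names used so the sketch can run outside the browser. A real implementation would also need whatever other methods the cwrc-git-dialogs API lists.

```javascript
// Hypothetical in-memory storage object. Only load(writer) and save(writer)
// are named in this README; everything else here is illustrative.
function createMemoryStorage(initialXml) {
  let storedXml = initialXml;
  return {
    // Receives the writer instance and pushes the stored XML into the editor.
    load(writer) {
      writer.loadDocumentXML(storedXml);
    },
    // Receives the writer instance and pulls the current XML out as a string.
    save(writer) {
      storedXml = writer.getDocument(true);
    },
    // Inspection helper for this sketch, not part of any CWRC API.
    peek() {
      return storedXml;
    }
  };
}

// Stand-in for a real writer instance (assumption: it only mimics the two
// writer methods used above).
function createFakeWriter() {
  let xml = '';
  return {
    loadDocumentXML(docXml) { xml = String(docXml); },
    getDocument(asString) { return asString ? xml : { xml }; }
  };
}

const storage = createMemoryStorage('<TEI><text>Hello</text></TEI>');
const fakeWriter = createFakeWriter();
storage.load(fakeWriter);
```

In a real project the in-memory variable would be replaced by calls to your own backend, and load and save would typically open a dialog first.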

Entity Lookup API

You have at least two choices here:

  1. You can implement your own dialog for entity lookup, following the model in CWRC-PublicEntityDialogs.

  2. You can use CWRC-PublicEntityDialogs and configure it with different sources. We provide five sources: VIAF, Wikidata, Getty, DBpedia, and GeoNames.

You can use any of these sources and supplement them with your own. The CWRC-PublicEntityDialogs documentation explains how to add your own sources.

API

Constructor

The CWRC-WriterBase exports a single constructor function that takes one argument, a configuration object.

See CWRC-GitWriter/src/js/config.js for an example of a base configuration file, and
CWRC-GitWriter/src/js/app.js to see the configuration file loaded, extended, and passed into the constructor.
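The load, extend, and construct sequence described above might look roughly like the following sketch. The constructor is passed in as a parameter so that no package name or module system is assumed; buildConfig, createWriter, FakeWriter and exampleOption are hypothetical names, and storageDialogs is the only configuration property taken from this README.

```javascript
// Copies the base configuration and adds project-specific settings,
// leaving the base object untouched.
function buildConfig(baseConfig, storageDialogs) {
  return Object.assign({}, baseConfig, { storageDialogs });
}

// WriterConstructor stands for the function exported by CWRC-WriterBase.
// It is injected here so the sketch stays independent of how it is imported.
function createWriter(WriterConstructor, baseConfig, storageDialogs) {
  const config = buildConfig(baseConfig, storageDialogs);
  return new WriterConstructor(config);
}

// Stand-in constructor for demonstration only.
function FakeWriter(config) {
  this.config = config;
}

const baseConfig = { exampleOption: 'value' };       // placeholder option name
const dialogs = { load(writer) {}, save(writer) {} }; // placeholder dialogs
const writer = createWriter(FakeWriter, baseConfig, dialogs);
```

Copying the base configuration before extending it keeps the shared defaults reusable if more than one writer is ever created from them.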

Configuration Object

Options that can be set on the configuration object:

Required Options

Other Options

Writer object

The object returned by the constructor is defined here: writer.js. The typical properties and methods you'd want to use when implementing your own storage and/or entity dialogs are:

Properties

isInitialized

boolean
Whether the editor has been initialized.

isDocLoaded

boolean
Whether a document is loaded in the editor.

isReadOnly

boolean
Whether the editor is in read-only mode.

isAnnotator

boolean
Whether the editor is in annotate-only (entities only) mode.
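A custom dialog might consult these properties before acting. The sketch below shows one possible guard; canSave is a hypothetical helper, and the rule it encodes (no saving before initialization, without a document, or in read-only mode) is an assumption about what a project may want, not documented behaviour of the writer.

```javascript
// Returns true only when the writer is ready and a save makes sense.
// The policy here is illustrative; adjust it to your project's needs.
function canSave(writer) {
  return Boolean(
    writer.isInitialized &&
    writer.isDocLoaded &&
    !writer.isReadOnly
  );
}

// Plain object standing in for a writer instance.
const exampleWriter = {
  isInitialized: true,
  isDocLoaded: true,
  isReadOnly: false,
  isAnnotator: false
};
const saveAllowed = canSave(exampleWriter);
```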

Methods

loadDocumentURL(docUrl)

Loads an XML document from a URL into the editor.

loadDocumentXML(docXml)

Loads an XML document (either an XML Document or a stringified version of one) into the editor.

setDocument(docUrl|docXml)

A convenience method which calls either loadDocumentURL or loadDocumentXML based on the parameter provided.

getDocument(asString)

Returns the parsed XML document from the editor. If asString is true, then a stringified version of the document is returned.

getDocRawContent()

Returns the raw, un-parsed HTML content from the editor.

showLoadDialog()

Convenience method to call the load method of the object set in the storageDialogs property of the config object passed to the writer.

showSaveDialog()

Convenience method to call the save method of the object set in the storageDialogs property of the config object passed to the writer.

validate(callback)

Validates the current document.

callback(w, valid): a function that receives the writer (w) and a boolean (valid). Fires a documentValidated event if validation is successful.
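One way to combine validate with the save dialog is to open the dialog only when the document is valid. This is a minimal sketch under that assumption: saveIfValid, onInvalid and createFakeWriter are hypothetical names, while validate(callback), the (w, valid) callback arguments and showSaveDialog() are taken from the descriptions above.

```javascript
// Validates first, then opens the save dialog only for a valid document.
// onInvalid is an optional hook for reporting problems to the user.
function saveIfValid(writer, onInvalid) {
  writer.validate(function (w, valid) {
    if (valid) {
      w.showSaveDialog();
    } else if (typeof onInvalid === 'function') {
      onInvalid(w);
    }
  });
}

// Stand-in writer whose validation result is fixed in advance
// (assumption: a real writer would call the validation service instead).
function createFakeWriter(isValid) {
  const writer = {
    saveDialogCount: 0,
    validate(callback) { callback(writer, isValid); },
    showSaveDialog() { writer.saveDialogCount += 1; }
  };
  return writer;
}

const validWriter = createFakeWriter(true);
saveIfValid(validWriter);

const invalidWriter = createFakeWriter(false);
let invalidReported = false;
saveIfValid(invalidWriter, function () { invalidReported = true; });
```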

Managers

Tasks within CWRC-Writer are handled by specific managers.

AnnotationsManager

Handles conversion of entities to annotations and vice-versa.

SchemaManager

Handles schema loading and schema CSS processing. Stores the list of available schemas, as well as the current schema. Handles the creation of schema-appropriate entities, via the Mapper.

EntitiesManager

Handles the creation and modification of entities. Stores the list of entities in the current document.

EventManager

Handles the dissemination of events through the CWRC-Writer using a publish-subscribe pattern. See the code for the full list of events.
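For readers unfamiliar with the pattern, the following is a generic publish-subscribe sketch. It is not the EventManager's actual interface: createEventHub, subscribe and publish are hypothetical names chosen for illustration, and only the documentValidated event name comes from this README.

```javascript
// Generic publish-subscribe hub: handlers register per event name and are
// called whenever that event is published.
function createEventHub() {
  const subscribers = new Map();
  return {
    // Registers a handler and returns a function that removes it again.
    subscribe(eventName, handler) {
      if (!subscribers.has(eventName)) {
        subscribers.set(eventName, new Set());
      }
      subscribers.get(eventName).add(handler);
      return function unsubscribe() {
        subscribers.get(eventName).delete(handler);
      };
    },
    // Calls every handler for the event and returns how many were called.
    publish(eventName, ...args) {
      const handlers = subscribers.get(eventName);
      if (!handlers) return 0;
      // Iterate over a copy so handlers may unsubscribe while being called.
      const snapshot = Array.from(handlers);
      for (const handler of snapshot) handler(...args);
      return snapshot.length;
    }
  };
}

const hub = createEventHub();
const received = [];
const unsubscribe = hub.subscribe('documentValidated', function (valid) {
  received.push(valid);
});
hub.publish('documentValidated', true);  // handler runs
unsubscribe();
hub.publish('documentValidated', false); // no handler left
```

The benefit of the pattern is that publishers and subscribers never reference each other directly, which is what lets modules react to editor activity without being wired into it.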

LayoutManager

Handles the initialization and display of the modules specified in the modules property of the config object. Also handles browser resizing and fullscreen functionality.

DialogManager

Handles the initialization and display of dialogs.

Modules

Modules are self-contained components that add extra functionality to CWRC-Writer. These can be specified in the configuration object using the proper module ID.

StructureTree

Module ID: structure

Displays the markup of the current document in a tree format. Useful for navigating and modifying the document.

EntitiesList

Module ID: entities

Displays the list of entities in the current document. Allows for modifying, copying, scraping, and deleting of entities.

Selection

Module ID: selection

Displays the markup of the text that's selected in the current document.

Validation

Module ID: validation

Configuration:

Requests and displays the results of document validation. See validate.

NERVE

Module ID: nerve

Configuration:

Sends the document for named entity recognition and adds the results as entities to the document.

ImageViewer

Module ID: imageViewer

Displays images linked from within the current document. Useful for OCR'd documents.

Relations

Module ID: relations

Displays the list of entity relationships (i.e. RDF triples) in the current document. Uses triple to add new relationships.
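Because modules are selected by ID, a typo in a configuration silently refers to a module that does not exist. The sketch below collects the module IDs listed in this section and reports any that are not among them. KNOWN_MODULE_IDS and findUnknownModuleIds are hypothetical helpers, and the sketch takes a plain array of IDs because the exact shape of the modules property is not described here.

```javascript
// Module IDs as listed in this README.
const KNOWN_MODULE_IDS = [
  'structure',
  'entities',
  'selection',
  'validation',
  'nerve',
  'imageViewer',
  'relations'
];

// Returns the requested IDs that do not match any documented module.
// The comparison is case-sensitive, so 'imageviewer' is reported.
function findUnknownModuleIds(requestedIds) {
  return requestedIds.filter(function (id) {
    return !KNOWN_MODULE_IDS.includes(id);
  });
}

const unknownIds = findUnknownModuleIds(['structure', 'imageviewer', 'relations']);
```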

Development

CWRC-Writer-Dev-Docs explains how to work with CWRC-Writer GitHub repositories, including this one.