peritext / peritext-old

[Deprecated] a contextualization-oriented multimodal publishing engine

[WIP] Peritext - a contextualization-oriented academic publishing engine


Peritext is a JavaScript/Node.js library that facilitates making media-rich, data-driven and multimodal academic publication projects.

!!!! Caution : Work In Progress !!!

Peritext is at a very early stage: some module APIs are likely to change a lot in the near future, test coverage is poor, and contextualization components and exporters are minimal. Use it at your own risk, and feel free to lend a hand to the project!


The RCC model

The core of the library is centered on manipulating a specific data abstraction of multimodal scholarly documents, called the Resource-Contextualization-Contextualizer (RCC) Model.

The Resource-Contextualization-Contextualizer Model is a way to model an academic document as an entity composed of contents, resources, contextualizations and contextualizers:

[Figure: Peritext resource model in one image]

Why is this model useful?

What use cases is this model useful for?


The library

To allow an authoring process based on the RCC Model, the Peritext library is a set of modules whose core converts the JavaScript representation of flatfile-structured contents into the JavaScript representation of an RCC document, and vice versa.

Peritext's modules handle conversions between several representations of a document, which can be pictured as follows:

[Figure: Peritext resource model in one image]

Each conversion step corresponds to specific Peritext modules. Let's present them quickly.

(filesystem data<-->filesystem representation) Peritext connectors

Peritext contents are assumed to be written in plain text files (Markdown for narratives and BibTeX for resource descriptions) and hosted in flatfile-represented data sources (FTP server, local hard drive/server files, Google Drive, Amazon S3, ...).

The relation to these data sources is handled via connector plugins that provide a consistent API for transactions with these sources. Connectors are the native entry and exit points for working with Peritext documents.

The choice of a flat-file representation for the data source is motivated by the desire to provide a very flexible and lightweight way to produce academic documents (as opposed to "big platforms"), but the library could just as well be connected to a more traditional database by not using Peritext connectors (WIP).
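As a rough illustration of what a consistent connector API could look like, here is a minimal in-memory sketch. The function name `createMemoryConnector` and the `list`/`read`/`write`/`remove` methods are assumptions made for this example, not the actual Peritext connector interface; a real connector talking to an FTP server or a cloud drive would most likely be asynchronous.

```javascript
// Hypothetical connector shape: every name here is an assumption for
// illustration, not the actual Peritext connector API.
function createMemoryConnector(initialFiles = {}) {
  // Files are kept in a Map keyed by path, standing in for a flat-file source.
  const files = new Map(Object.entries(initialFiles));
  return {
    // List the paths located under a given folder prefix, sorted by name.
    list(prefix = '') {
      return [...files.keys()].filter((path) => path.startsWith(prefix)).sort();
    },
    // Read one file; throw if it does not exist.
    read(path) {
      if (!files.has(path)) {
        throw new Error(`File not found: ${path}`);
      }
      return files.get(path);
    },
    // Create or overwrite one file.
    write(path, contents) {
      files.set(path, contents);
    },
    // Delete one file; returns true if something was deleted.
    remove(path) {
      return files.delete(path);
    }
  };
}

// Made-up file names and contents, for illustration only.
const connector = createMemoryConnector({
  'my-document/contents.md': '# Introduction',
  'my-document/resources.bib': '@book{resource1, title={An example book}}'
});
connector.write('my-document/chapter-1/contents.md', 'Some text.');
const documentFiles = connector.list('my-document/');
```

Because every connector would expose the same four operations, the rest of the pipeline would not need to know whether files live on disk, on a remote server or in memory.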

(filesystem representation<-->abstract rcc document representation) Peritext core

Peritext core modules convert a representation of flat-file contents to an RCC document representation.

The RCC document representation is a JavaScript object that looks like this:

{
  'forewords': {/*...*/}, // special section for forewords/frontpage of document - see sections data below
  'sections': { // sections composing the document - they are presented in a flat organization, but can specify sequentiality/hierarchy (subparts, ...) indication in their metadata
    'section1': {
      'metadata': { // section metadata in several domains
        'general': {/*...*/},
        'twitter': {/*...*/},
        'dublincore': {/*...*/},
        /*...*/
      },
      'contents': [ // javascript representation of pseudo-DOM tree of contents core
        {
          'type': 'element',
          'tag': 'p',
          'child' : [/*...*/]
        },
        /*...*/
      ],
      'notes': [ // javascript representation of pseudo-DOM tree of contents notes
        {/*...*/},
        /*...*/
      ],
      'contextualizations': ['cont1', 'cont2'], // list of involved contextualizations (sugar)
      'customizers' : {/*...*/} // section-specific css stylesheets
    },
    'section2': {/*...*/}
  },

  'summary': ['section1', 'section2'], // linear order of sections
  'resources': { // resources involved in the whole document
    'resource1': {
      'bibType': 'book',
      'id': 'resource1',
      /*...*/
    },
    /*...*/
  },
  'contextualizations': { // contextualizations involved in the whole document
    'cont1': {
      'bibType': 'contextualization',
      'id': 'cont1',
      'resources': ['resource1'],
      /*...*/
    },
    'cont2': {/*...*/}
  },
  'contextualizers': { // contextualizers involved in the whole document
    'cont1': {
      /*...*/
    },
    'explicit-contexutalizer': {
      /*...*/
    }
  } 
}

Note that contents are represented as a pseudo-DOM JavaScript structure, very similar to the output of the html2json library (used as a basis in the conversion process).
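To make the structure above more concrete, the following sketch walks a tiny, made-up RCC document: it orders sections using `summary`, extracts plain text from the pseudo-DOM contents, and resolves the resources a section reaches through its contextualizations. The helper names, the `title` metadata key and the shape of text leaves (`{ type: 'text', text: ... }`) are assumptions for this example; the source does not specify them.

```javascript
// Minimal document following the shape shown above; all values are made up.
const rccDocument = {
  sections: {
    section1: {
      metadata: { general: { title: 'Introduction' } }, // `title` is an assumed key
      contents: [
        {
          type: 'element',
          tag: 'p',
          // Assumption: text leaves look like { type: 'text', text: '...' }.
          child: [
            { type: 'text', text: 'Hello ' },
            { type: 'element', tag: 'em', child: [{ type: 'text', text: 'world' }] }
          ]
        }
      ],
      notes: [],
      contextualizations: ['cont1']
    },
    section2: {
      metadata: { general: { title: 'Conclusion' } },
      contents: [],
      notes: [],
      contextualizations: []
    }
  },
  summary: ['section1', 'section2'],
  resources: { resource1: { bibType: 'book', id: 'resource1' } },
  contextualizations: {
    cont1: { bibType: 'contextualization', id: 'cont1', resources: ['resource1'] }
  },
  contextualizers: {}
};

// Recursively concatenate the text leaves of a pseudo-DOM tree.
function extractText(nodes) {
  return nodes
    .map((node) => (node.type === 'text' ? node.text : extractText(node.child || [])))
    .join('');
}

// Return the sections in the linear order given by `summary`.
function getOrderedSections(doc) {
  return doc.summary.map((sectionId) => doc.sections[sectionId]);
}

// Resolve the resources reached by a section through its contextualizations.
function getSectionResources(doc, sectionId) {
  const resourceIds = doc.sections[sectionId].contextualizations.flatMap(
    (contextualizationId) => doc.contextualizations[contextualizationId].resources
  );
  // Deduplicate, since several contextualizations may share a resource.
  return [...new Set(resourceIds)].map((resourceId) => doc.resources[resourceId]);
}

const titles = getOrderedSections(rccDocument).map((s) => s.metadata.general.title);
const introText = extractText(rccDocument.sections.section1.contents);
const introResources = getSectionResources(rccDocument, 'section1');
```

Keeping sections in a flat map and ordering them through `summary` means a section can be looked up by id in constant time while the reading order stays explicit.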

(abstract rcc document representation<-->output-specific rcc document representation) Peritext contextualizers

Peritext handles the conversion of an abstract RCC representation to output-specific document representations. An output is defined by its type (either print-like or web-like) and a set of organization-related rendering parameters (such as: where to put the notes? at what level to compute the bibliography, whole document or chapters? ...).

This step is done through contextualizer plugins that transform the pseudo-DOM representation of sections' contents into output-specific pseudo-DOM representations. For each contextualization in the document, the related contextualizer plugin is called: it takes the previous document and returns an updated document in which the contextualization has been resolved.

For instance, with the webpage contextualizer plugin, inputting a document that contains a website contextualization will result in:

Please note that at this point the document representation is still a plain JavaScript object, but it is no longer serializable, because React components are stored along the tree.
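The resolution loop described in this section (each contextualization handed to its plugin, which takes the previous document and returns an updated one) can be sketched as a simple reduce. Everything below is hypothetical: the `resolve` signature, the `contextualizer` and `type` fields linking a contextualization to a plugin, the `resolved` map, and the `title`/`url` resource fields are assumptions for illustration. The sketch also stores plain objects where a real plugin would store React components.

```javascript
// Hypothetical contextualizer plugins, keyed by type. The `resolve` signature
// (document, contextualization) => document is an assumption for illustration.
const contextualizerPlugins = {
  webpage: {
    resolve(doc, contextualization) {
      const resource = doc.resources[contextualization.resources[0]];
      // Stand-in for whatever output-specific node a real plugin would produce.
      const rendered = {
        type: 'element',
        tag: 'a',
        attr: { href: resource.url },
        child: [{ type: 'text', text: resource.title }]
      };
      // Return an updated copy instead of mutating the previous document.
      return {
        ...doc,
        resolved: { ...doc.resolved, [contextualization.id]: rendered }
      };
    }
  }
};

// Call the matching plugin for each contextualization, threading the
// document through so that each step receives the previous step's result.
function resolveContextualizations(doc, plugins) {
  return Object.values(doc.contextualizations).reduce(
    (currentDoc, contextualization) => {
      const contextualizer = doc.contextualizers[contextualization.contextualizer];
      const plugin = plugins[contextualizer.type];
      if (!plugin) {
        throw new Error(`No plugin for contextualizer type: ${contextualizer.type}`);
      }
      return plugin.resolve(currentDoc, contextualization);
    },
    { ...doc, resolved: {} }
  );
}

// Made-up input document, reduced to the parts the sketch needs.
const inputDocument = {
  resources: {
    resource1: { id: 'resource1', bibType: 'online', title: 'Example site', url: 'https://example.org' }
  },
  contextualizations: {
    cont1: { id: 'cont1', resources: ['resource1'], contextualizer: 'webpage1' }
  },
  contextualizers: {
    webpage1: { type: 'webpage' }
  }
};

const outputDocument = resolveContextualizations(inputDocument, contextualizerPlugins);
```

Returning a new document at each step keeps the input untouched, which makes it possible to resolve the same abstract document several times with different output parameters.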

(output-specific rcc document representation<-->outputs) Renderers, exporters and library getters/setters

Contextualizer plugins also provide React components to compose static or dynamic HTML representations. They can be used in an app, or by renderer plugins that produce usable representations of contents for outputs (e.g. static HTML). Contextualizer components can be customized or overridden by specific applications.

Ideally, the library strives to eventually support an ecosystem of contextualizers, with the constraint that all contextualizers should handle both static (e.g. print) and dynamic (e.g. web) outputs.

Finally, Peritext flexibly handles outputting academic documents to actual output files.

Exporters take as argument a representation served by renderers (e.g. static HTML) and output a file (PDF, XML, HTML, ...).
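The division of labour between renderers and exporters could look like the sketch below: a renderer turns pseudo-DOM nodes into a static HTML string, and an exporter wraps that string into a file. The names `renderToStaticHtml` and `exportToHtmlFile` are hypothetical, as is the text-leaf shape, and the exporter here only returns a `{ filename, contents }` description rather than writing to disk.

```javascript
// Escape characters that are significant in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical renderer: pseudo-DOM nodes in, static HTML string out.
// Assumption: text leaves look like { type: 'text', text: '...' }.
function renderToStaticHtml(nodes) {
  return nodes
    .map((node) => {
      if (node.type === 'text') {
        return escapeHtml(node.text);
      }
      return `<${node.tag}>${renderToStaticHtml(node.child || [])}</${node.tag}>`;
    })
    .join('');
}

// Hypothetical exporter: rendered HTML in, file description out.
function exportToHtmlFile(renderedHtml, { title, filename }) {
  const contents = [
    '<!DOCTYPE html>',
    '<html>',
    `<head><meta charset="utf-8"><title>${escapeHtml(title)}</title></head>`,
    `<body>${renderedHtml}</body>`,
    '</html>'
  ].join('\n');
  return { filename, contents };
}

// Made-up section contents, for illustration only.
const sectionContents = [
  { type: 'element', tag: 'p', child: [{ type: 'text', text: 'Notes & remarks' }] }
];
const staticHtml = renderToStaticHtml(sectionContents);
const htmlFile = exportToHtmlFile(staticHtml, {
  title: 'My document',
  filename: 'my-document.html'
});
```

With this split, a PDF or XML exporter could reuse the same rendered representation and differ only in how it packages the result.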

For uses of Peritext as a library in web applications, it finally provides a set of getter and setter functions that facilitate working with RCC document representation objects.
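As an idea of what such helpers might look like, here is a hedged sketch of one getter and one setter over section metadata. The function names and signatures are invented for this example and are not taken from the Peritext codebase; the setter returns a new document instead of mutating its input, which is one possible design among others.

```javascript
// Hypothetical getter: metadata value of a section for a given domain and key.
function getSectionMetadata(doc, sectionId, domain, key) {
  const section = doc.sections[sectionId];
  if (!section || !section.metadata[domain]) {
    return undefined;
  }
  return section.metadata[domain][key];
}

// Hypothetical setter: returns a new document, leaving the original untouched.
function setSectionMetadata(doc, sectionId, domain, key, value) {
  const section = doc.sections[sectionId];
  if (!section) {
    throw new Error(`Unknown section: ${sectionId}`);
  }
  return {
    ...doc,
    sections: {
      ...doc.sections,
      [sectionId]: {
        ...section,
        metadata: {
          ...section.metadata,
          // Spreading an undefined domain yields an empty object.
          [domain]: { ...section.metadata[domain], [key]: value }
        }
      }
    }
  };
}

// Made-up document, reduced to the parts the sketch needs.
const originalDocument = {
  sections: {
    section1: { metadata: { general: { title: 'Draft title' } } }
  },
  summary: ['section1']
};

const updatedDocument = setSectionMetadata(
  originalDocument, 'section1', 'general', 'title', 'Final title'
);
const updatedTitle = getSectionMetadata(updatedDocument, 'section1', 'general', 'title');
```

Non-mutating setters fit well with applications that compare object references to detect changes, since only the objects along the updated path are replaced.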