The idea is to use CollateX to generate the tables.
CollateX takes as input a JSON file with a list of witnesses. Each witness is an array of tokens, each of which contains a text and optionally a normalized version. Each token can also have an arbitrary number of other attributes which are passed transparently to the output.
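To make the input format concrete, here is a minimal sketch of such a JSON document built in Python. The key names ("witnesses", "id", "tokens", "t" for the token text, "n" for the normalized form) follow CollateX's documented JSON input; the extra "item_id" attribute is an invented example of an arbitrary property that CollateX passes through to the output untouched.

```python
import json

# Two short witnesses. Each token has a text ("t") and an optional
# normalized form ("n"); any other keys (here the illustrative
# "item_id") are carried through to the output transparently.
collatex_input = {
    "witnesses": [
        {
            "id": "A",
            "tokens": [
                {"t": "The ", "n": "the", "item_id": 12},
                {"t": "quick ", "n": "quick", "item_id": 12},
            ],
        },
        {
            "id": "B",
            "tokens": [
                {"t": "Teh ", "n": "the"},
                {"t": "quick ", "n": "quick"},
            ],
        },
    ]
}

print(json.dumps(collatex_input, indent=2))
```

Collating on the "n" values lets "The" and "Teh" align as the same reading while the original spellings are preserved in "t".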
CollateX can generate a collation table or a variant graph. The collation table is a matrix with one row per witness and one column per set of aligned tokens.
The process should go like this:
For each witness:
-- Get an ordered list of items
-- Transform each item into a list of tokens. This is where the complexity lies: words that span more than one item (e.g., nowb marks), words with multiple versions (sic, abbr, corr), and combinations of the above.
Then transform the per-witness token lists into a CollateX input JSON file.