Closed DavidFatDavidF closed 3 years ago
@genivia-inc before I can fix the prose, the schema for original data needs to be fixed. I am afraid that we will have to discuss this first in the May meeting before we will be able to agree on a solution. Nevertheless, here is some guidance on what the originalData schema needs to do. originalData is a unit-level container for non-translatable strings significant for the original format or its roundtrip. It's a plain wrapper for data elements that have a required NMTOKEN id, so that the original data can be referenced from inline objects such as ph and sc.
Sure. A resolution to this issue can be discussed at the next meeting or anytime here with the TC members.
For the record, here is my email from Oct 12 2016 to the xliff-omos list:
1. The UML diagram appears to lack the originalData table that was included earlier and that is also in the XLIFF XML schema. Is this correct? I wonder how this affects XML to/from JSON conversion tools, because the data content has to be translated in addition to the structural content (XML to/from JSON). This means a semantic translation, not just a syntactic translation back and forth between JSON and XML. It would be simpler to keep originalData and startRef/dataRef. Just my 2c on this one. [...] Below is an initial JSON schema of JLIFF followed by an example. This assumes originalData is still a table. It is almost complete, except for adjustments related to the above, possible improvements, and perhaps rules such as oneOf/anyOf etc. to describe and restrict the "element" content that combines all attributes of ec, em, pc, ph, sc, and sm.
Sorry for spamming you with this large email. The github repo is not yet up as David had indicated. [...]
[...]
"originalData": {
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "id": { "type": "string" },
      "data": { "type": "string" }
    },
    "required": [ "id", "data" ]
  }
},
[...]
An example, similar to the JSON wiki example:
{
  "id": "fl",
  "version": "1.0",
  "unit": [
    {
      "id": "u1",
      "originalData": [
        { "id": "d1", "data": "[C1/]" },
        { "id": "d2", "data": "[C2]" },
        { "id": "d3", "data": "[/C2]" }
      ],
      "subunit": [
        {
          "segment": true,
          "state": "translated",
          "canResegment": false,
          "source": [
            { "id": "c1", "kind": "ph", "dataRef": "d1" },
            { "text": "aaa" },
            { "id": "c2", "kind": "pc", "startRef": "d2", "dataRef": "d3", "text": "text" }
          ],
          "target": [
            { "id": "c1", "kind": "ph", "dataRef": "d1" },
            { "text": "AAA" },
            { "id": "c2", "kind": "pc", "startRef": "d2", "dataRef": "d3", "text": "TEXT" }
          ]
        },
        {
          "segment": false,
          "source": [
            { "text": ". " }
          ]
        }
      ]
    }
  ]
}
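To make the role of the references in the example above concrete, here is a minimal Python sketch that indexes a unit's array-form originalData by id and expands a source element list back to text with its original codes. The function names `index_original_data` and `render` are hypothetical, and the treatment of `pc` (startRef for the opening code, dataRef for the closing code) is an assumption read off the example, not something the schema fragment states.

```python
def index_original_data(unit):
    """Build an id -> data lookup from the array form of originalData."""
    return {entry["id"]: entry["data"] for entry in unit.get("originalData", [])}


def render(elements, data):
    """Expand a source/target element list back to text with original codes."""
    out = []
    for el in elements:
        kind = el.get("kind")
        if kind is None:
            # Plain text run.
            out.append(el["text"])
        elif kind == "ph":
            # Standalone placeholder: a single reference.
            out.append(data[el["dataRef"]])
        elif kind == "pc":
            # Paired code (assumption): startRef opens, dataRef closes.
            out.append(data[el["startRef"]] + el["text"] + data[el["dataRef"]])
        else:
            raise ValueError(f"unhandled kind: {kind}")
    return "".join(out)


unit = {
    "id": "u1",
    "originalData": [
        {"id": "d1", "data": "[C1/]"},
        {"id": "d2", "data": "[C2]"},
        {"id": "d3", "data": "[/C2]"},
    ],
}
source = [
    {"id": "c1", "kind": "ph", "dataRef": "d1"},
    {"text": "aaa"},
    {"id": "c2", "kind": "pc", "startRef": "d2", "dataRef": "d3", "text": "text"},
]

data = index_original_data(unit)
rendered = render(source, data)
print(rendered)  # [C1/]aaa[C2]text[/C2]
```

A converter that drops originalData would have to carry these strings inline instead, which is the semantic (rather than purely syntactic) translation the email is concerned about.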
This must have been changed to the current key-string table form of object (patternProperties) by committee consensus and committed to this repo once this repo was set up.
The proposed resolution is the following, please comment:
"originalData": {
"description": "A collection of key-value pairs with NMTOKEN id addressible from inline and string values",
"type": "object",
"patternProperties": { "^[-._:A-Za-z0-9]+$": { "type": "string" } },
"additionalProperties": false
},
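As a rough illustration of what the proposed object form accepts, the following Python sketch checks a candidate originalData value by hand, using only the standard library. The names `check_original_data` and `array_to_object` are hypothetical helpers, not part of any JLIFF tooling; the character class is copied from the proposed `patternProperties`, and `re.fullmatch` stands in for the `^...$` anchors.

```python
import re

# Character class taken from the proposed schema's patternProperties key.
ID_PATTERN = re.compile(r"[-._:A-Za-z0-9]+")


def check_original_data(original_data):
    """Return a list of problems; an empty list means the object conforms."""
    if not isinstance(original_data, dict):
        return ["originalData must be an object"]
    problems = []
    for key, value in original_data.items():
        # additionalProperties: false means every key must match the pattern.
        if not ID_PATTERN.fullmatch(key):
            problems.append(f"key {key!r} does not match the id pattern")
        if not isinstance(value, str):
            problems.append(f"value for {key!r} is not a string")
    return problems


def array_to_object(entries):
    """Convert the earlier array form [{id, data}, ...] to the key-value form."""
    return {entry["id"]: entry["data"] for entry in entries}


converted = array_to_object([
    {"id": "d1", "data": "[C1/]"},
    {"id": "d2", "data": "[C2]"},
])
print(converted)
print(check_original_data(converted))
print(check_original_data({"bad key": "x"}))
```

In real use a JSON Schema validator would do this work; the hand-rolled check is only meant to show which inputs the two keywords admit and reject.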
dF to implement the agreed solution in prose by June review
The proposed schema change is committed.
Looking at this in the schema
"originalData": {
"description": "A collection of key-value pairs with NMTOKEN id addressible from inline and string values",
"type": "object",
"patternProperties": { "^[-._:A-Za-z0-9]+$": { "type": "string" } },
"additionalProperties": false
I think this is still wrong. IMHO it should be "type": "array", with the indicated pattern for each of the "items" and "minItems": 1. originalData should be a unit-level collection/array of individually addressable original data strings.
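One practical difference between the two shapes is worth noting when weighing this comment. In an array of `{id, data}` entries, nothing in the structure itself stops two entries from sharing an id, and JSON Schema's `uniqueItems` compares whole items, so it would not catch entries with the same id but different data. The sketch below shows the extra check an array form would need; `duplicate_ids` is a hypothetical helper and the sample data is made up for illustration.

```python
from collections import Counter


def duplicate_ids(entries):
    """Return ids that occur more than once in an array-form originalData."""
    counts = Counter(entry["id"] for entry in entries)
    return sorted(id_ for id_, n in counts.items() if n > 1)


entries = [
    {"id": "d1", "data": "[C1/]"},
    {"id": "d2", "data": "[C2]"},
    {"id": "d1", "data": "[/C2]"},  # same id, different data
]
print(duplicate_ids(entries))  # ['d1']
```

With the key-value object form, a parsed object holds one value per key, so this check is not needed after parsing.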
The vocabulary object solution was intended; a corresponding originalDataDir vocabulary object was created to address directionality.
@genivia-inc to implement originalDataDir in the schema
The JLIFF 2.0 and 2.1 schemas are updated with the addition of an originalDataDir object.
Implemented in prose, to be built and committed today.
close with #47