oasis-tcs / xliff-omos-jliff

OASIS XLIFF OMOS TC: JSON serialization of the XLIFF Abstract Object Model
https://github.com/oasis-tcs/xliff-omos-jliff

Extension mechanism #52

Open DavidFatDavidF opened 2 years ago

DavidFatDavidF commented 2 years ago

Need to adapt the XLIFF XML-based extension mechanism prose to the userdata-based extension mechanism defined for JLIFF.

philr-vistatec commented 2 years ago

My implementation doesn't currently handle extensions (other than providing a userdata property on unit, etc.), but I guess it should EITHER

(a) convert any proprietary-namespace XML nodes with a generic algorithm that turns each parent XML element into a JSON object and each child XML element or XML attribute into a JSON property, OR (b) not attempt conversion and just add a placeholder.

For example, if there is a proprietary type like a tooltip it might have the following definitions in XLIFF and JLIFF:

XLIFF

<xliff xmlns='urn:oasis:names:tc:xliff:document:2.0' version='2.0'
    srcLang='en' trgLang='fr' xmlns:my='http://example.com/userextension/1.0'>
  <file id='f1'>
    <notes>
      <note id='n1'>note for file.</note>
    </notes>
    <unit id='u1'>
      <my:tooltip>
        <my:forLabel id='firstName'>
          <my:text>Please enter your given first name</my:text>
          <my:vertPosition>above</my:vertPosition>
          <my:horizPosition>center</my:horizPosition>
        </my:forLabel>
      </my:tooltip>
      <notes>
        <note id='n1'>note for unit</note>
      </notes>
      <segment id='s1'>
        <source><pc id='1'>Hello <mrk id='m1' type='term'>World</mrk>!</pc>
            </source>
        <target><pc id='1'>Bonjour le <mrk id='m1' type='term'>Monde</mrk>!</pc>
            </target>
      </segment>
    </unit>
  </file>
</xliff>

JLIFF (a)

{ "jliff": "2.1",
  "@context": {
      "my": "http://example.com/userextension/1.0"
  },
  "srcLang": "en-US",
  "trgLang": "fr-FR",
  "files": [
    { "id": "f1",
      "kind": "file",
      "notes": [
        { "id": "n1",
          "text": "note for file." }
      ],
      "subfiles": [
        { "canResegment": "no",
          "id": "u1",
          "kind": "unit",
          "notes": [
            { "id": "n1",
              "text": "note for unit" }
          ],
          "userdata": {
            "my:tooltip": {
              "my:forLabel": {
                "my:id": "firstName",
                "my:text": "Please enter your given first name",
                "my:vertPosition": "above",
                "my:horizPosition": "center"
              }
            }
          },
          "subunits": [
            { "id": "s1",
              "kind": "segment",
              "source": [
                { "id": "1",
                  "kind": "sc" },
                { "text": "Hello " },
                { "id": "m1",
                  "kind": "sm",
                  "type": "term" },
                { "text": "World" },
                { "kind": "em",
                  "startRef": "m1" },
                { "text": "!" },
                { "kind": "ec",
                  "startRef": "1" }
              ],
              "target": [
                { "id": "1",
                  "kind": "sc" },
                { "text": "Bonjour le " },
                { "id": "m1",
                  "kind": "sm",
                  "type": "term" },
                { "text": "Monde" },
                { "kind": "em",
                  "startRef": "m1" },
                { "text": "!" },
                { "kind": "ec",
                  "startRef": "1" }
              ] }
          ] }
      ] }
  ]
}
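
As an illustration only, the generic mapping behind serialization (a) could be sketched as below. Python is an arbitrary choice, and the names `element_to_json`, `unit_userdata`, `SAMPLE` and `PREFIXES` are hypothetical; nothing here is taken from an existing implementation. The sketch assumes that unqualified attributes inherit the prefix of their element (which is how `id` becomes `my:id` above), that leaf elements collapse to their text, and that an extension element never repeats a child name or mixes text with child elements.

```python
import json
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"

def prefixed(tag, prefixes, fallback_prefix=None):
    """Turn '{uri}local' into 'prefix:local' using a uri -> prefix map."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return f"{prefixes[uri]}:{local}"
    # Assumption: an unqualified attribute takes its element's prefix.
    return f"{fallback_prefix}:{tag}" if fallback_prefix else tag

def element_to_json(elem, prefixes):
    """Element -> object; attributes and child elements -> properties."""
    if len(elem) == 0 and not elem.attrib:
        return (elem.text or "").strip()  # leaf element collapses to text
    own_prefix = prefixes[elem.tag[1:].split("}", 1)[0]]
    obj = {}
    for name, value in elem.attrib.items():
        obj[prefixed(name, prefixes, own_prefix)] = value
    for child in elem:
        # Repeated child names would overwrite each other in this sketch.
        obj[prefixed(child.tag, prefixes)] = element_to_json(child, prefixes)
    return obj

def unit_userdata(unit, prefixes):
    """Collect every non-XLIFF child element of a unit into one object."""
    userdata = {}
    for child in unit:
        if not child.tag.startswith("{" + XLIFF_NS + "}"):
            userdata[prefixed(child.tag, prefixes)] = element_to_json(child, prefixes)
    return userdata

SAMPLE = """<unit xmlns='urn:oasis:names:tc:xliff:document:2.0'
      xmlns:my='http://example.com/userextension/1.0' id='u1'>
  <my:tooltip>
    <my:forLabel id='firstName'>
      <my:text>Please enter your given first name</my:text>
      <my:vertPosition>above</my:vertPosition>
      <my:horizPosition>center</my:horizPosition>
    </my:forLabel>
  </my:tooltip>
  <segment id='s1'><source>Hello</source></segment>
</unit>"""

PREFIXES = {"http://example.com/userextension/1.0": "my"}
userdata = unit_userdata(ET.fromstring(SAMPLE), PREFIXES)
print(json.dumps(userdata, indent=2))
```

An extension element in a namespace missing from `PREFIXES` raises `KeyError` here; a converter following option (b) would presumably catch that case and emit a placeholder instead.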

JLIFF (b)

{ "jliff": "2.1",
  "@context": {
      "my": "http://example.com/userextension/1.0"
  },
  "srcLang": "en-US",
  "trgLang": "fr-FR",
  "files": [
    { "id": "f1",
      "kind": "file",
      "notes": [
        { "id": "n1",
          "text": "note for file." }
      ],
      "subfiles": [
        { "canResegment": "no",
          "id": "u1",
          "kind": "unit",
          "notes": [
            { "id": "n1",
              "text": "note for unit" }
          ],
          "userdata": {
            "my:tooltip": {
              "my:id": "firstName"
            }
          },
          "subunits": [
            { "id": "s1",
              "kind": "segment",
              "source": [
                { "id": "1",
                  "kind": "sc" },
                { "text": "Hello " },
                { "id": "m1",
                  "kind": "sm",
                  "type": "term" },
                { "text": "World" },
                { "kind": "em",
                  "startRef": "m1" },
                { "text": "!" },
                { "kind": "ec",
                  "startRef": "1" }
              ],
              "target": [
                { "id": "1",
                  "kind": "sc" },
                { "text": "Bonjour le " },
                { "id": "m1",
                  "kind": "sm",
                  "type": "term" },
                { "text": "Monde" },
                { "kind": "em",
                  "startRef": "m1" },
                { "text": "!" },
                { "kind": "ec",
                  "startRef": "1" }
              ] }
          ] }
      ] }
  ]
}

genivia-inc commented 2 years ago

Both JLIFF (a) and (b) serializations look correct in principle to me, assuming that parts of an extension may or may not be supported by a recipient, or that no proper JSON serialization is defined. Is that the concern?

Serialization (a) matches the XLIFF definition, since this is a generic JSON serialization of <my:tooltip> based on a simple form of mapping. Perhaps the mapping algorithm should be left to the implementors to describe, define and implement.

DavidFatDavidF commented 2 years ago

I agree with @philr-vistatec that roundtripping XLIFF extension data via JLIFF is not trivial. In theory the extension owner should be in control of how their XML translates into JLIFF. The JLIFF extension mechanism only gives two constraints:

  1. use the "userdata" wrapper and
  2. the content of the wrapper must be an object

In simple scenarios such as the above my: extension example, algorithmic transposition seems possible, but even here the theoretical owner could have chosen a different JSON structure. I guess we should have a discussion on extension roundtripping on 21st Dec. The main scenarios seem to be:

  1. roundtripping a known extension
  2. strategies for roundtripping or dropping unknown extensions

genivia-inc commented 2 years ago

@DavidFatDavidF

> I agree with @philr-vistatec that roundtripping XLIFF extension data via JLIFF is not trivial. In theory the extension owner should be in control of how their XML translates into JLIFF. The JLIFF extension mechanism only gives two constraints:
>
>   1. use the "userdata" wrapper and
>   2. the content of the wrapper must be an object

Yes, there is no other choice but to adhere to these two structural constraints (defined in the schema).
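
For illustration, the two structural constraints could be checked roughly as follows. This is a hedged Python sketch, not the schema itself: the function name `check_extension_constraints` is hypothetical, and treating any property whose name contains a colon as extension data that belongs inside "userdata" is an assumption made for the example.

```python
def check_extension_constraints(holder):
    """Return a list of problems found on one JLIFF object (empty list = OK)."""
    problems = []
    for key in holder:
        # Assumption: a colon in a property name marks a prefixed extension name.
        if ":" in key:
            problems.append(f"extension property {key!r} is outside 'userdata'")
    # Constraint 2: the content of the wrapper must be an object.
    if "userdata" in holder and not isinstance(holder["userdata"], dict):
        problems.append("'userdata' must be a JSON object")
    return problems

good_unit = {"id": "u1", "kind": "unit",
             "userdata": {"my:tooltip": {"my:id": "firstName"}}}
bad_unit = {"id": "u1", "kind": "unit",
            "my:tooltip": {"my:id": "firstName"},
            "userdata": ["my:tooltip"]}

print(check_extension_constraints(good_unit))
print(check_extension_constraints(bad_unit))
```

The check is purely structural; it says nothing about whether the content of an extension is meaningful to the recipient.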

> In simple scenarios such as the above my: extension example, algorithmic transposition seems possible, but even here the theoretical owner could have chosen a different JSON structure. I guess we should have a discussion on extension roundtripping on 21st Dec. The main scenarios seem to be:
>
>   1. roundtripping a known extension
>   2. strategies for roundtripping or dropping unknown extensions

I would be surprised if this is not already addressed within the scope of XLIFF. If it is, why place these semantic aspects in the scope of JLIFF specifically? If it is not, then both the XLIFF and JLIFF specs may leave it open to implementors to define appropriate semantics and structures for user extensions. The only structural constraints on the extension content are then the well-formedness rules of the XML and JSON standards.

DavidFatDavidF commented 2 years ago

Based on today's discussion we clarified that the "userdata" wrapper is a non-empty vocabulary of objects. One wrapper will serve as a bag for extension data from any and all authorities utilizing the given extension point. Each object within the bag is recognizable as belonging to a specific extension by a specific extension owner authority using the fully qualified names mechanism defined by JSON-LD, i.e. colon-separated extension prefixes, each to be expanded with its globally defined context URI.
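
A rough Python sketch of how a recipient might sort the contents of one "userdata" bag by owner authority, under the reading above. The function `group_by_authority`, the second prefix `other`, its URI and the `badge` object are all hypothetical. Grouping by context URI is a simplification chosen for the example; JSON-LD expansion proper would instead concatenate the URI and the local name.

```python
def group_by_authority(userdata, context):
    """Map each context URI to the extension objects it owns."""
    if not isinstance(userdata, dict) or not userdata:
        raise ValueError("'userdata' must be a non-empty object")
    groups = {}
    for name, value in userdata.items():
        prefix, sep, local = name.partition(":")
        if not sep or prefix not in context:
            raise ValueError(f"cannot resolve extension name {name!r}")
        if not isinstance(value, dict):
            raise ValueError(f"extension member {name!r} must be an object")
        groups.setdefault(context[prefix], {})[local] = value
    return groups

context = {
    "my": "http://example.com/userextension/1.0",
    "other": "http://example.org/other/1.0",  # hypothetical second authority
}
userdata = {
    "my:tooltip": {"my:id": "firstName"},
    "other:badge": {"other:color": "red"},    # hypothetical extension object
}
groups = group_by_authority(userdata, context)
for uri, members in groups.items():
    print(uri, sorted(members))
```

A recipient that knows only one of the two URIs could process that group and apply whatever strategy it has for unknown extensions to the rest.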

DavidFatDavidF commented 2 years ago

We also agreed that extension data roundtrip can only be guaranteed if the extension has both (XML and JSON) schemas defined and published.

genivia-inc commented 2 years ago

I propose the addition of a "mustUnderstand" property to JLIFF (and XLIFF) "userdata" extension objects. The meaning of this new property when present:

When the "mustUnderstand" property is not present, it defaults to "mustUnderstand": false.

philr-vistatec commented 2 years ago

Leaving this open for the moment until desired behavior is documented within the specification.