computerline1z / okapi

Automatically exported from code.google.com/p/okapi

ITS standoff annotations should be inside file #363

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Currently the ITS standoff annotations for LQI and provenance are placed at the
end of the document, just before </xliff>.

This causes issues when the document is split per <file> (e.g. with the XLIFF
Splitter step): the annotations do not follow.

There is also an issue in the 1.2 specification and schema: they contradict
each other as to where the extended elements can be placed within the <xliff>
element. See: https://lists.oasis-open.org/archives/xliff/201308/msg00062.html
for details.
Because the schema wins, the ITS entries should really be placed before the
first <file> if they are to stay outside <file>.

Original issue reported on code.google.com by yves.sav...@gmail.com on 22 Aug 2013 at 5:42

GoogleCodeExporter commented 9 years ago
CC'ing Kevin.  We talked about this a bunch when he was working on the code, 
and there are some ugly cases.  XLIFF Splitters (which are common) are unlikely 
to be ITS-aware, since most of them are extremely simple.  This means if we 
want to support that use case, we will need to make sure that each <file> 
contains all the relevant ITS metadata for its own contents.  This conflicts
with the way ITS standoff works, where one standoff block can be referenced
from multiple locations in the file.  To fully support XLIFF splitting, it is
therefore sometimes necessary to rewrite existing ITS standoff so that there is
one copy for each referencing location, and then to move each copy inside the
relevant <file>.

When we were initially working on it, we didn't want to touch any of that 
stuff, so we went the simple route at the risk of breaking the XLIFF splitter.
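
For illustration only, the per-<file> rewrite described above could look
roughly like the sketch below. It is plain Python with ElementTree, not Okapi
code. The function name localize_standoff and the "-fN" id suffix are made up
for this example, and appending the copy as the last child of <file> is an
assumption; whether that position validates is a separate question.

```python
import copy
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# ITS 2.0 reference attribute -> the standoff element it points to.
STANDOFF_FOR_REF = {
    f"{{{ITS_NS}}}locQualityIssuesRef": f"{{{ITS_NS}}}locQualityIssues",
    f"{{{ITS_NS}}}provenanceRecordsRef": f"{{{ITS_NS}}}provenanceRecords",
}


def localize_standoff(root):
    """Give each <file> its own copy of the standoff its content references.

    Hypothetical helper: `root` is the parsed <xliff> element.
    """
    # Index and detach the standoff that sits directly under <xliff>.
    standoff = {}
    for child in list(root):
        if child.tag in STANDOFF_FOR_REF.values():
            standoff[child.get(XML_ID)] = child
            root.remove(child)

    for n, file_el in enumerate(root.findall(f"{{{XLIFF_NS}}}file"), start=1):
        copies = {}  # original id -> id of the copy made for this <file>
        # Snapshot the elements first, because copies are appended below.
        for el in list(file_el.iter()):
            for attr in STANDOFF_FOR_REF:
                ref = el.get(attr, "")
                old_id = ref[1:] if ref.startswith("#") else None
                if not old_id or old_id not in standoff:
                    continue
                if old_id not in copies:
                    clone = copy.deepcopy(standoff[old_id])
                    copies[old_id] = f"{old_id}-f{n}"  # made-up id scheme
                    clone.set(XML_ID, copies[old_id])
                    clone.tail = None
                    # Assumption: the copy goes at the end of <file>.
                    file_el.append(clone)
                # Point the reference at this file's own copy.
                el.set(attr, "#" + copies[old_id])
    return root
```

Standoff that nothing references is simply dropped by this sketch, which may
or may not be what you want.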

But it looks like you're right about the schema; we would need to move it to be 
before the first <file>, at least.

Original comment by tingley on 22 Aug 2013 at 5:27

GoogleCodeExporter commented 9 years ago
The ITS entries were placed at the end of the file because the list of ITS
standoff to be written out is built by the XLIFFSkeletonWriter while it
processes the document. So the standoff isn't known until the end of the
document, and it cannot be known earlier, since we need to resolve the ITS
references on the elements in the XLIFF file and detect any duplicate
references in the standoff.

We may need to parse the document twice and pass in the ITS standoff to be
written in the processStartDocument method. I'm worried this would force an
unusual usage pattern on the XLIFFSkeletonWriter, but I guess it would be
only for handling ITS metadata.
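
To make the two-pass idea concrete, here is a rough sketch using Python's
xml.sax rather than Okapi. It is not the XLIFFSkeletonWriter API: Recorder,
Rewriter and move_standoff_before_first_file are hypothetical names. It also
assumes the standoff uses the "its" prefix, that XLIFF elements are
unprefixed, and that the document has at least one <file>.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Assumption: the ITS namespace is bound to the "its" prefix.
STANDOFF = {"its:locQualityIssues", "its:provenanceRecords"}


class Recorder(xml.sax.ContentHandler):
    """Pass 1: record the SAX events that make up each standoff element."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.depth = 0  # > 0 while inside a standoff element

    def startElement(self, name, attrs):
        if self.depth or name in STANDOFF:
            self.depth += 1
            self.events.append(("start", name, dict(attrs)))

    def endElement(self, name):
        if self.depth:
            self.depth -= 1
            self.events.append(("end", name, None))

    def characters(self, content):
        if self.depth:
            self.events.append(("text", content, None))


class Rewriter(XMLGenerator):
    """Pass 2: copy the document, emitting the standoff before the first <file>."""

    def __init__(self, out, events):
        super().__init__(out, encoding="utf-8", short_empty_elements=True)
        self.pending = events  # set to None once replayed
        self.skip = 0          # > 0 while inside standoff at its old position

    def startElement(self, name, attrs):
        if self.skip or name in STANDOFF:
            self.skip += 1
            return
        if name == "file" and self.pending is not None:
            self._replay()
        super().startElement(name, attrs)

    def endElement(self, name):
        if self.skip:
            self.skip -= 1
            return
        super().endElement(name)

    def characters(self, content):
        if not self.skip:
            super().characters(content)

    def _replay(self):
        # Write out the events recorded in pass 1, then mark them as done.
        for kind, value, attrs in self.pending:
            if kind == "start":
                super().startElement(value, attrs)
            elif kind == "end":
                super().endElement(value)
            else:
                super().characters(value)
        self.pending = None


def move_standoff_before_first_file(data: bytes) -> str:
    recorder = Recorder()
    xml.sax.parseString(data, recorder)                        # pass 1
    out = io.StringIO()
    xml.sax.parseString(data, Rewriter(out, recorder.events))  # pass 2
    return out.getvalue()
```

The second pass only needs the recorded standoff up front, which is the same
constraint as handing it to the writer at the start of the document.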

Original comment by ke...@spartansoftwareinc.com on 22 Aug 2013 at 7:05

GoogleCodeExporter commented 9 years ago
Maybe that is not worth the trouble.
The issue occurs only because the schema doesn't match the spec. So it's only
a problem when one wants to validate the XLIFF document.

I'll put this as a low-priority issue.

Original comment by yves.sav...@gmail.com on 22 Aug 2013 at 7:21

GoogleCodeExporter commented 9 years ago
See also Issue 396 - doing it per-file is not valid XLIFF.

Original comment by tingley on 7 Apr 2014 at 11:58