Closed GoogleCodeExporter closed 9 years ago
It's not a problem to implement it. But my goal is to stay as transparent as
possible: all XML special characters (&, <, ...) are encoded as entities, like
the example above (&lt;, &amp;, ...).
Original comment by xhu...@gmail.com
on 3 Mar 2011 at 9:17
Original comment by Achi...@gmail.com
on 3 Mar 2011 at 12:34
If you are going with this logic you would also have to escape ' and "
characters:
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML
But this would directly affect the decoding (e.g. "Don 't").
So I propose to only escape characters that cause problems in the Moses decoder
(<,>,[,],|). To clarify: the <,> around the inline elements should stay, as
they get removed before the content is run through the decoder.
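
A minimal sketch of this proposal, in Python for illustration only (it is not the code from the tool itself). The function names, the regular expression used to recognise inline element tags, and the choice of numeric references for [, ] and | are all assumptions. The tag pattern is a heuristic, so text such as "a <b and c> d" would be mistaken for a tag.

```python
import re

# Assumed mapping: only the characters named above as problematic for the
# Moses decoder. &, ' and " are deliberately left untouched.
MOSES_ESCAPES = {
    "<": "&lt;",
    ">": "&gt;",
    "[": "&#91;",
    "]": "&#93;",
    "|": "&#124;",
}

# Assumed shape of an inline element tag, e.g. <g id="1">, </g>, <x id="2"/>.
INLINE_TAG = re.compile(r"</?[A-Za-z][\w:-]*(?:\s[^<>]*)?/?>")


def escape_text(text):
    """Escape the problematic characters in plain text (no tags inside)."""
    return "".join(MOSES_ESCAPES.get(ch, ch) for ch in text)


def escape_for_moses(segment):
    """Escape text between inline tags; copy the tags themselves unchanged."""
    parts = []
    pos = 0
    for match in INLINE_TAG.finditer(segment):
        parts.append(escape_text(segment[pos:match.start()]))
        parts.append(match.group(0))  # the <,> around inline elements stay
        pos = match.end()
    parts.append(escape_text(segment[pos:]))
    return "".join(parts)
```

With this sketch, a segment like `<g id="1">a | b [c]</g>` keeps its `<g>` tags while the |, [ and ] inside the text are escaped, and an apostrophe as in "Don't" passes through unchanged.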
Original comment by Achi...@gmail.com
on 3 Mar 2011 at 3:51
done in r.62
line 172
Original comment by xhu...@gmail.com
on 11 Mar 2011 at 10:01
Emits warning now:
WARNING: incorrectly created original XLIFF. String: "& Check" should be
wrapped in special tags.
The warning doesn't seem to be necessary in this case; it should only apply to
escaped XML/HTML-style tags.
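
A rough sketch of the narrower check, in Python for illustration only. The pattern and the function name are hypothetical and not taken from the tool; the idea is simply that a bare & does not trigger the warning, while an escaped tag does.

```python
import re

# Assumed pattern for an escaped XML/HTML-style tag,
# e.g. &lt;b&gt; or &lt;/span&gt; or &lt;a href=&quot;x&quot;&gt;.
ESCAPED_TAG = re.compile(r"&lt;/?[A-Za-z].*?&gt;")


def needs_wrapping_warning(text):
    """Return True only if the string contains an escaped tag-like sequence."""
    return ESCAPED_TAG.search(text) is not None
```

Under this check, the string "& Check" from the warning above would no longer be flagged.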
Original comment by Achi...@gmail.com
on 14 Mar 2011 at 2:18
Original issue reported on code.google.com by
Achi...@gmail.com
on 3 Mar 2011 at 1:45