apertium / lttoolbox

Finite state compiler, processor and helper tools used by apertium
http://wiki.apertium.org/wiki/Lttoolbox
GNU General Public License v2.0

lt-tmxproc doesn't keep escaped input chars escaped in output #190

Open unhammer opened 4 weeks ago

unhammer commented 4 weeks ago
$ echo "3 > 2"  | apertium -f txt -d . nob-nno
3 > 2

$ echo "1 > 2"  | apertium -f txt -d . -m word.tmx  -o nob-nno nob-nno
Error: Malformed input stream.

$ echo "how about 1 < 2 then?"  | apertium -f txt -d . -m word.tmx  -o nob-nno nob-nno
*how *about 1

# possible workarounds:
$ echo "1 < 2"  | apertium -f html -d . -m word.tmx  -o nob-nno nob-nno
1 &lt; 2

$ echo "1 < 2"  | apertium -f html-noent -d . -m word.tmx  -o nob-nno nob-nno
1 &lt; 2

$ echo "1 < 2"  | apertium -f html-alt -d . -m word.tmx  -o nob-nno nob-nno
1 < 2

The contents of word.tmx are irrelevant; this is the file used above:

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE tmx SYSTEM "tmx14.dtd">
<tmx version="1.4">
  <header creationtool="Translate Toolkit - po2tmx" creationtoolversion="1.9.0"
          segtype="sentence" o-tmf="UTF-8" adminlang="nob" srclang="nno" datatype="PlainText"/>
  <body>
    <tu>
      <tuv xml:lang="nob">
        <seg>foo</seg>
      </tuv>
      <tuv xml:lang="nno">
        <seg>bar</seg>
      </tuv>
    </tu>
  </body>
</tmx>

unhammer commented 3 weeks ago

$ printf '1 \< 2.[][\n]'| ~/PREFIX/lttoolbox/bin/lt-tmxproc -s -z tiny.tmx.bin
1 < 2.[][
]
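
For anyone who wants to check a stream for this kind of breakage, here is a rough Python sketch (not part of lttoolbox) that lists reserved characters that appear without a backslash outside `[...]` superblanks. The reserved set is my assumption based on the Apertium stream format documentation, and `find_unescaped` is a hypothetical helper name. It is only meaningful at this point in the pipeline, before analysis, where no `^...$` lexical units are expected yet.

```python
# Assumed reserved characters of the Apertium stream format (not taken from
# the lttoolbox source; adjust if the set differs).
RESERVED = set("^$/<>{}\\*@#+~[]")


def find_unescaped(stream):
    """Return (offset, char) pairs for reserved characters that are neither
    backslash-escaped nor part of a [...] superblank."""
    hits = []
    i = 0
    in_blank = False
    while i < len(stream):
        c = stream[i]
        if c == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if in_blank:
            if c == "]":
                in_blank = False
        elif c == "[":
            in_blank = True
        elif c in RESERVED:
            hits.append((i, c))
        i += 1
    return hits


if __name__ == "__main__":
    # Input given to lt-tmxproc in the comment above: "<" is escaped.
    print(find_unescaped("1 \\< 2.[][\n]"))  # []
    # Output from lt-tmxproc: the escape is gone.
    print(find_unescaped("1 < 2.[][\n]"))    # [(2, '<')]
```

Run on the input and output from the last comment, it reports nothing for the input and flags the bare `<` at offset 2 in the output, which is the character the next stage in the pipeline would misread as the start of a tag.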