apertium / lttoolbox

Finite state compiler, processor and helper tools used by apertium
http://wiki.apertium.org/wiki/Lttoolbox
GNU General Public License v2.0

form is empty if followed by space and soft hyphen #50

Closed: unhammer closed this issue 2 years ago

unhammer commented 5 years ago

Unzip softhyph.zip to get the test input file, softhyph.

The surface form "i" is missing from the first lexical unit in the output below (it starts with ^/ instead of ^i/):

$ lt-proc -we nob-nno.automorf.bin < softhyph
^/i<pr>/ialphabet<n><m><sg><ind>$ ­^xyzzy/*xyzzy$

The third character in the input file is a soft hyphen (U+00AD, UTF-8 bytes C2 AD):

$ hexdump -C softhyph
00000000  69 20 c2 ad 78 79 7a 7a  79 0a                    |i ..xyzzy.|
0000000a
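
For anyone without the zip attachment, the input can be rebuilt from the hexdump above. This is only a minimal sketch: the byte values are copied from the dump, the output filename softhyph matches the command shown earlier, and nothing here is part of lttoolbox itself.

```python
import unicodedata

# Bytes copied verbatim from the hexdump: "i", space, soft hyphen (C2 AD), "xyzzy", newline.
data = bytes.fromhex("69 20 c2 ad 78 79 7a 7a 79 0a")

# Write the reproduction input next to the script.
with open("softhyph", "wb") as f:
    f.write(data)

text = data.decode("utf-8")

# Show the first four code points; index 2 is the soft hyphen.
for ch in text[:4]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The third line printed should read U+00AD SOFT HYPHEN, confirming that the character directly after the space is the one that triggers the bug.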

With the soft hyphen removed from the input, the output is as expected:

^i/i<pr>/ialphabet<n><m><sg><ind>$ ^xyzzy/*xyzzy$
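
The comparison input without the soft hyphen can be produced with a small filter like the following. It is a sketch for reproducing the "expected" case only; the helper name strip_soft_hyphens is hypothetical, and this is not how the fix in lttoolbox works.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD, encoded in UTF-8 as C2 AD


def strip_soft_hyphens(text: str) -> str:
    """Return text with every soft hyphen removed (hypothetical helper)."""
    return text.replace(SOFT_HYPHEN, "")


original = "i \u00adxyzzy\n"  # same content as the softhyph file
cleaned = strip_soft_hyphens(original)

# The cleaned text is one character (two UTF-8 bytes) shorter.
print(repr(cleaned))
print(len(original.encode("utf-8")), len(cleaned.encode("utf-8")))
```

Writing cleaned to a file and running the same lt-proc command on it gives the input for the expected-output case.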

unhammer commented 2 years ago

Fixed in https://github.com/apertium/lttoolbox/commit/848abfe5b0a6ff88e0f46fb5a4d0fc8d26ba5628