apertium / lttoolbox

Finite state compiler, processor and helper tools used by apertium
http://wiki.apertium.org/wiki/Lttoolbox
GNU General Public License v2.0

Add the possibility of defining morpheme boundaries #91

Closed ftyers closed 4 years ago

ftyers commented 4 years ago

The compiler and expander now allow <m/> inside the <l></l> side of an entry to mark morpheme boundaries. By default the boundaries are neither compiled into the transducer nor printed; pass the -m or --keep-boundaries option to keep them. This closes #89.

ftyers commented 4 years ago

With lt-comp:

$ ./lttoolbox/lt-comp -m lr ./tests/data/morpheme-boundaries.dix morpheme-boundaries.bin
main@standard 26 34
$ ./lttoolbox/lt-print morpheme-boundaries.bin
0   1   c   c   0.000000    
0   2   a   a   0.000000    
0   3   w   w   0.000000    
0   4   r   r   0.000000    
0   4   b   b   0.000000    
1   5   a   a   0.000000    
1   6   h   h   0.000000    
2   7   x   x   0.000000    
3   8   o   o   0.000000    
4   5   a   a   0.000000    
5   9   t   t   0.000000    
6   10  u   u   0.000000    
7   11  >   i   0.000000    
8   12  l   l   0.000000    
9   13  ε   <n> 0.000000    
9   14  >   <n> 0.000000    
10  15  r   r   0.000000    
11  16  i   s   0.000000    
11  17  e   s   0.000000    
12  18  f   f   0.000000    
12  19  v   f   0.000000    
13  20  ε   <sg>    0.000000    
14  20  s   <pl>    0.000000    
15  21  c   c   0.000000    
16  13  s   <n> 0.000000    
17  22  s   <n> 0.000000    
18  13  ε   <n> 0.000000    
19  23  >   <n> 0.000000    
21  24  h   h   0.000000    
22  20  ε   <pl>    0.000000    
23  25  e   <pl>    0.000000    
24  13  ε   <n> 0.000000    
24  23  >   <n> 0.000000    
25  20  s   ε   0.000000    
20  0.000000
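
To check where the boundaries ended up in the compiled transducer, the lt-print listing above can be filtered for arcs whose input symbol is >. The following is only a rough sketch: boundary_arcs is a hypothetical helper, and it assumes whitespace-separated columns in the order source, target, input, output, weight, as in the listing. It cannot tell a boundary from a literal > character in a dictionary.

```python
def boundary_arcs(att_text, boundary=">"):
    """Return (source, target, output_symbol) for arcs whose input symbol is the boundary."""
    arcs = []
    for line in att_text.splitlines():
        fields = line.split()
        # Arc lines have five fields: source, target, input, output, weight.
        # Final-state lines have only two (state, weight) and are skipped.
        if len(fields) == 5 and fields[2] == boundary:
            arcs.append((int(fields[0]), int(fields[1]), fields[3]))
    return arcs


# A few lines copied from the lt-print output above.
sample = """\
7   11  >   i   0.000000
8   12  l   l   0.000000
9   14  >   <n> 0.000000
19  23  >   <n> 0.000000
24  23  >   <n> 0.000000
20  0.000000
"""

for source, target, output in boundary_arcs(sample):
    print(source, target, output)
```

Run over the full listing, this should pick out the five arcs with > on the input side (7→11, 9→14, 19→23, 24→23 and, with ε-free output, none on the output side), which matches the boundaries shown by lt-expand -m.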

And with lt-expand:

$ ./lttoolbox/lt-expand ./tests/data/morpheme-boundaries.dix 
cat:cat<n><sg>
cats:cat<n><pl>
wolf:wolf<n><sg>
wolves:wolf<n><pl>
church:church<n><sg>
churches:church<n><pl>
bat:bat<n><sg>
bats:bat<n><pl>
rat:rat<n><sg>
rats:rat<n><pl>
axis:axis<n><sg>
axes:axis<n><pl>
$ ./lttoolbox/lt-expand -m ./tests/data/morpheme-boundaries.dix 
cat:cat<n><sg>
cat>s:cat<n><pl>
wolf:wolf<n><sg>
wolv>es:wolf<n><pl>
church:church<n><sg>
church>es:church<n><pl>
bat:bat<n><sg>
bat>s:bat<n><pl>
rat:rat<n><sg>
rat>s:rat<n><pl>
ax>is:axis<n><sg>
ax>es:axis<n><pl>
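
The two lt-expand listings differ only in the > markers on the surface side, so the -m output can be post-processed into a morpheme segmentation. A minimal sketch, assuming each line has the form surface:analysis with no colon in the surface form; split_entry is a hypothetical helper, not part of lttoolbox.

```python
def split_entry(line, boundary=">"):
    """Split one 'surface:analysis' line into (morphemes, plain_surface, analysis)."""
    # Only the surface side is split: the analysis side contains '>' as part
    # of tags such as <n> and <pl>, which must be left alone.
    surface, _, analysis = line.strip().partition(":")
    morphemes = surface.split(boundary)
    return morphemes, "".join(morphemes), analysis


for entry in ["cat:cat<n><sg>", "wolv>es:wolf<n><pl>", "ax>is:axis<n><sg>"]:
    morphemes, plain, analysis = split_entry(entry)
    print(morphemes, plain, analysis)
```

Joining the morphemes back together gives the surface form printed without -m, for example wolv>es becomes wolves.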