apertium / lttoolbox

Finite state compiler, processor and helper tools used by apertium
http://wiki.apertium.org/wiki/Lttoolbox
GNU General Public License v2.0

add fst-expander and paradigm generator #153

Closed mr-martian closed 2 years ago

mr-martian commented 2 years ago

This PR adds lt-paradigm, which is similar to lt-expand and hfst-expand, except that instead of listing all forms in the transducer, it lists only those that match an input pattern.

$ echo 'sing<vblex><*>' | lttoolbox/lt-paradigm ../apertium-data/apertium-eng/eng.autogen.bin 
sing<vblex><inf>:sing
sing<vblex><pres>:sing
sing<vblex><imp>:sing
sing<vblex><pprs>:singing
sing<vblex><ger>:singing
sing<vblex><subs>:singing
sing<vblex><pres><p3><sg>:sings
sing<vblex><pp>:sung
sing<vblex><past>:sang

For each line of the input, * is replaced with every letter in the alphabet and <*> with every tag. The resulting pattern is intersected with the FST, and the matches are printed line by line.
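To make the matching behaviour concrete, here is a rough Python sketch of the idea, not the actual lt-paradigm implementation. It swaps the real FST intersection for a regular-expression filter over a small hand-written list of analysis/surface pairs. The names `TOY_FST`, `tokenize`, `pattern_to_regex` and `paradigm` are all hypothetical. It also assumes that each wildcard matches a run of one or more symbols, which is consistent with the example output above (where `<*>` matches both `<inf>` and `<pres><p3><sg>`) but is not stated in this PR.

```python
import re

# Toy stand-in for the compiled transducer: (analysis, surface form) pairs.
# A real run would read these paths from the .bin file instead.
TOY_FST = [
    ("sing<vblex><inf>", "sing"),
    ("sing<vblex><pres><p3><sg>", "sings"),
    ("sing<vblex><past>", "sang"),
    ("song<n><sg>", "song"),
    ("song<n><pl>", "songs"),
]


def tokenize(s):
    """Split a string into symbols: a whole <tag> or a single character."""
    return re.findall(r"<[^<>]*>|.", s)


def pattern_to_regex(pattern):
    """Translate an input pattern into a regex over analysis strings.

    Assumption: '*' stands for one or more letters and '<*>' for one or
    more tags. Every other symbol must match literally.
    """
    parts = []
    for sym in tokenize(pattern):
        if sym == "*":
            parts.append(r"[^<>]+")        # any run of non-tag characters
        elif sym == "<*>":
            parts.append(r"(?:<[^<>]+>)+")  # any run of tags
        else:
            parts.append(re.escape(sym))
    return re.compile("".join(parts))


def paradigm(pattern, fst=TOY_FST):
    """Return 'analysis:surface' lines whose analysis matches the pattern."""
    rx = pattern_to_regex(pattern)
    return [f"{analysis}:{surface}" for analysis, surface in fst
            if rx.fullmatch(analysis)]


if __name__ == "__main__":
    for line in paradigm("sing<vblex><*>"):
        print(line)
```

With this toy data, the pattern `sing<vblex><*>` selects the three `sing` entries and `s*ng<n><*>` selects the two `song` entries. Filtering an enumerated list only works for a finite toy; intersecting the pattern with the transducer, as the PR does, avoids expanding every path first.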

This is nearly equivalent to hfst-regexp2fst | hfst-compose -1 - -2 $1 | hfst-expand -c 0, except that the input is tokenized differently and may contain multiple lines.

unhammer commented 2 years ago

NICE!