apertium / lttoolbox

Finite state compiler, processor and helper tools used by apertium
http://wiki.apertium.org/wiki/Lttoolbox
GNU General Public License v2.0

Support for `ANY_CHAR` in regular dix files? #163

Open unhammer opened 2 years ago

unhammer commented 2 years ago

Then we could have bidix entries like `<e><i><w/><s n="guess"/></i></e>`.

This would require changes to the compiler to allow it, and presumably to trim and compose as well, and perhaps also to fst_processor?

mr-martian commented 2 years ago

Trim already supports it, and compose uses the same equality check as trim, so it should support it too. The compiler just needs lines 420 and 424 changed so that it no longer checks is_separable, so really only fst_processor needs non-trivial work.
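
For readers unfamiliar with what a wildcard-aware equality check involves, here is a toy sketch of the general idea. It is not lttoolbox's code: the symbol representation (plain strings), the `<ANY_CHAR>` spelling, and the function names `symbols_match` and `sequences_match` are all assumptions made for illustration. The sketch assumes the wildcard should match any single character but never a tag.

```python
# Toy model only: symbols are plain strings, and tags are written "<name>".
# ANY_CHAR and both function names are hypothetical, chosen for this sketch.
ANY_CHAR = "<ANY_CHAR>"


def symbols_match(a: str, b: str) -> bool:
    """Return True if two symbols are equal, or if one is the wildcard
    and the other is a single ordinary character (not a tag)."""
    if a == b:
        return True
    if a == ANY_CHAR:
        return len(b) == 1
    if b == ANY_CHAR:
        return len(a) == 1
    return False


def sequences_match(left, right) -> bool:
    """Compare two equal-length symbol sequences position by position."""
    return len(left) == len(right) and all(
        symbols_match(a, b) for a, b in zip(left, right)
    )
```

Under these assumptions, a pattern such as `[ANY_CHAR, "<guess>"]` matches `["k", "<guess>"]` but not `["k", "<n>"]`, which is roughly the behaviour the proposed bidix entry relies on. The processing side would additionally need to decide how a wildcard arc competes with an exact-character arc, which this sketch does not cover.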