giellalt / template-lang-und

A template repo for new languages, as well as to update existing language repos with.
https://giellalt.uit.no/
GNU Lesser General Public License v3.0

lang-sms norm analyzer and generator do not normalize dict/pedagogical spelling #23

Open rueter opened 5 months ago

rueter commented 5 months ago

generator-gt-norm.hfstol produces the same extra letters that are found in generator-dict-gt-norm.hfst. That is, two filters are not being applied: the one that removes the modifier letter vertical line (src/fst/filters/remove-modifier-letter-vertical-line.regex) and the one that changes e with dot below to plain e (remove-letter-dot-below.regex). They are not applied when the analyzer is built either.
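To illustrate the expected effect of the two filters on a surface string, here is a minimal Python sketch. It is not the HFST regex used in the build. The function name `normalize_dict_spelling` is hypothetical, and the handling of the capital letter and of the decomposed form (e plus combining dot below) is an assumption not stated in this report.

```python
import unicodedata

# Characters named in the report, looked up by Unicode name
# so that no code point is typed by hand.
VERTICAL_LINE = unicodedata.lookup("MODIFIER LETTER VERTICAL LINE")
E_DOT_BELOW = unicodedata.lookup("LATIN SMALL LETTER E WITH DOT BELOW")
# Assumption: the capital letter should be normalized the same way.
CAP_E_DOT_BELOW = unicodedata.lookup("LATIN CAPITAL LETTER E WITH DOT BELOW")

# Map the vertical line to nothing and the dotted e to plain e.
_TABLE = {
    ord(VERTICAL_LINE): None,
    ord(E_DOT_BELOW): "e",
    ord(CAP_E_DOT_BELOW): "E",
}


def normalize_dict_spelling(form: str) -> str:
    """Return the norm spelling for a dict/pedagogical spelling (hypothetical helper)."""
    # Assumption: compose first, so "e" + combining dot below is also caught.
    composed = unicodedata.normalize("NFC", form)
    return composed.translate(_TABLE)
```

A form generated by the norm generator should come out of this function unchanged; if it does not, it still contains one of the dict-only letters.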

[Screenshot 2024-01-22 at 13 20 16]

snomos commented 5 months ago

This is a regression after the move, cf #8, thus reported here.

flammie commented 5 months ago

Try now. There are updates in both giella-core and lang-sms.

rueter commented 5 months ago

Working fine! It would be nice, however, to also generate analyser-dict-gt-norm.hfstol for use in developing the Skolt description, now that we know it can be done :-).