Open zrailing opened 5 years ago
Another option is to use a "diacritic" symbol on the morphotactic side, e.g.
dáschii:dáschii%{¬%} N ;
bítchii:bítchii%{¬%} N ;
chií:chií N ;
isshíi:isshíi N ;
Then have rules something like:
"Stem ablaut in long vowels (1)"
Vx:Vy <=> _ Vx:Vy %{¬%}: %>: ;
where Vx in ( i e e )
Vy in ( a i a )
matched ;
"Stem ablaut in long vowels (2)"
Vx:Vy <=> Vx:Vy _ %{¬%}: %>: ;
where Vx in ( i e e )
Vy in ( a i a )
matched ;
To process the pair string:
d á s c h i i {¬} > u
d á s t 0 a a 0 0 u
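To make the alignment above easier to check by hand, here is a rough Python simulation of what the two rules do to a lexical symbol list. It is only a sketch, not twolc semantics: the names `ABLAUT` and `apply_ablaut` are made up for illustration, only one mapping is kept for `e`, and the `c:t` and `h:0` pairs in the example are assumed to come from some other rule, so they are left alone here.

```python
# Hypothetical sketch: simulate the two ablaut rules on a lexical symbol list.
# Only one target is kept for "e" (e -> i); e -> a is left out on purpose.
ABLAUT = {"i": "a", "e": "i"}
DIACRITIC = "{¬}"
BOUNDARY = ">"


def apply_ablaut(lexical):
    """Return surface symbols aligned one-to-one with `lexical`."""
    surface = list(lexical)
    for k, sym in enumerate(lexical):
        if sym != DIACRITIC:
            continue
        # Context shared by both rules: V V {¬} >
        has_boundary = k + 1 < len(lexical) and lexical[k + 1] == BOUNDARY
        has_long_vowel = (
            k >= 2
            and lexical[k - 1] == lexical[k - 2]
            and lexical[k - 1] in ABLAUT
        )
        if has_boundary and has_long_vowel:
            surface[k - 2] = ABLAUT[lexical[k - 2]]  # rule (1): first vowel
            surface[k - 1] = ABLAUT[lexical[k - 1]]  # rule (2): second vowel
    # The diacritic and the morpheme boundary are realised as 0.
    return ["0" if s in (DIACRITIC, BOUNDARY) else s for s in surface]


lexical = "d á s c h i i {¬} > u".split()
print(" ".join(lexical))
print(" ".join(apply_ablaut(lexical)))
```

A stem without the diacritic (like the `chií` entry) passes through unchanged, which is the point of marking only the ablauting stems.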
Of course, the ambiguity between e → i and e → a is a problem: Vx lists e twice, so the matched pairs e:i and e:a compete in the same context. That could be solved either by using a separate diacritic symbol for each class, or by giving the minority class of the two its own archiphoneme.
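As a quick illustration of the "separate symbol" option, the sketch below keys the vowel mapping on the diacritic carried by the stem, so `e` is never ambiguous within one class. Everything here is an assumption for illustration: the second symbol `{¬a}`, its assignment to the ee → aa class, the names `ABLAUT_BY_MARK` and `ablaut_stem`, and the placeholder stem `Cee`, which is not a real word.

```python
# Hypothetical symbols: "{¬}" for ii -> aa and ee -> ii, "{¬a}" for ee -> aa.
# Which class really deserves its own symbol is an open question.
ABLAUT_BY_MARK = {
    "{¬}": {"i": "a", "e": "i"},
    "{¬a}": {"e": "a"},
}


def ablaut_stem(stem, mark):
    """Ablaut a stem-final long vowel according to the stem's diacritic."""
    table = ABLAUT_BY_MARK[mark]
    if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] in table:
        return stem[:-2] + table[stem[-1]] * 2
    return stem  # no stem-final long vowel covered by this class


print(ablaut_stem("dáschii", "{¬}"))
print(ablaut_stem("Cee", "{¬}"))
print(ablaut_stem("Cee", "{¬a}"))
```

In twolc terms this would correspond to one rule set per diacritic, each with an unambiguous `where ... matched` list.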
Stem Ablaut (Graczyk 2.5.10)
This appears to affect both nouns and verbs:
However, there are some that are not affected:
The fact that Graczyk specifically mentions marking them differently in the lexicon suggests pretty clearly that they should be in separate lexicons within the analyzer. That said, should all of the stem ablaut types (ii → aa, ee → ii, and ee → aa) be lumped into the same lexicon, or should they be split into three? Further, should they be split by both ablaut type and POS, or not?