zrailing / cro-morph

Morphological analyzer built for Crow based on HFST.

Stem Ablaut #3

Open zrailing opened 5 years ago

zrailing commented 5 years ago

Stem Ablaut (Graczyk 2.5.10)

There is a lexically conditioned alternation that affects stem-final long vowels that I term "stem ablaut". This alternation is triggered by the plural morpheme, the imperative, and a-initial suffixes. Since the alternation is lexically conditioned - there are a number of stems ending in long vowels that do not undergo ablaut - the stems that do ablaut must be marked as such in the lexicon.

This appears to affect both nouns and verbs:

However there are some that are not affected:

Graczyk's explicit mention of marking these stems in the lexicon suggests pretty clearly that they should go in separate lexicons within the analyzer. That said, should all of the stem ablaut types (ii → aa, ee → ii, and ee → aa) be lumped into the same lexicon, or should they be split into three? Further, should the lexicons be split by both ablaut type and POS, or not?
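
For concreteness, the fully split option might look something like the lexc sketch below. All lexicon names (`NounPlain`, `NounAblaut_ii_aa`, `N_Ablaut`) are made up for illustration, the continuation classes are empty placeholders, and the example stems are the four nouns cited elsewhere in this thread.

```
! Hypothetical layout: one stem lexicon per ablaut class.
LEXICON Root
NounStems ;

LEXICON NounStems
NounPlain ;
NounAblaut_ii_aa ;
! NounAblaut_ee_ii and NounAblaut_ee_aa would be added the same way.

LEXICON NounPlain            ! long-vowel stems that do not ablaut
chií N ;
isshíi N ;

LEXICON NounAblaut_ii_aa     ! stems with ii → aa
dáschii N_Ablaut ;
bítchii N_Ablaut ;

LEXICON N                    ! placeholder: regular noun suffixes
# ;

LEXICON N_Ablaut             ! placeholder: continuation for ablauting stems
# ;
```

Splitting by POS as well would repeat this pattern for verbs (e.g. a `VerbAblaut_ii_aa`), so the number of lexicons grows with every extra dimension.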

ftyers commented 5 years ago

Another option is to use a "diacritic" symbol on the morphotactic side, e.g.

dáschii:dáschii%{¬%} N ;
bítchii:bítchii%{¬%} N ; 
chií:chií N ;
isshíi:isshíi N ; 
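
One practical note on the entries above: for lexc to treat `{¬}` as a single symbol rather than three separate characters, it has to be declared under `Multichar_Symbols`. A minimal sketch of the surrounding file, assuming a `Root` lexicon, a `NounStems` lexicon, and a placeholder `N` continuation class (these names are guesses at the layout, not taken from the repo):

```
Multichar_Symbols
%{¬%}                      ! ablaut diacritic, one symbol

LEXICON Root
NounStems ;

LEXICON NounStems
dáschii:dáschii%{¬%} N ;   ! marked: undergoes ablaut
chií:chií N ;              ! unmarked: does not

LEXICON N                  ! placeholder for the noun suffixes
# ;
```

The twol `Alphabet` would then need a matching `%{¬%}:0` pair so the diacritic never shows up on the surface.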

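Then have rules something like

! (1) changes the first vowel of the long stem vowel,
!     in step with the second vowel.
"Stem ablaut in long vowels (1)"
Vx:Vy <=> _ Vx:Vy %{¬%}: %>: ;
   where Vx in ( i e e )
         Vy in ( a i a )
   matched ;

! (2) changes the second vowel, directly before the
!     diacritic and the morpheme boundary.
"Stem ablaut in long vowels (2)"
Vx:Vy <=> Vx:Vy _ %{¬%}: %>: ;
   where Vx in ( i e e )
         Vy in ( a i a )
   matched ;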

To process the pair string:

d á s c h i i {¬} > u
d á s t 0 a a  0  0 u

Of course the ambiguity between e → i and e → a is a problem, but that could be solved either by having a separate symbol, or by having a specific archiphoneme for the minority class of the two.
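
A rough sketch of the separate-symbol option, assuming a second, made-up diacritic `%{¬i%}` for the ee → ii stems while `%{¬%}` keeps marking the stems that go to aa. Which class is actually the minority is not settled here, so that assignment is arbitrary. The `Alphabet` is abridged to the symbols relevant to ablaut, and only the first-vowel rules are shown; the second-vowel rules would mirror them as in rule (2).

```
Alphabet
  a e i
  i:a e:a e:i
  %{¬%}:0 %{¬i%}:0 %>:0 ;   ! both diacritics and the boundary surface as zero

Rules

! ii → aa and ee → aa, triggered by the original diacritic
"Stem ablaut to aa, first vowel"
Vx:a <=> _ Vx:a %{¬%}: %>: ;
   where Vx in ( i e ) ;

! ee → ii, triggered by the hypothetical second diacritic
"Stem ablaut ee to ii, first vowel"
e:i <=> _ e:i %{¬i%}: %>: ;
```

With the two classes keyed to different symbols, the contexts for e:a and e:i no longer overlap, so the two `<=>` rules should not conflict. In lexc, `%{¬i%}` would also have to be declared under `Multichar_Symbols` and attached to the ee → ii stems.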