Open ghedlund opened 2 years ago
Because '~' and '+' are related conceptually in CHAT, '+' will also be a morpheme boundary marker.
The following example shows values from three tiers (identified by letter) and the corresponding alignments. Tokens are in the form A1, B1, etc., but in practice would be the values from our dictionary. '~' indicates a morpheme boundary within a word; spaces are word boundaries; and '*' is used as a word placeholder within the tier text.
Tier 1: A1 A2~A3 A4~A5~A6
Tier 2: B1 B2 B3~*~B4
Tier 3: * *~C1 C2~C3~C4
Alignments:
1) A1, B1, *
2) A2, B2, *
3) A3, *, C1
4) A4, B3, C2
5) A5, *, C3
6) A6, B4, C4
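The alignments above can be reproduced with a short sketch. This is not the proposed API, only an illustration in Python of one reading of the example: words are aligned by position across tiers, morphemes are aligned by position within each word, and any missing position is filled with '*'. The names `split_tier` and `align_morphemes` are hypothetical, and treating '+' the same as '~' follows the note about CHAT above.

```python
import re
from itertools import zip_longest

PLACEHOLDER = "*"
# Assumption: '~' and '+' are both treated as morpheme boundaries.
MORPHEME_BOUNDARY = re.compile(r"[~+]")


def split_tier(text):
    """Split tier text into words (on whitespace), then each word into morphemes."""
    return [MORPHEME_BOUNDARY.split(word) for word in text.split()]


def align_morphemes(tiers):
    """Return one tuple per aligned morpheme position, with one entry per tier.

    Words are paired by index across tiers; within each word group,
    morphemes are paired by index. Missing entries become PLACEHOLDER.
    """
    split = [split_tier(tier) for tier in tiers]
    alignments = []
    # A tier with fewer words contributes a single-placeholder word.
    for words in zip_longest(*split, fillvalue=[PLACEHOLDER]):
        # A word with fewer morphemes is padded with placeholders.
        for morphemes in zip_longest(*words, fillvalue=PLACEHOLDER):
            alignments.append(morphemes)
    return alignments


tiers = [
    "A1 A2~A3 A4~A5~A6",
    "B1 B2 B3~*~B4",
    "* *~C1 C2~C3~C4",
]
alignments = align_morphemes(tiers)
for number, row in enumerate(alignments, start=1):
    print(f"{number}) {', '.join(row)}")
```

Under these assumptions the six printed rows match the alignments listed above. Padding at the end of a word is a design choice of this sketch; the example does not show how a shorter word with no explicit '*' in a middle position should be handled.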
Add an aligned morpheme API, in parallel with the AlignedWord API, which uses '~' and ' ' (space) as the morpheme boundary markers.