UAlbertaALTLab / morphodict

Plains Cree Intelligent Dictionary
https://itwewina.altlab.app/
Apache License 2.0

Morpheme boundaries missing when cell has multiple forms #1071

Closed aarppe closed 2 years ago

aarppe commented 2 years ago

For instance, in the following case, where the cell has two possible wordforms, the morpheme boundaries are not shown (even though the FST does generate them).

hfst-lookup -q src/crk-strict-generator-with-morpheme-boundaries-giellaltbuild.hfst
wâpahtam+V+TI+Ind+12Pl
wâpahtam+V+TI+Ind+12Pl  ki<wâpahtê>naw  0.000000
wâpahtam+V+TI+Ind+12Pl  ki<wâpahtê>nânaw    0.000000

nienna73 commented 2 years ago

I found the cause of this issue. When we give the morpheme boundary FST an analysis that returns two results, we have no way of knowing which result needs to be returned or has already been returned. So the first time it receives the analysis wâpahtam+V+TI+Ind+12Pl, it should return ki<wâpahtê>naw, and the second time it should return ki<wâpahtê>nânaw. We have no way of knowing if this is the first, second, fifth, etc. time we're receiving this analysis, though.

I'm working on a way around this issue.
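
To make the ambiguity concrete, here is a minimal Python sketch of one possible way around it. It is not the fix that was actually applied in morphodict. It keeps every result the generator returns for an analysis, then pairs each plain wordform shown in the cell with the boundary-marked form that matches it once the markers are stripped, so no call counter is needed. The function names (`parse_lookup_output`, `strip_boundaries`, `boundaries_for_cell`) are hypothetical, the sample text is the hfst-lookup output quoted above (assumed to be tab-separated), and the sketch assumes `<` and `>` are the only boundary markers.

```python
import re
from collections import defaultdict

# Sample generator output copied from the report above.
# Assumption: hfst-lookup separates analysis, form and weight with tabs.
HFST_OUTPUT = (
    "wâpahtam+V+TI+Ind+12Pl\tki<wâpahtê>naw\t0.000000\n"
    "wâpahtam+V+TI+Ind+12Pl\tki<wâpahtê>nânaw\t0.000000\n"
)


def parse_lookup_output(text):
    """Group generated forms by analysis, keeping every result in order."""
    forms = defaultdict(list)
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or not parts[1]:
            continue  # skip blank or malformed lines
        analysis, form = parts[0], parts[1]
        forms[analysis].append(form)
    return dict(forms)


def strip_boundaries(form):
    """Remove the boundary markers (assumed here to be only '<' and '>')."""
    return re.sub(r"[<>]", "", form)


def boundaries_for_cell(analysis, plain_wordforms, forms_by_analysis):
    """Pair each plain wordform in a cell with its boundary-marked form.

    Matching is done on the stripped form rather than on call order, so the
    result does not depend on how many times the analysis has been seen.
    A wordform with no match is returned unchanged.
    """
    candidates = forms_by_analysis.get(analysis, [])
    by_plain = {strip_boundaries(f): f for f in candidates}
    return [by_plain.get(w, w) for w in plain_wordforms]
```

With the sample above, calling `boundaries_for_cell` with the plain forms in either order returns the matching boundary-marked forms in that same order. The trade-off is that this relies on the plain and boundary-marked generators agreeing on the surface form once markers are removed, which is an assumption rather than something confirmed in this thread.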