rmlockwood / FLExTrans

Machine Translation using FLEx, Apertium, and STAMP
MIT License

[Synthesis] "Is abstract form" doesn't prevent an allomorph from being chosen #507

Closed by bbryson 10 months ago

bbryson commented 11 months ago

I added an allomorph because I know it occurs, but I'm not ready to specify its environment, so I marked it "Is Abstract Form". However, it is still being chosen as the "elsewhere" allomorph (because it has no environment).

Allomorph definition:

[image]

Word that should get i- prefix (Lexeme Form):

[image]

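To make the expected behavior concrete, here is a minimal sketch of allomorph selection that skips abstract forms before falling back to the "elsewhere" allomorph. It is not FLExTrans or STAMP code: the `Allomorph` class, the `choose_allomorph` function, the `env_matches` callback, and the `y-` form are all hypothetical names and data invented for illustration. Only the `i-` prefix comes from the report above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Allomorph:
    """Hypothetical stand-in for a FLEx allomorph (not a real FLEx class)."""
    form: str
    environments: List[str] = field(default_factory=list)  # empty = no environment
    is_abstract: bool = False


def choose_allomorph(
    allomorphs: List[Allomorph],
    env_matches: Callable[[str], bool],
) -> Optional[Allomorph]:
    """Pick an allomorph in list order, never returning an abstract form."""
    # Abstract forms are dropped first, so they cannot win as "elsewhere".
    candidates = [a for a in allomorphs if not a.is_abstract]

    # First preference: an allomorph with an environment that matches.
    for allomorph in candidates:
        if any(env_matches(env) for env in allomorph.environments):
            return allomorph

    # Fallback: the first allomorph with no environment (the "elsewhere" case).
    for allomorph in candidates:
        if not allomorph.environments:
            return allomorph
    return None


# Assumed data: "y-" is an invented abstract allomorph with no environment;
# "i-" is the Lexeme Form that should be chosen.
prefix_allomorphs = [Allomorph("y-", is_abstract=True), Allomorph("i-")]
chosen = choose_allomorph(prefix_allomorphs, env_matches=lambda env: False)
print(chosen.form)  # i-
```

Without the `is_abstract` filter, the first environment-less allomorph in the list (`y-` here) would be returned, which is the behavior reported in this issue.
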
rmlockwood commented 10 months ago

Also, entries marked as "Is Abstract Form" should be ignored when creating the bilingual lexicon. Maybe there should also be a check on the interlinear text that no morpheme used is an abstract form. A morpheme that fails the check would produce an error.
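The interlinear check suggested above could look roughly like the following sketch. It assumes a simplified data model in which each analyzed word is a surface string plus a list of morphemes carrying an `is_abstract` flag. The `Morph` class, the `find_abstract_morphemes` function, and the sample words are hypothetical and are not part of FLExTrans or the FLEx API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Morph:
    """Hypothetical morpheme record taken from an interlinear analysis."""
    form: str
    is_abstract: bool = False


def find_abstract_morphemes(words: List[Tuple[str, List[Morph]]]) -> List[str]:
    """Return one error message per abstract form used in the text."""
    errors = []
    for position, (surface, morphs) in enumerate(words, start=1):
        for morph in morphs:
            if morph.is_abstract:
                errors.append(
                    f"Word {position} ('{surface}'): "
                    f"morpheme '{morph.form}' is an abstract form"
                )
    return errors


# Invented sample text: the second word was manually analyzed with an
# abstract form, which the parser itself would never propose.
text = [
    ("ikasa", [Morph("i-"), Morph("kasa")]),
    ("ykasa", [Morph("y-", is_abstract=True), Morph("kasa")]),
]
for message in find_abstract_morphemes(text):
    print(message)
```

Returning a list of messages rather than raising on the first hit lets the caller report every offending word in one pass.
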

bbryson commented 10 months ago

But "Is Abstract Form" only applies to allomorphs (including the Lexeme Form); it isn't normally considered to apply to an entire entry. If an entry consists only of allomorphs that are marked as abstract (including the case where there is only a Lexeme Form and that form is marked as abstract), then it is not clear to me what that entry is for. The parser would never recognize it as part of a word in the interlinear text, although manual interlinearizing would allow it. We would need to understand more about what this kind of entry is trying to do before deciding how FLExTrans should treat it.

rmlockwood commented 10 months ago

On second thought, I'm not going to ignore entries whose lexeme form is marked as abstract, since their other allomorphs could be valid. At some point we could check whether all of an entry's forms are marked as abstract and ignore the entry in that case. The synthesis problem is now fixed, and the fix will be in the next version.
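The possible future check, ignoring an entry only when every one of its forms is abstract, might be sketched as follows. The `Form` class and the `entry_is_fully_abstract` function are hypothetical names for illustration. The real FLEx object model is different, and this is not the fix that was committed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Form:
    """Hypothetical lexeme form or allomorph belonging to one entry."""
    text: str
    is_abstract: bool = False


def entry_is_fully_abstract(forms: List[Form]) -> bool:
    """True only if the entry has forms and all of them are abstract."""
    # all() is True for an empty list, so an entry with no forms is
    # explicitly not treated as abstract.
    return bool(forms) and all(form.is_abstract for form in forms)


# Invented entries: one has a usable Lexeme Form, the other has none.
mixed_entry = [Form("y-", is_abstract=True), Form("i-")]
abstract_only_entry = [Form("y-", is_abstract=True)]

print(entry_is_fully_abstract(mixed_entry))          # False
print(entry_is_fully_abstract(abstract_only_entry))  # True
```

An entry like `mixed_entry` stays in the lexicon because it still has a valid form. Only an entry like `abstract_only_entry` would be a candidate for skipping.
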