rmlockwood / FLExTrans

Machine Translation using FLEx, Apertium, and STAMP
MIT License

[Rule Assistant] Word insertion not supported for Swedish-German project #666

Closed rmlockwood closed 1 month ago

rmlockwood commented 1 month ago

I'm getting the following error when trying to go from a word with a suffix to two words: "Word insertion not currently supported (attempted in Def (remove src word 1))."

I don't know how hard this is to support. If it requires a major rewrite, we don't need to do it right now.

My hand-modified rule file is attached. Right now the UI doesn't support this, but let's see whether the backend can handle it before we change the UI.

RuleAssistantRules.txt

rmlockwood commented 1 month ago

I don't get the error now, but I get a number suffix that I don't expect.

rmlockwood commented 1 month ago

Now I'm getting choose blocks with only an otherwise block, no when blocks.

mr-martian commented 1 month ago

There aren't any noun affixes in the target project which have number features. I've added a check and a warning for this case.
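A rough sketch of that kind of check, for illustration only. The affix dictionary shape, the function names, and the warning text are assumptions made here, not the actual Rule Assistant code.

```python
import warnings
from typing import Dict, List

# Hypothetical shape: affix gloss -> {feature name: feature value}
Affixes = Dict[str, Dict[str, str]]


def affixes_with_feature(affixes: Affixes, feature: str) -> List[str]:
    """Return the glosses of affixes that carry the given feature."""
    return [gloss for gloss, feats in affixes.items() if feature in feats]


def check_feature_available(affixes: Affixes, feature: str,
                            category: str = "n") -> List[str]:
    """Warn when no affix of the category carries the feature.

    With no matching affixes there is nothing to put in a `when`
    block, so the caller can skip generating the `choose` entirely.
    """
    matches = affixes_with_feature(affixes, feature)
    if not matches:
        warnings.warn(
            f"No '{category}' affixes in the target project have a "
            f"'{feature}' feature; nothing to generate for it."
        )
    return matches
```

Returning the (possibly empty) list lets the caller decide whether to emit a block at all, rather than writing a choose with only an otherwise.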

rmlockwood commented 1 month ago

I updated the German FLEx project on the Google Drive. There are still invalid choose blocks for the lemma macro.

rmlockwood commented 1 month ago

I get valid rules now! For the lemma macro I get this:

[image]

But the PL2 here is strange, because the German source project has only one affix marked as having feature 'pl', and that's PL. The PL2 suffix is not marked with any features. There is, however, a PL2 in the target Swedish project marked with 'pl'. This is OK, I guess, but what if the suffix coming from the source project is PL? The lemma macro logic won't pick the right lemma.

The noun suffix macro also checks for PL2 and will output PL, but PL doesn't exist in the target.

rmlockwood commented 1 month ago

I also learned that leaving definiteness unranked apparently gave it a ranking of 0. So it was being checked first, but my ranking of number as 1 still didn't seem to help it find a plural suffix unmarked for gender. Is this a bug? I would get no-lemma-for-defid-f-pl, as in the above comment. But when I gave definiteness a ranking of 3, it correctly looked at number first, set the lemma to the plural one, and then checked singular and then gender for the other lemmas. Definiteness isn't mentioned, which is OK, I guess.

mr-martian commented 1 month ago

The code currently only uses ranking if every feature in the affix has a ranking value.
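To illustrate the all-or-nothing behaviour, here is a minimal sketch. The function name, the ranking dictionary, and the fallback to the original feature order are assumptions for the example, not the real implementation.

```python
from typing import Dict, List, Optional


def order_features(features: List[str],
                   ranking: Dict[str, Optional[int]]) -> List[str]:
    """Return the order in which an affix's features get checked.

    The ranking is applied only when every feature has a rank;
    one unranked feature means the ranking is ignored entirely
    (assumed here to fall back to the original order).
    """
    if all(ranking.get(f) is not None for f in features):
        # Lower rank value = checked earlier.
        return sorted(features, key=lambda f: ranking[f])
    return list(features)


features = ["definiteness", "number", "gender"]

# Definiteness unranked: the ranking of number as 1 has no effect.
partial = order_features(features, {"number": 1, "gender": 2})

# Every feature ranked: number is checked first.
full = order_features(features, {"number": 1, "gender": 2, "definiteness": 3})
```

Under these assumptions, `partial` keeps definiteness first, while `full` puts number first. That matches what was observed once definiteness was given a ranking of 3.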

rmlockwood commented 1 month ago

This is working now.