rmlockwood / FLExTrans

Machine Translation using FLEx, Apertium, and STAMP
MIT License

[Rule Assistant] Didn't generate a rule with only the head #725

Closed: bbryson closed this issue 5 days ago

bbryson commented 1 month ago

I don't know if I did something wrong, or if this is by design. I have a rule with:

Num Adj Noun

The rules that came out were for:

Num Adj Noun
Num Noun
Adj Noun

but there was no rule for just Noun. We still need that rule, because in this language, the suffixes that get added to it depend on the gender of the noun. We can't just copy the suffixes from the source. (For other languages, it will be okay to just copy the suffixes.)

[Attached image: 16-No Noun Rule]

mr-martian commented 1 month ago

I think that's how Ron specified the generate permutations setting.

rmlockwood commented 1 month ago

I think you should do a separate rule for just noun.

rmlockwood commented 1 month ago

Some languages, including the ones we've tested so far, don't need an agreement rule for the head by itself. I guess your language would need it. We could possibly have an additional check box saying 'Include Head' that would be activated when Create All Permutations is checked.

bbryson commented 3 weeks ago

It's not just about agreement; it's also just about 'how do you output a noun' (e.g., deleting features). It seems redundant to have to write a separate rule for that. But yes, any language (including Bantu ones) that has an affix on the noun that is conditioned by that noun's gender would need this. Also any language where the noun has any kind of case marking.

Originally I thought you were going to allow general Phrase Structure Rule syntax. In the absence of that, I assumed that the PSR you were working with was something like:

(Det) (Quant) (Adj) Noun

That is, that all items are optional except the head. This is something I could explain linguistically, even if I am not sure that is a universal PSR. But I don't know how to specify a PSR for what it is currently doing.

Do you think it would hurt to just go with this PSR? What are the downsides of including "just the head", even for languages that don't appear to need it? If it is important to have the checkbox, can we make it say "omit the head-only rule" and have the default be unticked?
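To make the proposed PSR concrete, here is a minimal Python sketch of what it would license. It assumes a flat rule in which parentheses mark optional items and word order is fixed; `expand_psr` is a hypothetical helper written for illustration, not part of FLExTrans or the Rule Assistant.

```python
from itertools import product


def expand_psr(psr: str) -> list[list[str]]:
    """Expand a flat PSR such as '(Det) (Quant) (Adj) Noun' into every
    word sequence it licenses. Hypothetical helper for illustration only."""
    choices = []
    for token in psr.split():
        if token.startswith("(") and token.endswith(")"):
            # Optional item: either present or absent (None).
            choices.append((token[1:-1], None))
        else:
            # Obligatory item (the head): always present.
            choices.append((token,))
    # Take one choice per slot and drop the absent items.
    return [[word for word in combo if word is not None]
            for combo in product(*choices)]


expansions = expand_psr("(Det) (Quant) (Adj) Noun")
for words in expansions:
    print(" ".join(words))
```

With three optional items this yields eight sequences, and the last one is the bare `Noun`, which is the head-only case under discussion.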

rmlockwood commented 1 week ago

Here’s a specification of a way to handle permutations including a head-only rule.

User Interface:

• Instead of a checkbox that says Create Permutations, a dropdown list of options:

• This information in quotes above goes into the XML file for the value of the create_permutations attribute.

• An additional change is that the threshold for making the Create Permutations control enabled is a phrase of at least two words. (Before it was three words.)

Back-end:

• For a value of “not-head” the rules are created for all permutations of the source words as is currently being done, i.e. the head word by itself has no rule created for it.

• For a value of “with-head” the rules are created as above, except that a rule for just the head word is created. It should enforce agreement if such is needed for the head word itself (such as in Bantu).

• For a value of “no” do not create permutations, just create one rule for all the source words.
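The three back-end cases can be sketched in Python roughly as follows. This is an illustration only, not the actual back-end code: `rule_patterns` and its parameters are hypothetical names, and it assumes that "permutations" means every order-preserving subset of the source words that keeps the head, as in the Num Adj Noun example at the top of this issue. Hyphens are normalized to underscores because the thread later settles on `with_head` and `not_head`.

```python
from itertools import combinations


def rule_patterns(words, head_index, create_permutations):
    """Return the word sequences a rule would be generated for.
    Hypothetical sketch; not the Rule Assistant implementation."""
    # Accept both spellings of the attribute value (assumed convenience).
    mode = create_permutations.replace("-", "_")
    if mode == "no":
        # No permutations: one rule covering all the source words.
        return [list(words)]
    if mode not in ("not_head", "with_head"):
        raise ValueError(f"unknown create_permutations value: {create_permutations!r}")

    optional = [i for i in range(len(words)) if i != head_index]
    patterns = []
    # Keep k of the non-head words (largest sets first), always with the head,
    # preserving the original word order.
    for k in range(len(optional), 0, -1):
        for kept in combinations(optional, k):
            indices = sorted(kept + (head_index,))
            patterns.append([words[i] for i in indices])
    if mode == "with_head":
        # The extra rule for the head word by itself.
        patterns.append([words[head_index]])
    return patterns


phrase = ["Num", "Adj", "Noun"]
for mode in ("no", "not_head", "with_head"):
    print(mode, rule_patterns(phrase, head_index=2, create_permutations=mode))
```

For the phrase from the original report, `not_head` gives the three rules that were generated (Num Adj Noun, Num Noun, Adj Noun), and `with_head` adds a fourth for Noun alone.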

rmlockwood commented 6 days ago

Daniel has the back-end coded. @AndyBlack, just need your part.

AndyBlack commented 5 days ago

@mr-martian Version 0.30.0 at https://drive.google.com/file/d/1QhHuXZ1tCao06IQFA3BWAjma4z1KV43f/view?usp=sharing has it, with one exception: we need to use underscores instead of hyphens for with_head and not_head. The new DTD is at https://github.com/AndyBlack/ftrulegen/blob/master/FLExTransRuleGenerator.dtd

rmlockwood commented 5 days ago

@mr-martian, I made the change to use an underscore. Seems to work.