GrammaticalFramework / gf-ud

Functions to analyse and manipulate dependency trees, as well as conversions between GF and dependency trees. The main use case is UD (Universal Dependencies), but the code is designed to be completely generic with respect to the annotation scheme. This repository replaces the old gf-contrib/ud2gf code. It is also meant to be used in the 'vd' command of GF and to replace the supporting code in gf-core in the future.

Feature request: match lexicon in auxfuns #18

Closed inariksit closed 2 years ago

inariksit commented 2 years ago

We have this standard way of distinguishing between singular and plural "the":

#auxcat The DET
#auxfun DetCN_theSg det cn : The -> CN -> NP = DetCN the_Det cn ; det head[Number=Sing]
#auxfun DetCN_thePl det cn : The -> CN -> NP = DetCN thePl_Det cn ; det head[Number=Plur]
#disable the_Det thePl_Det

Now I would like to do the same for other determiners that are ambiguous for number, like "some" and "any". My file is this:

1   any any DET DT  _   2   det _   _
2   word    word    NOUN    NN  Number=Sing 0   root    _   _

1   any any DET DT  _   2   det _   _
2   words   word    NOUN    NN  Number=Plur 0   root    _   _

1   some    some    DET DT  _   2   det _   _
2   word    word    NOUN    NN  Number=Sing 0   root    _   _

1   some    some    DET DT  _   2   det _   _
2   words   word    NOUN    NN  Number=Plur 0   root    _   _
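
For readers less familiar with the format, here is a minimal Python sketch of what each of these lines contains. It is only an illustration, not part of gf-ud: the helper names `parse_token` and `parse_feats` are made up for this example, and splitting on arbitrary whitespace is a simplifying assumption (real CoNLL-U is tab-separated). The point is that the word form and lemma sit in their own columns, next to the features that the existing constraints already look at.

```python
# Column names of the CoNLL-U format, in order.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]

def parse_feats(feats):
    """Turn 'Number=Sing|Case=Nom' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def parse_token(line):
    """Parse one token line (assumption: columns separated by whitespace)."""
    values = line.split()
    if len(values) != len(CONLLU_FIELDS):
        raise ValueError(f"expected 10 columns, got {len(values)}: {line!r}")
    token = dict(zip(CONLLU_FIELDS, values))
    token["feats"] = parse_feats(token["feats"])
    return token

sentence = [
    parse_token("1 any any DET DT _ 2 det _ _"),
    parse_token("2 words word NOUN NN Number=Plur 0 root _ _"),
]
det, noun = sentence
print(det["form"], det["lemma"], det["upos"])  # any any DET
print(noun["feats"])                           # {'Number': 'Plur'}
```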

The naive way would be to do something like this. (Here, The is still the auxcat for DET from the previous example.)

#auxfun DetCN_anySg det cn : The -> CN -> NP = DetCN anySg_Det cn ; det head[Number=Sing]
#auxfun DetCN_anyPl det cn : The -> CN -> NP = DetCN anyPl_Det cn ; det head[Number=Plur]
#disable anyPl_Det anySg_Det

#auxfun DetCN_someSg det cn : The -> CN -> NP = DetCN someSg_Det cn ; det head[Number=Sing]
#auxfun DetCN_somePl det cn : The -> CN -> NP = DetCN somePl_Det cn ; det head[Number=Plur]
#disable somePl_Det someSg_Det

With this, I get the following results: the auxfuns don't seem to do anything. Looking at dt and bt, I see no auxfuns being used.

DetCN anyPl_Det (UseN word_N)
LIN: any words

DetCN anyPl_Det (UseN word_N)
LIN: any words

DetCN somePl_Det (UseN word_N)
LIN: some words

DetCN somePl_Det (UseN word_N)
LIN: some words

So I changed the auxfuns to take an actual RGL category Det as their argument, instead of the auxcat The (which corresponds to DET):

#auxfun DetCN_anySg det cn : Det -> CN -> NP = DetCN anySg_Det cn ; det head[Number=Sing]
#auxfun DetCN_anyPl det cn : Det -> CN -> NP = DetCN anyPl_Det cn ; det head[Number=Plur]
#disable anyPl_Det anySg_Det

#auxfun DetCN_someSg det cn : Det -> CN -> NP = DetCN someSg_Det cn ; det head[Number=Sing]
#auxfun DetCN_somePl det cn : Det -> CN -> NP = DetCN somePl_Det cn ; det head[Number=Plur]
#disable somePl_Det someSg_Det

Now I see, from looking at bt0, that the auxfuns take action:

bt0: DetCN_anySg anyPl_Det (UseN word_N) 
at: DetCN anySg_Det (UseN word_N)
LIN: any word

bt0: DetCN_anyPl anyPl_Det (UseN word_N)
at: DetCN anyPl_Det (UseN word_N)
LIN: any words

But unfortunately, there is no matching on strings, so the DetCN_any* auxfuns take action even when the actual Det is some*_Det.

bt0: DetCN_anySg somePl_Det (UseN word_N)
at: DetCN anySg_Det (UseN word_N)
LIN: any word

bt0: DetCN_anyPl somePl_Det (UseN word_N)
at: DetCN anyPl_Det (UseN word_N)
LIN: any words

So I would like to enhance the macro DSL so that we can add a wordform or lemma constraint among the tag constraints. For example (feel free to suggest a better syntax):

#auxfun DetCN_anySg det cn : Det -> CN -> NP = DetCN anySg_Det cn ; det head[Number=Sing|wf="any"]
#auxfun DetCN_anyPl det cn : Det -> CN -> NP = DetCN anyPl_Det cn ; det head[Number=Plur|wf="any"]
#disable anyPl_Det anySg_Det

#auxfun DetCN_someSg det cn : Det -> CN -> NP = DetCN someSg_Det cn ; det head[Number=Sing|wf="some"]
#auxfun DetCN_somePl det cn : Det -> CN -> NP = DetCN somePl_Det cn ; det head[Number=Plur|wf="some"]
#disable somePl_Det someSg_Det
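
To make the intended behaviour concrete, here is a rough Python model of the selection logic. It is a sketch under stated assumptions, not gf-ud's actual implementation: the `RULES` table, the function `select_auxfun` and the `use_wordform` switch are all hypothetical, and I assume the word-form constraint is checked against the determiner token while the Number constraint is checked against the head noun.

```python
# Hypothetical rule table: (auxfun name, word form required of the det,
# features required of the head noun). Not gf-ud's real data structure.
RULES = [
    ("DetCN_anySg",  "any",  {"Number": "Sing"}),
    ("DetCN_anyPl",  "any",  {"Number": "Plur"}),
    ("DetCN_someSg", "some", {"Number": "Sing"}),
    ("DetCN_somePl", "some", {"Number": "Plur"}),
]

def select_auxfun(det, head, use_wordform=True):
    """Return the names of all rules that the (det, head) pair satisfies."""
    selected = []
    for name, wordform, head_feats in RULES:
        # Existing behaviour: compare morphological features of the head.
        feats_ok = all(head["feats"].get(k) == v for k, v in head_feats.items())
        # Requested behaviour: additionally compare the det's word form.
        wf_ok = (not use_wordform) or det["form"].lower() == wordform
        if feats_ok and wf_ok:
            selected.append(name)
    return selected

some = {"form": "some", "feats": {}}
words = {"form": "words", "feats": {"Number": "Plur"}}

# Without the word-form check, the "any" rule also applies to "some words".
print(select_auxfun(some, words, use_wordform=False))  # ['DetCN_anyPl', 'DetCN_somePl']
# With the word-form check, only the intended rule is left.
print(select_auxfun(some, words))                      # ['DetCN_somePl']
```

A lemma constraint would work the same way, comparing against the LEMMA column instead of the word form.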