surfacesyntacticud / guidelines

Guidelines for Surface Syntactic Universal Dependencies
https://guidelines.surfacesyntacticud.org/

Governors of external dependents of idioms #25

Open perrier54 opened 3 years ago

perrier54 commented 3 years ago

If we treat idioms as atoms in their relation to the outside, their external dependents must be attached to their heads. If we instead attach these dependents to the word that would have been their governor had we ignored the special status of idioms, we end up with inconsistent dependency labels. Consider the expression si ce n'est le meilleur, where si ce n'est is analyzed as a prepositional locution. If we attach the complement le meilleur to est, the label of the dependency should be comp:pred, which is inconsistent with the fact that le meilleur is the object of the prepositional locution si ce n'est, which requires comp:obj as its label. I propose to systematically attach the external dependents of idioms to their head.

sylvainkahane commented 3 years ago

Even idioms can have modifiers that are not attached to the head (cf. the PhD thesis of Pausé 2017). Examples: il a la main très verte, il a le bras très long, il a les dents très longues, etc. In this case, I want to have an internal analysis of the idiom AVOIR LA MAIN VERTE and I want to attach TRES to the adjective VERT. Otherwise I would get a very strange structure: first, TRES would modify a verb, and second, the structure would be non-projective.

For your example, I also want le meilleur to be a dependent of ETRE, because it is the comp:pred of si ce n'est at the surface syntactic level.

Another, more complicated example: il a encore fait Dieu sait quelle connerie. Here Dieu sait quel works as a DET, but it does not form a connected unit in our analysis, where quel is a dependent of connerie and Dieu sait is the governor of connerie, whereas as a DET idiom Dieu sait quel should be a dependent of connerie. I don't have a satisfying solution for this case. But for me it shows that there are two levels of analysis: a surface syntactic level, where Dieu sait quelle connerie is analyzed as a regular construction, and a deep syntactic level, where it is seen as a single DET node. UD favors the deep syntactic analysis and SUD rather the surface syntactic analysis.

perrier54 commented 3 years ago

Your examples il a la main très verte, il a le bras très long, il a les dents très longues work very well because there is no break between the internal syntactic structure of idioms and their syntactic integration in the environment. For these cases, I agree with you that the external modifiers can depend on words that are not idiom heads.

Problems arise when the external dependents are arguments of the idiom. For si ce n'est, either it is considered a prepositional locution with ExtPos=ADP and its object is attached to the head, i.e. si, or si ce n'est le meilleur is analyzed as an adverbial clause. I don't see how to reconcile the two points of view.

For il a fait encore Dieu seul sait quelle connerie, we can also have a double analysis. We can analyze Dieu seul sait quelle connerie as an ordinary clause, but then it is impossible to integrate this analysis into the structure of the whole sentence. In my opinion, the best analysis is to limit the idiom to Dieu seul sait (this is justified by Dieu seul sait où, for example) and to consider it a modifier of the noun phrase quelle connerie with ExtPos=ADV.

perrier54 commented 3 years ago

I would like to conclude the discussion with two proposals:

  1. If we consider that the Idiom=Yes feature is reserved for grammatical locutions, which corresponds to the fixed relation in UD, examples such as he has a very green hand, he has a very long arm, he has very long teeth must be annotated as ordinary expressions. Their annotation as idioms must be done at a semantic level, which is beyond the scope of SUD.

  2. The external dependencies of grammatical locutions must be consistent with the external POS of those locutions, which means that they are attached to their head with a label consistent with that POS. For example, in "si ce n'est le meilleur", "si ce n'est" is a grammatical locution with ADP as its POS; it has a comp:obj dependency to "meilleur", and the governor of that dependency is "si".
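Proposal 2 can be read as a constraint that is checkable mechanically. The following is only a minimal sketch under stated assumptions: the `Dep` tuple, the `ARG_LABELS_BY_EXTPOS` table and the function `external_arg_problems` are hypothetical names, not part of any SUD tool, and the internal labels of si ce n'est (subj, mod, comp:obj) are illustrative guesses rather than an agreed annotation.

```python
from typing import NamedTuple

class Dep(NamedTuple):
    idx: int      # 1-based token position
    form: str
    head: int     # 0 = root of this fragment
    deprel: str

# Assumption: an ADP locution takes its argument as comp:obj (as in proposal 2).
ARG_LABELS_BY_EXTPOS = {"ADP": {"comp:obj"}}

def external_arg_problems(deps, members, extpos):
    """List violations of proposal 2 for one locution.

    `members` is the set of token indices belonging to the locution.
    Only argument relations (labels starting with "comp") are checked.
    """
    heads = [d.idx for d in deps if d.idx in members and d.head not in members]
    if len(heads) != 1:
        raise ValueError("a locution must have exactly one head")
    head = heads[0]
    problems = []
    for d in deps:
        if d.idx in members or d.head not in members:
            continue  # not an external dependent of the locution
        if not d.deprel.startswith("comp"):
            continue  # modifiers are outside the scope of this check
        if d.head != head:
            problems.append(f"{d.form}: governed by token {d.head}, expected {head}")
        elif d.deprel not in ARG_LABELS_BY_EXTPOS[extpos]:
            problems.append(f"{d.form}: label {d.deprel} does not fit ExtPos={extpos}")
    return problems

LOCUTION = {1, 2, 3, 4}  # si ce n' est

# Analysis following proposal 2: meilleur is comp:obj of si.
proposal = [
    Dep(1, "si", 0, "root"), Dep(2, "ce", 4, "subj"), Dep(3, "n'", 4, "mod"),
    Dep(4, "est", 1, "comp:obj"), Dep(5, "le", 6, "det"),
    Dep(6, "meilleur", 1, "comp:obj"),
]
# Surface analysis: meilleur is comp:pred of est.
surface = proposal[:5] + [Dep(6, "meilleur", 4, "comp:pred")]

problems_proposal = external_arg_problems(proposal, LOCUTION, "ADP")
problems_surface = external_arg_problems(surface, LOCUTION, "ADP")
```

With this sketch, the analysis of proposal 2 passes the check, while the surface analysis is flagged because meilleur is governed by est rather than by the head si.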

sylvainkahane commented 3 years ago

In the surface-syntactic structure, le meilleur is comp:pred of est, and a parser that does not know this expression will certainly produce this analysis. I think that our goal is to reconcile a surface-syntactic analysis and a deep one, where si ce n'est works as an ADP and is viewed as one element. Maybe the GRS must work in two steps: in a first step, it transforms the analysis I propose into your analysis, and then it converts the result to UD. We could do the same thing with Dieu seul sait quelle in il a fait encore Dieu seul sait quelle connerie. The GRS rule will be quite complex but not impossible.
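To make the first step of this two-step idea concrete, here is a rough Python sketch. It is not a GRS rule and does not use Grew; `Token`, `RELABEL` and `reattach_external_args` are hypothetical names, the internal labels of si ce n'est are illustrative, and the mapping from ExtPos=ADP to comp:obj is taken from the si ce n'est example only.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Token:
    idx: int
    form: str
    head: int              # 0 = root of this fragment
    deprel: str
    in_idiom: bool = False

# Surface analysis: le meilleur is comp:pred of est.
SURFACE = [
    Token(1, "si", 0, "root", True),
    Token(2, "ce", 4, "subj", True),
    Token(3, "n'", 4, "mod", True),
    Token(4, "est", 1, "comp:obj", True),
    Token(5, "le", 6, "det"),
    Token(6, "meilleur", 4, "comp:pred"),
]

# Assumption: the argument of an ADP locution is labelled comp:obj.
RELABEL = {"ADP": "comp:obj"}

def reattach_external_args(tokens, extpos):
    """Move external arguments of an idiom to the idiom head.

    Only relations starting with "comp" are moved; external modifiers
    (e.g. très in la main très verte) keep their surface governor.
    """
    members = {t.idx for t in tokens if t.in_idiom}
    heads = [t.idx for t in tokens if t.in_idiom and t.head not in members]
    if len(heads) != 1:
        raise ValueError("an idiom must have exactly one head")
    head = heads[0]
    result = []
    for t in tokens:
        is_external_arg = (
            not t.in_idiom
            and t.head in members
            and t.head != head
            and t.deprel.startswith("comp")
        )
        if is_external_arg:
            result.append(replace(t, head=head, deprel=RELABEL[extpos]))
        else:
            result.append(t)
    return result

converted = reattach_external_args(SURFACE, "ADP")
```

After this step meilleur is a comp:obj of si and every other token is unchanged; the conversion to UD would be a separate second step, which is not sketched here.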

perrier54 commented 1 year ago

See issue surfacesyntacticud/guidelines#39 for further discussion.