Open perrier54 opened 3 years ago
Even idioms can have modifiers that are not attached to the head (cf. the PhD thesis of Pausé 2017). Examples: il a la main très verte, il a le bras très long, il a les dents très longues, etc. In this case, I want an internal analysis of the idiom AVOIR LA MAIN VERTE, with TRES attached to the adjective VERT. Otherwise I would get a very strange structure: first, TRES would modify a verb, and second, the structure would be non-projective.
For your example, I also want le meilleur to be a dependent of ETRE, because it is the comp:pred of si ce n'est at the surface-syntactic level.
Another, more complicated example: il a encore fait Dieu sait quelle connerie. Here Dieu sait quel works as a DET, but its parts are not connected in our analysis: quel is a dependent of connerie, and Dieu sait is the governor of connerie. As a DET idiom, however, Dieu sait quel should be a dependent of connerie. I don't have a satisfying solution for this case. But for me it shows that there are two levels of analysis: a surface-syntactic level, where Dieu sait quelle connerie is analyzed as a regular construction, and a deep-syntactic level, where the idiom is seen as a single DET node. UD favors the deep-syntactic analysis and SUD rather the surface-syntactic one.
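The non-projectivity mentioned above can be checked mechanically. The following is a minimal sketch, not part of any SUD tool: the function name `is_projective` and the head assignments are assumptions made for illustration, in particular the choice to attach verte to main as a modifier.

```python
def is_projective(heads):
    """heads maps each 1-based token id to its governor id (0 = root).

    An arc is projective if every token lying strictly between the
    governor and the dependent is dominated by the governor.
    """
    def dominated_by(tok, ancestor):
        # Walk up the tree from tok until the root is reached.
        while tok != 0:
            if tok == ancestor:
                return True
            tok = heads[tok]
        return False

    for dep, head in heads.items():
        if head == 0:
            continue
        lo, hi = sorted((dep, head))
        for between in range(lo + 1, hi):
            if not dominated_by(between, head):
                return False
    return True


# Tokens: 1 il, 2 a, 3 la, 4 main, 5 très, 6 verte
# Assumed internal analysis: très depends on verte, verte on main.
internal = {1: 2, 2: 0, 3: 4, 4: 2, 5: 6, 6: 4}
# Alternative: très is attached to the idiom head, the verb a.
head_attached = {1: 2, 2: 0, 3: 4, 4: 2, 5: 2, 6: 4}

internal_ok = is_projective(internal)
head_attached_ok = is_projective(head_attached)
```

Under these assumptions, the internal analysis is projective, while attaching très to the verb makes the arc from main to verte cross over a token that main does not dominate.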
Your examples il a la main très verte, il a le bras très long, il a les dents très longues work very well because there is no break between the internal syntactic structure of idioms and their syntactic integration in the environment. For these cases, I agree with you that the external modifiers can depend on words that are not idiom heads.
Problems arise when the external dependents are arguments of the idiom. For si ce n'est, either it is considered a prepositional locution with ExtPos=ADP and its object is attached to the head, i.e. si, or si ce n'est le meilleur is analyzed as an adverbial clause. I don't see how to reconcile the two points of view.
For il a fait encore Dieu seul sait quelle connerie, we can also have a double analysis. We can analyze Dieu seul sait quelle connerie as an ordinary clause, but then it is impossible to integrate this analysis into the structure of the whole sentence. In my opinion, the best analysis is to limit the idiom to Dieu seul sait (this is justified by Dieu seul sait où, for example) and to consider it a modifier of the noun phrase quelle connerie, with ExtPos=ADV.
I would like to conclude the discussion with two proposals:
1. If we consider that the Idiom=Yes feature is reserved for grammatical locutions, which corresponds to the fixed relation in UD, examples such as il a la main très verte ("he has a very green hand"), il a le bras très long ("he has a very long arm"), il a les dents très longues ("he has very long teeth") must be annotated as ordinary expressions. Their annotation as idioms must be done at a semantic level, which is beyond the scope of SUD.
2. The external dependencies of grammatical locutions must be consistent with the external POS of those locutions, which means that they are attached to the head of the locution with a label consistent with that POS. For example, in "si ce n'est le meilleur", "si ce n'est" is a grammatical locution with ADP as its POS; it has a comp:obj dependency to "meilleur", and the governor of that dependency is "si".
In the surface-syntactic structure, le meilleur is the comp:pred of est, and a parser that doesn't know this expression will certainly produce this analysis. I think that our goal is to reconcile a surface-syntactic analysis and a deep one, where si ce n'est works as an ADP and is viewed as one element. Maybe the GRS must work in two steps: in a first step, it would transform the analysis I propose into your analysis, and then convert this to UD. We could do the same thing with Dieu seul sait quelle in il a fait encore Dieu seul sait quelle connerie. The GRS rule would be quite complex, but not impossible.
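The first of the two steps can be sketched in plain Python. This is only an illustration of the idea, not an actual GRS rule: the function name, the dictionary layout, the internal labels of si ce n'est, and the mapping from ExtPos to label are all assumptions, and the second step (conversion to UD) is not shown.

```python
def reattach_external_dependents(tokens, idiom_ids, idiom_head, label_for_extpos):
    """Return a copy of tokens in which every dependent outside the idiom
    that hangs on a non-head word of the idiom is moved to the idiom head
    and relabelled according to the idiom's ExtPos."""
    ext_pos = next(t["ext_pos"] for t in tokens if t["id"] == idiom_head)
    new_label = label_for_extpos[ext_pos]
    result = []
    for tok in tokens:
        tok = dict(tok)  # copy, so the input analysis is left untouched
        outside = tok["id"] not in idiom_ids
        on_inner_word = tok["head"] in idiom_ids and tok["head"] != idiom_head
        if outside and on_inner_word:
            tok["head"] = idiom_head
            tok["deprel"] = new_label
        result.append(tok)
    return result


# Surface analysis of "si ce n'est le meilleur" (labels are assumptions):
surface = [
    {"id": 1, "form": "si", "head": 0, "deprel": "root", "ext_pos": "ADP"},
    {"id": 2, "form": "ce", "head": 4, "deprel": "subj", "ext_pos": None},
    {"id": 3, "form": "n'", "head": 4, "deprel": "mod", "ext_pos": None},
    {"id": 4, "form": "est", "head": 1, "deprel": "comp:obj", "ext_pos": None},
    {"id": 5, "form": "le", "head": 6, "deprel": "det", "ext_pos": None},
    {"id": 6, "form": "meilleur", "head": 4, "deprel": "comp:pred", "ext_pos": None},
]

converted = reattach_external_dependents(
    surface, idiom_ids={1, 2, 3, 4}, idiom_head=1,
    label_for_extpos={"ADP": "comp:obj"},  # hypothetical mapping
)
```

In this sketch, meilleur ends up as a comp:obj dependent of si, while the internal structure of the locution and the determiner le are left as they were.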
See issue surfacesyntacticud/guidelines#39 for further discussion.
If we think of idioms in their relation to the outside as atoms, their external dependents must be attached to their heads. If we attach these dependents to the word that should have been their governor if we had ignored the specificity of idioms, we are faced with an inconsistency in the choice of dependency labels. Consider the expression si ce n'est le meilleur, where si ce n'est is analyzed as a prepositional locution. If we attach the complement le meilleur to est, the label of the dependency should be comp:pred, which is inconsistent with the fact that le meilleur is the object of the prepositional locution si ce n'est, which requires comp:obj as its label. I propose to systematically attach the external dependents of idioms to their head.