UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

Control nouns and adjectives in enhanced dependencies #687

Open stegrue opened 4 years ago

stegrue commented 4 years ago

The UD guidelines state that in the enhanced representation, additional subject relations are added in control and raising constructions, e.g.:

Mary wants to buy a book -> nsubj(buy, Mary)

However, the guidelines only give examples for control verbs, raising the question of how to handle control nouns and adjectives.
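As a rough illustration of the verb case, the rule can be sketched as a small function over (head, relation, dependent) triples. This is only a sketch under stated assumptions, not the converter actually used for EWT: word forms stand in for token ids, the controller is always taken to be the governor's nsubj (so object control is ignored), and the added edge is labelled nsubj:xsubj rather than the plain nsubj written above. The function name is hypothetical.

```python
# Basic dependencies for "Mary wants to buy a book" as
# (head, relation, dependent) triples. Word forms stand in for token ids.
basic = [
    ("wants", "nsubj", "Mary"),
    ("wants", "xcomp", "buy"),
    ("buy", "mark", "to"),
    ("buy", "obj", "book"),
    ("book", "det", "a"),
]


def add_controlled_subjects(deps):
    """Return the extra enhanced edges for subject control.

    Assumption: the controller is always the nsubj of the word that
    governs the xcomp. Object control ("She asked him to leave") would
    need the obj or iobj instead and is not handled here.
    """
    # Map each head to its subject in the basic tree.
    subjects = {head: dep for head, rel, dep in deps if rel == "nsubj"}
    extra = []
    for head, rel, dep in deps:
        if rel == "xcomp" and head in subjects:
            # Share the governor's subject with the open complement.
            extra.append((dep, "nsubj:xsubj", subjects[head]))
    return extra


print(add_controlled_subjects(basic))
# → [('buy', 'nsubj:xsubj', 'Mary')]
```

The open question is whether the same propagation should fire when the governor of the embedded clause is an adjective or a noun rather than a verb.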

In the EWT corpus, subject relations seem to have been added for control adjectives, e.g. in the following constructions from the training section:

On the other hand, the relations seem to be missing for control nouns (examples from the training section again):

As we are currently performing manual corrections of a subset of the EWT corpus, I wanted to ask what the preferred way of handling control adjectives and nouns is, and whether we should add the subject relations in the latter case.

nschneid commented 4 years ago

In principle I would like to see them added, though I am not sure how hard it would be to find them all and whether there are any borderline cases.

In all of the above examples the control predicate is the complement of a copula or light verb (have). Is that generally the case?

Also, what about controlled adjuncts? "I go in about every morning to get bagels"—ideally there would be an nsubj:xsubj(get, I) dependency, right? There isn't one now.
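One way to get a feel for how many such cases there are would be to scan the enhanced column for clausal dependents that never receive a subject. The following is a minimal sketch, assuming standard 10-column CoNLL-U with enhanced dependencies in the DEPS column; the function names and the toy fragment are made up for illustration (the fragment is hand-built, not copied from EWT), and the list of relations to check is a guess at what is relevant.

```python
# Relations whose dependent is a clause that might need a shared subject.
CLAUSAL = {"xcomp", "acl", "advcl"}


def parse_conllu(text):
    """Parse one sentence of CoNLL-U into a list of token dicts."""
    tokens = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges and empty nodes
            continue
        # DEPS looks like "2:nsubj|4:nsubj:xsubj"; split off the head id only.
        deps = [] if cols[8] == "_" else [
            tuple(d.split(":", 1)) for d in cols[8].split("|")
        ]
        tokens.append(
            {"id": cols[0], "form": cols[1], "deprel": cols[7], "deps": deps}
        )
    return tokens


def missing_subject_candidates(tokens):
    """Forms of clausal dependents with no subject in the enhanced graph."""
    has_subject = set()
    for tok in tokens:
        for head, rel in tok["deps"]:
            if rel.startswith(("nsubj", "csubj")):
                has_subject.add(head)
    return [
        t["form"]
        for t in tokens
        if t["deprel"].split(":")[0] in CLAUSAL and t["id"] not in has_subject
    ]


# Hand-built toy annotation of "I go to get bagels" (not taken from EWT).
rows = [
    "1 I I PRON _ _ 2 nsubj 2:nsubj _",
    "2 go go VERB _ _ 0 root 0:root _",
    "3 to to PART _ _ 4 mark 4:mark _",
    "4 get get VERB _ _ 2 advcl 2:advcl _",
    "5 bagels bagel NOUN _ _ 4 obj 4:obj _",
]
SAMPLE = "\n".join("\t".join(r.split()) for r in rows)

print(missing_subject_candidates(parse_conllu(SAMPLE)))
# → ['get']
```

A scan like this will overgenerate (relative clauses attached as acl:relcl, for instance, are matched too), so the output is a list of candidates to review by hand rather than a list of errors.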

nschneid commented 4 years ago

Also, consider this headline: "Legal eagle saves hawk after crashing into window of U.S. attorney’s office in Brooklyn"—syntactically this seems to suggest that "eagle" is the xsubj of "crashing", though that's not actually the intended reading.

nschneid commented 4 years ago

A fun example from EWT: "Same clerk had considerable difficulty taking down a number." Light verb construction + control noun (similar to the Ba'athists sentence above). Currently nsubj(had, clerk), obj(had, difficulty), acl(difficulty, taking). Ideally we want to be able to infer that "clerk" is the enhanced subject of "taking".
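A minimal sketch of that inference, assuming the three basic relations listed above and a tiny hypothetical list of light verb forms: if a noun is the obj of a light verb and governs an acl, the light verb's subject is passed down to the acl. The function name, the light verb list, and the nsubj:xsubj label are all assumptions made for illustration.

```python
# Hypothetical, deliberately tiny list of light verb forms.
LIGHT_VERB_FORMS = {"had", "has", "have"}

# Basic relations for "Same clerk had considerable difficulty taking down
# a number", restricted to the three edges that matter here.
basic = [
    ("had", "nsubj", "clerk"),
    ("had", "obj", "difficulty"),
    ("difficulty", "acl", "taking"),
]


def infer_subject_through_light_verb(deps):
    """Pass the light verb's subject down to an acl under its object noun."""
    subject_of = {head: dep for head, rel, dep in deps if rel == "nsubj"}
    verb_of_obj = {dep: head for head, rel, dep in deps if rel == "obj"}
    extra = []
    for noun, rel, clause in deps:
        if rel != "acl":
            continue
        verb = verb_of_obj.get(noun)  # None if the noun is not an object
        if verb in LIGHT_VERB_FORMS and verb in subject_of:
            extra.append((clause, "nsubj:xsubj", subject_of[verb]))
    return extra


print(infer_subject_through_light_verb(basic))
# → [('taking', 'nsubj:xsubj', 'clerk')]
```

As written, the rule would also fire on an ordinary possessive "have" with an unrelated acl, so it would need a list of control nouns or an explicit light verb marking to be usable.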

amir-zeldes commented 4 years ago

Maybe ccomp(difficulty,taking) is actually better here.

nschneid commented 4 years ago

> Maybe ccomp(difficulty,taking) is actually better here.

ccomp:

Note: In earlier versions of SD/USD, complement clauses with nouns like fact or report were also analyzed as ccomp. However, we now analyze them as acl. Hence, ccomp does not appear in nominals. This makes sense, since nominals normally do not take core arguments.

amir-zeldes commented 4 years ago

Yup, though I disagree with this, as I argued in detail here:

https://github.com/UniversalDependencies/docs/issues/308

sylvainkahane commented 4 years ago

Treating [considerable difficulty taking down a number] as an NP in the sentence "Same clerk had considerable difficulty taking down a number." is a phrase-structure-based interpretation of the syntactic structure. Dependency-based analyses allow other interpretations: every subgraph (or catena) of the dependency tree is potentially a syntactic unit, and in a light verb construction the predicative noun and the light verb form a syntactic unit that is equivalent to a verbal form. The dependency link we are discussing is then interpreted as the combination of the unit [had difficulty] and its complement [taking down a number], and ccomp becomes much more appropriate than acl.

In the French treebanks we decided to introduce the sub-relation lvc, and as soon as a NOUN is obj:lvc or obl:lvc, it can have an xcomp or a ccomp: examples.