UniversalDependencies / UD_Dutch-LassySmall

Wikipedia sample from the Lassy Small treebank.

Nominalizations #1

Open gossebouma opened 7 years ago

gossebouma commented 7 years ago

I am struggling with how to annotate (convert from Lassy Small actually) nominalizations like:

Ook zij leren van het elkaar laten zien van nieuwe prestaties
Also they learn from the each-other let show of new achievements
They as well learn from the letting show to each other of new achievements

The constituent [het elkaar laten zien...] is annotated in the original Lassy annotation as an NP headed by a verb (laten). Full example here.

This raises several issues:

  1. obl(laten,elkaar) or nmod(laten,elkaar)? The reciprocal pronoun 'elkaar' could be labelled nmod if we think of this constituent as nominal, but obl if we think of it as verbal. UD philosophy suggests obl, I would say, as phrasal nodes such as NP do not even exist in this universe. On the other hand, you might also say that nominalization is a zero-derivation morphological process, and that we should really think of the nominalized verb as a noun in this situation. The two conflicting views are present even in the original Lassy annotation, where a verb heads an NP.

  2. advcl(leren,laten) or nmod(leren,laten)? The PP constituent [van het elkaar laten zien....] modifies the verb 'leren'. This could be labelled obl if we think of this constituent as something nominal or as advcl if we think of it as an adverbial clause modifier. Here I have a preference for nmod, as it is a PP headed by an NP after all, but again I suspect UD philosophy might dictate that anything headed by a verb is an advcl.

  3. det(laten,het) or part(laten,het) or ....? The nominalized phrase is introduced by the determiner 'het'. If you want to consider the phrase as a verbal clause, you need to come up with a dependency label for the relationship between the verbal head and the determiner. I think this problem actually might be an argument for thinking of the head as nominal in spite of its POS-label (and so choose nmod(laten,elkaar) and nmod(leren,laten) as well for questions 1 and 2).
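To put the two views side by side, here is a minimal Python sketch that only collects the candidate labels named in the three questions above. The names `CANDIDATES`, `analysis`, `"verbal"` and `"nominal"` are purely illustrative and not part of any UD tooling. For question 2 the nominal view uses nmod as in the question's heading, and the missing verbal label for 'het' is shown as `???`.

```python
# Candidate labels per (head, dependent) pair, copied from questions 1-3.
# "verbal" / "nominal" are illustrative names for the two views of 'laten'.
CANDIDATES = {
    ("laten", "elkaar"): {"verbal": "obl", "nominal": "nmod"},
    ("leren", "laten"): {"verbal": "advcl", "nominal": "nmod"},
    # Under the verbal view there is no obvious label for the determiner.
    ("laten", "het"): {"verbal": None, "nominal": "det"},
}


def analysis(view):
    """Return label(head,dependent) strings for one view of the head."""
    relations = []
    for (head, dependent), labels in CANDIDATES.items():
        label = labels[view] or "???"  # "???" marks an open question
        relations.append(f"{label}({head},{dependent})")
    return relations


for view in ("verbal", "nominal"):
    print(view, analysis(view))
```

Only the nominal view yields a label for all three attachments, which is the point made in question 3.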

Suggestions welcome.

jnivre commented 7 years ago

This is tricky indeed. In many languages, there is sort of a continuum from clearly verbal to clearly nominal. For example:

I remember that John wrote the book
I remember John writing the book
I remember John's writing the book
I remember John's writing of the book

Here I would go with verb for the first three, because there is a direct object, but treat the last one as noun. I am not sure how this compares with the Dutch example, but it seems that the occurrence of the definite article strongly favors a nominal analysis.