UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

Improve cross-language consistency of inherently reflexive verbs (compound/expl/dobj) #204

Open dan-zeman opened 8 years ago

dan-zeman commented 8 years ago

Reflexive pronouns may be used instead of normal personal pronouns as objects of verbs and then they are labeled dobj or iobj. However, sometimes a verb requires a reflexive pronoun and it cannot be interpreted as an object. There are at least two different relations used in these situations in different languages. Could one of them be recommended?

More details here:

Inherently reflexive verbs in Universal Dependencies.pdf

jnivre commented 8 years ago

The group in Uppsala had specific recommendations here, but they are not (yet) included in the report. If I remember correctly, the reflexive should be "dobj/iobj" if it corresponds to a real argument, and "expl" otherwise (possibly with subtyping). @mcdm

jnivre commented 8 years ago

My apologies to @mcdm. I looked at the report too quickly and misinterpreted the headings. The report does include a recommendation that is consistent with my earlier comment.

dan-zeman commented 8 years ago

I have removed the “standard needed” label because the standard has been set. However, I am not closing the issue yet because we should verify to what extent the standard is now met in UD 1.2.
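One way to run such a check, as a minimal sketch only: count which relations are attached to tokens carrying the `Reflex=Yes` feature in a CoNLL-U file. The function name `reflexive_deprel_counts`, the helper `row`, and the two toy sentences are assumptions made up for illustration; they are not taken from any released treebank, and a real check would also have to handle reflexives that lack the feature.

```python
from collections import Counter


def reflexive_deprel_counts(conllu_text):
    """Count DEPREL labels of tokens whose FEATS contain Reflex=Yes."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        token_id, feats, deprel = cols[0], cols[5], cols[7]
        # Skip multiword token ranges ("1-2") and empty nodes ("1.1").
        if not token_id.isdigit():
            continue
        if "Reflex=Yes" in feats.split("|"):
            counts[deprel] += 1
    return counts


def row(*cols):
    """Hypothetical helper: join the ten CoNLL-U columns with tabs."""
    return "\t".join(cols)


# Toy, hand-made input (UD v1 labels); not real treebank data.
SAMPLE = "\n".join([
    "# text = Il se souvient",
    row("1", "Il", "il", "PRON", "_", "_", "3", "nsubj", "_", "_"),
    row("2", "se", "se", "PRON", "_", "Reflex=Yes", "3", "expl", "_", "_"),
    row("3", "souvient", "souvenir", "VERB", "_", "_", "0", "root", "_", "_"),
    "",
    "# text = Elle se lave",
    row("1", "Elle", "elle", "PRON", "_", "_", "3", "nsubj", "_", "_"),
    row("2", "se", "se", "PRON", "_", "Reflex=Yes", "3", "dobj", "_", "_"),
    row("3", "lave", "laver", "VERB", "_", "_", "0", "root", "_", "_"),
])

for deprel, n in sorted(reflexive_deprel_counts(SAMPLE).items()):
    print(deprel, n)
```

Running the same count per treebank and comparing the resulting distributions would give a rough picture of how far the labels diverge across languages.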

dan-zeman commented 8 years ago

For reference: the Uppsala meeting had a discussion group on reflexive pronouns and verbs, and the report is here.

ceramisch commented 6 years ago

I found this old issue related to a current problem discussed by the pt_BR team (@claudiafreitas @vcvpaiva @arademaker). We are discussing the labels to use for representing reflexive clitics. Are the recommendations of the Uppsala meeting included in UD2.0 guidelines somewhere?

A problem not covered by the document above is the distinction between inherently reflexive verbs (the clitic is "part of" the verb) and reflexive clitics used in impersonal alternations (applicable to any verb). @claudiafreitas suggested using a subtype of expl to mark cases where the clitic is part of the verb, as in French se suicider (to commit suicide), as opposed to the use of reflexive clitics as an impersonal/middle alternation marker, as in la porte s'ouvre (lit. the door self'opens 'the door opens' - unspecified agent).

I personally don't see the need to annotate this distinction in the treebank, since both uses are syntactically similar. I would rather defend that this distinction be added as an extra annotation, such as PARSEME's IRVs.

Any opinions (@savary @nschneid) or recommendations?

jnivre commented 6 years ago

Right or wrong, the current UD guidelines recommend "expl" for inherent reflexives. See: http://universaldependencies.org/u/dep/expl.html

ceramisch commented 6 years ago

Right, but the expl guidelines define inherently reflexive verbs as verbs that cannot occur without the reflexive pronoun and thus the pronoun does not play the role of a normal object. This is too restrictive in my opinion.

I cannot find any explicit recommendation to use expl for: (a) verbs that, when occurring with the reflexive clitic, have a different sense or subcat frame (what we call IRV in the PARSEME guide) and, more importantly, (b) reflexive clitics used as complements of passive/middle/impersonal alternation (different names for similar phenomena).

Of course, this is implied by the definition of expl : These are nominals that appear in an argument position of a predicate but which do not themselves satisfy any of the semantic roles of the predicate.

But wouldn't it be clearer to add an example of impersonal uses too, recommending the use of expl?

dan-zeman commented 6 years ago

I think we should copy the Uppsala recommendations to the current description of expl to make things clearer, although they also talk about relation subtypes. (I find myself pointing people to the Uppsala report quite often.)

(a) verbs that, when occurring with the reflexive clitic, have a different sense or subcat frame

Implicitly, it can be understood (and indeed is understood in some treebanks, e.g. UD_Czech) as a different verb. Then it may fall under the cited definition of inherently reflexive verbs. (But the borderline of a new sense is sometimes fuzzy.) It would probably be better to mention it explicitly in the documentation.

ceramisch commented 6 years ago

@dan-zeman I see your point about different sense = different verb (and yes, it's a fuzzy line). But it would be good to mention this explicitly.

Also, the use (b) above, of reflexive clitics to indicate middle/passive alternations, is very frequent in some languages (more frequent than inherently reflexive verbs) and it would be good to mention this in the guidelines.

I have checked the consistency of annotations among a sample of Romance languages I can speak/understand and it's quite bad. Probably making the expl page more verbose, including the cases above, could help. Not sure what the procedure to do this would be, though.

jnivre commented 6 years ago

Improving the guidelines is always a good idea, and we need to develop better mechanisms for making this happen. The universal guidelines are the responsibility of the core guidelines group, so we should assign this issue to someone in that group, who can implement the necessary changes once the issue has been resolved. In this case, however, we should also coordinate with the working group on expletives, which is also working on new guidelines.

sylvainkahane commented 6 years ago

Marking the inherent reflexivity is quite semantic, especially when the verb has other senses where the reflexive marker can commute with other words and phrases. I am not sure we should annotate that at the syntactic level of UD. This concerns an annotation of verbal idioms.

The main problem is the syntactic annotation of inherently reflexive verbs that do not have another sense, such as Fr. se souvenir 'remember'. Here we don't know whether se is obj or iobj. Maybe here we could use expl, but expl is problematic from the syntactic point of view (because it is also used for expletive subjects) and from the semantic point of view, because se is not what is usually called an expletive (se does not fill a position that could not remain empty). Other problems come from some grammatical uses, such as the medio-passive: Ce livre se vend bien, lit. this book REFL sells well. Here 'expletive' would be nonsense because se is the marker of the construction.

If the goal is cross-language consistency, we must avoid adding overly subtle distinctions that are not easily reproducible, even in monolingual annotation. For the different treebanks of French we discussed the point several times and we arrived at the conclusion (with Marie Candito and Marie-Catherine de Marneffe) that it would be better not to try to distinguish the different uses of se and to use only one relation for all of them. Or at least to allow people not to make the distinction. We planned to discuss the choice of the neutral relation on the UD list. So here we are ;-)

gossebouma commented 6 years ago

The dissertation by Natalie Silveira discusses SE in Romance languages from a UD perspective. She concludes that in all cases mentioned above these should be labeled expl.

claudiafreitas commented 6 years ago

In Portuguese, -se as passive voice doesn't exist anymore. We interpret this kind of -se construction as an indication of impersonalization. So, as far as -se is concerned, both sentences are interpreted in the same way:

    • "Vendem-se casas" ("houses for sale"; lit. Sell-se houses) is understood as "someone sells houses"
    • "Precisa-se de casas" (lit. Need-se houses) is understood as "someone needs houses"

For both cases, we are discussing the use of expl:impers (impers = impersonal), providing a unified approach. We are fine with the other uses of expl, but we think that, for Portuguese, the distinction between sentences 1 and 2 above and the so-called "inherent reflexives" ("pure" expl) is a relevant one.
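Because UD writes language-specific subtypes as `universal:subtype`, a label such as expl:impers still reduces to plain expl for anyone who does not need the distinction. The sketch below only illustrates that reduction; the function names `split_deprel` and `group_by_universal` and the input labels are hypothetical and do not reflect any treebank's actual annotation.

```python
def split_deprel(deprel):
    """Split a relation label into (universal relation, subtype or None)."""
    base, _, subtype = deprel.partition(":")
    return base, subtype or None


def group_by_universal(deprels):
    """Map each universal relation to the set of subtypes seen with it."""
    groups = {}
    for deprel in deprels:
        base, subtype = split_deprel(deprel)
        groups.setdefault(base, set()).add(subtype)
    return groups


# Illustrative labels only: an impersonal -se and an inherent reflexive.
labels = ["expl:impers", "expl"]
print(group_by_universal(labels))
```

With this convention, a cross-language comparison can be made on the universal relation alone, while the Portuguese treebank keeps the finer distinction in the subtype.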