UniversalDependencies / UD_Russian-SynTagRus

Russian data from the SynTagRus corpus.

Annotation of examples of the type "дай чем писать" #14

Open ftyers opened 7 years ago

ftyers commented 7 years ago

We are wondering how best to analyse

Заказывай 
что 
хочешь

Also:

Дай  
чем
писать

It's a bit like the Spanish "Pide que quieres" or English "Choose what(ever) you want"

I found this in the Spanish training data "no sabes que regalar" (you don't know what to buy):

# sent_id = es-train-002-s241
# text = Tienen sin duda la mejor seleccion de libros de regalo, de todos los tamaños precios e idiomas, la utilizo para resolver esos momentos en los que no sabes que regalar, pues entro y ya no tengo dudas, y eso me gusta.
...
29      no      no      ADV     _       Polarity=Neg    30      advmod  _       _
30      sabes   saber   VERB    _       Mood=Ind|Number=Sing|Person=2|Tense=Pres|VerbForm=Fin   25      acl:relcl       _       _
31      que     que     PRON    _       PronType=Int,Rel        32      obj     _       _
32      regalar regalar VERB    _       VerbForm=Inf    30      ccomp   _       SpaceAfter=No

In the English treebank:

# sent_id = newsgroup-groups.google.com_humanities.lit.authors.shakespeare_0c155162a7dfaf28_ENG_20031127_172200-0045
# text = If sites next to you don't have what you want, contact your nearest comp.sources.unix archive, or the moderator.
1       If      if      SCONJ   IN      _       8       mark    _       _
2       sites   site    NOUN    NNS     Number=Plur     8       nsubj   _       _
3       next    next    ADP     IN      _       5       case    _       _
4       to      to      ADP     IN      _       5       case    _       _
5       you     you     PRON    PRP     Case=Acc|Person=2|PronType=Prs  2       nmod    _       _
6       do      do      AUX     VBP     Mood=Ind|Tense=Pres|VerbForm=Fin        8       aux     _       SpaceAfter=No
7       n't     not     PART    RB      _       8       advmod  _       _
8       have    have    VERB    VB      VerbForm=Inf    13      advcl   _       _
9       what    what    PRON    WP      PronType=Int    8       obj     _       _
10      you     you     PRON    PRP     Case=Nom|Person=2|PronType=Prs  11      nsubj   _       _
11      want    want    VERB    VBP     Mood=Ind|Tense=Pres|VerbForm=Fin        9       acl:relcl       _       SpaceAfter=No

One possibility would be

obj(Дай, чем)
acl(чем, писать)

But that feels a bit odd given the case. If we assume there is an elided "то", then we would get the following (with promotion of the verb heading the relative clause, in the absence of what it refers to):

obj(Дай, пишешь)
obl(писать, чем)

But that is weird, because then you have obj between two verbs. Any thoughts?

dan-zeman commented 7 years ago

I think that the subordinate predicate should get higher priority when fighting for the "shared object", i.e. the relative pronoun. This is how it is done in the Prague Dependency Treebank, among others. Then the relation between the two predicates is ccomp. This is of course a consequence of a particular conversion procedure (the original PDT has its own set of deprels), and one might protest that the superordinate verb does not subcategorize for clausal complements. But I think it is sort of OK if we look at ccomp as any clausal realization of what would be an obj if realized as a nominal.

Here is a query showing examples in UD_Czech: http://hdl.handle.net/11346/PMLTQ-6JWO

Your proposal of obj(Дай, пишешь) would be my second choice if ccomp was not possible. You could explain it by promotion of the subordinate predicate to the position of an elided correlative pronoun (Дай то, чем пишешь / Daj to, čem pišeš') lit. “Give that with-what you-write.”

ftyers commented 7 years ago

:D I used that same example to explain it too, although apparently that is never used (although it seems grammatical).

So, the downside with using ccomp is that it isn't clear that there is an "ellipsis" there. I'm not sure if there are any examples where there would really be an ambiguity, but perhaps a language-specific relation should be used.

dan-zeman commented 7 years ago

In UD it is rarely clear that there is an ellipsis in a sentence :-)

I really see ccomp as an obj that is clausal. It is analogous to the nsubj/csubj distinction.

Here are some Czech examples of the relative-correlative pattern, so it is used at least in Czech: http://hdl.handle.net/11346/PMLTQ-DWEW

A few more examples of the previous query in other languages:

Ukrainian: http://hdl.handle.net/11346/PMLTQ-0JHN
Polish: http://hdl.handle.net/11346/PMLTQ-SUZH
Slovak: http://hdl.handle.net/11346/PMLTQ-1VMC
Slovenian: http://hdl.handle.net/11346/PMLTQ-JUAU
Croatian: http://hdl.handle.net/11346/PMLTQ-6T5J
Latin: http://hdl.handle.net/11346/PMLTQ-SLHD

a-node $v := [tag="VERB", parent a-node $p := [tag="VERB"], deprel!~"root|conj|parataxis|xcomp|csubj|advcl", child a-node $o := [tag~"PRON|DET", iset/prontype~"rel", deprel="obj", iset/case="acc"]];
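
For readers without access to PML-TQ, the query above can be roughly approximated in Python over a CoNLL-U sentence. This is only a sketch under assumptions: the function names `parse_conllu` and `find_matches` are hypothetical, the sample sentence is hand-annotated for illustration (it is not taken from any treebank), and the deprel filter uses exact string comparison, which may differ from how the query's regular expressions match.

```python
# Deprels excluded for the embedded verb, copied from the query above.
EXCLUDED_DEPRELS = {"root", "conj", "parataxis", "xcomp", "csubj", "advcl"}


def parse_conllu(block):
    """Parse one tab-separated CoNLL-U sentence into a dict keyed by token ID."""
    tokens = {}
    for line in block.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            # Skip multiword token ranges (1-2) and empty nodes (1.1).
            continue
        feats = {}
        if cols[5] != "_":
            for pair in cols[5].split("|"):
                key, _, value = pair.partition("=")
                feats[key] = set(value.split(","))
        tokens[int(cols[0])] = {
            "form": cols[1],
            "upos": cols[3],
            "feats": feats,
            "head": int(cols[6]),
            "deprel": cols[7],
        }
    return tokens


def find_matches(tokens):
    """Return (matrix verb, embedded verb, relative pronoun) form triples."""
    hits = []
    for o in tokens.values():
        # $o: an accusative relative PRON/DET attached as obj
        if o["upos"] not in {"PRON", "DET"} or o["deprel"] != "obj":
            continue
        if "Rel" not in o["feats"].get("PronType", set()):
            continue
        if "Acc" not in o["feats"].get("Case", set()):
            continue
        # $v: the verb governing $o, not attached by an excluded relation
        v = tokens.get(o["head"])
        if v is None or v["upos"] != "VERB" or v["deprel"] in EXCLUDED_DEPRELS:
            continue
        # $p: the parent of $v must itself be a verb
        p = tokens.get(v["head"])
        if p is not None and p["upos"] == "VERB":
            hits.append((p["form"], v["form"], o["form"]))
    return hits


# Hand-made illustration of the ccomp analysis; features are assumptions.
ROWS = [
    ["1", "Заказывай", "заказывать", "VERB", "_", "Mood=Imp|Number=Sing|Person=2", "0", "root", "_", "_"],
    ["2", "что", "что", "PRON", "_", "Case=Acc|PronType=Int,Rel", "3", "obj", "_", "_"],
    ["3", "хочешь", "хотеть", "VERB", "_", "Mood=Ind|Number=Sing|Person=2|Tense=Pres", "1", "ccomp", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

print(find_matches(parse_conllu(SAMPLE)))
```

On the sample, the pronoun что is found as the object of хочешь, whose parent Заказывай is also a verb, so the sentence counts as a hit.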

dan-zeman commented 7 years ago

If we say that it should be obj (and not ccomp) then I don't know how to detect cases that should be obj (and not ccomp). With the current approach, I can look at the child node, and if it heads a clause, then it is ccomp.
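
The check described here can be sketched in a few lines of Python. What counts as "heading a clause" is an assumption in this sketch (the dependent is a verb, or it has a subject, copula, auxiliary or marker child), and the names `heads_clause` and `object_label` are hypothetical.

```python
# Assumed signals that a node heads a clause; not an official UD definition.
CLAUSE_MARKERS = {"nsubj", "csubj", "cop", "aux", "mark"}


def heads_clause(upos, child_deprels):
    """Guess whether a node heads a clause from its UPOS and its children's deprels."""
    base = {deprel.split(":")[0] for deprel in child_deprels}  # drop subtypes
    return upos == "VERB" or bool(base & CLAUSE_MARKERS)


def object_label(upos, child_deprels):
    """Label an object-like dependent: ccomp if it heads a clause, otherwise obj."""
    return "ccomp" if heads_clause(upos, child_deprels) else "obj"


# "Дай чем писать": the dependent писать is a verb with an oblique child.
print(object_label("VERB", ["obl"]))
# A plain pronoun object with no dependents.
print(object_label("PRON", []))
# A non-verbal predicate with a subject and a copula.
print(object_label("ADJ", ["nsubj", "cop"]))
```

Note that a pronoun carrying only an `acl:relcl` child is still labelled obj by this sketch, since the pronoun itself does not head the clause.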

ftyers commented 7 years ago

More examples from Olga:

  1. Дай кому хочешь
  2. Пиши чем хочешь

Here we have an example in (1) where the case of кому isn't subcategorised for by the subordinate verb. And in (2) it doesn't "feel like" it should be ccomp as it seems to fit more with advcl / manner. The example in (2) can be dealt with quite nicely with advcl, but in (1) it is not clear. How is this done in Czech?

dan-zeman commented 7 years ago

Wow, interesting :-) My intuition would be to treat them as elliptical for Daj tomu, komu hočeš dať and Piši tem, čem hočeš pisať, respectively. So example (2) would be a clausal realization of obl, i.e. advcl, and (1) would be a clausal realization of iobj, for which I'm not sure whether ccomp is a good label, but I cannot think of a better one.
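
The reasoning in this thread (a clause takes the clausal counterpart of the relation the elided nominal would have had) can be written down as a small lookup. This is a sketch only: the table name and function are hypothetical, and the iobj entry is explicitly tentative, as said above.

```python
# Nominal relation of the elided correlative -> label for the clause replacing it.
CLAUSAL_COUNTERPART = {
    "nsubj": "csubj",
    "obj": "ccomp",
    "obl": "advcl",
    "iobj": "ccomp",  # tentative: no better label was proposed in the thread
}


def clausal_label(nominal_deprel):
    """Return the clausal label for a nominal relation, or None if none is suggested."""
    return CLAUSAL_COUNTERPART.get(nominal_deprel)


# (2) Пиши чем хочешь: the elided "тем" would be obl.
print(clausal_label("obl"))
# (1) Дай кому хочешь: the elided "тому" would be iobj.
print(clausal_label("iobj"))
```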

I will try to locate some similar examples in the Czech treebank and get back here if I succeed.

dan-zeman commented 7 years ago

So I don't seem to be able to find anything directly equivalent in the Czech data. This would be a roughly corresponding query: http://hdl.handle.net/11346/PMLTQ-WZGT

But the 8 results I'm getting are somewhat different: the matrix verb does not subcategorize for a dative itself. But at least in one case it subcategorizes for a non-accusative, and the clause is currently ccomp.

ftyers commented 7 years ago

So it seems that this only works with "modal" verbs like хотеть/мочь and could be considered a case of ellipsis like you suggest, e.g.

  1. Дай кому хочешь (дать)
  2. Дай кому можешь (дать)
  3. *Дай кому видишь (дать)

@olesar what do you think?

olesar commented 7 years ago

ОК: возьми/дай что видишь (both take and see take Acc here)
ОК: передай кому скажут (передать) - i.e. not only "modal" verbs, but any matrix verb (taking infinitive)
*Дай кому видишь (дать) - yes, it seems that we have an elliptical case here

amir-zeldes commented 7 years ago

I'm having a very similar issue with English, so I thought to ask it here, maybe @sebschu or @nschneid or others working on English would want to chime in. How about this one:

You should fit them to where the drawstring is tied slightly loose.

The problem is the 'where' - if it were nominal, it would be obl, not obj, so I'm not happy with the ccomp solution. On the other hand, the clause-level adverbial corresponding to obl would be advcl, and that seems pretty strange too:

advcl(fit,tied) ??
ccomp(fit,tied) ??

Or do we treat 'where' as an NP argument, and use acl? This is also a bit weird:

obl(fit,where)
acl(where,tied) ??

Then we lose the internal function of 'where the drawstring is tied', which logically is a clause. Thoughts?

sebschu commented 7 years ago

@amir-zeldes This looks to me like an example of a free relative, so I'd basically go with the second analysis (using the relative clause subtype of acl):

obl(fit,where)
acl:relcl(where,tied)

Some arguments for this analysis are in section 4.2 of de Marneffe et al. (2013).

@ftyers I don't know any Russian (or Czech), so I wasn't really able to follow the discussion and I'm not sure whether these examples are also free relatives, but if they behave somewhat similar to something like "Choose whatever you want", then I'd be inclined to say that they are free relatives and should be analyzed analogously to the examples in English.

amir-zeldes commented 7 years ago

OK, thanks - I think the obliqueness threw me off. @loganpeng1992: see @sebschu's answer and references.

dan-zeman commented 7 years ago

@sebschu : I am not familiar with the term free relative, which could be caused by my primarily non-linguistic background, or by the fact that the term is not used in the languages where I am at home. But the case marking morphology speaks quite clearly against the English-like analysis (which probably explains why the analysis seems wrong to me even in English; but I guess you can afford doing it your way since you don't have cases :-)).

sebschu commented 7 years ago

@dan-zeman Yes, it could well be that this is a very English-centric concept (and that we get away with it or at least don't have contrary evidence because we don't have cases :)) but I'd be surprised if there weren't any selectional restrictions on the type of question word in other languages. And at least in German, it seems like the case of the relative pronoun has to match the case of the object that it replaces and the case of the complement it replaces within the relative clause, so that seems compatible with the English analysis.

One of the main arguments to not treat this as an embedded clause is that the wh-word is typically restricted in these cases.

For example, we can't change wherever you see a free spot to a clause with another wh-word:

Put your bag down wherever you see a spot.
*Put your bag down what table Al put his bag on.

dan-zeman commented 7 years ago

@sebschu on second thought - I'm sorry, I was too quick with the previous post, without reviewing the Russian examples above. So I remembered it was about different case-subcategorization of the two verbs, but actually, the daj komu hočeš example would speak for the English-like analysis, because it's the matrix verb which requires the dative. So it does not explain why I don't feel comfortable with the English analysis, although I cannot say it makes me feel significantly better about it :) At least when a modal verb is used in the embedded clause, it does feel like ellipsis.

jnivre commented 7 years ago

Free relatives are definitely not English-specific. We have them in Swedish too. :)

In general, they can be analysed as merging the antecedent and the relative pronoun into a single word:

what(ever) you want = anything (which) you can think of
who(ever) you want = anyone (who) you can think of
where(ever) you want = anywhere (that) you want

So, in some sense, the wh-word does double duty as an argument/modifier in the higher clause and inside the relative clause. If we give priority to the higher clause relation (in the basic dependencies), then their analysis becomes parallel to relative clauses with omitted pronouns:

take what(ever) you like
obj(take, whatever)
acl:relcl(whatever, like)
nsubj(like, you)

take the thing that you like
obj(take, thing)
acl:relcl(thing, like)
nsubj(like, you)
obj(like, that)

take the thing you like
obj(take, thing)
acl:relcl(thing, like)
nsubj(like, you)
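
The parallelism between these three analyses can be checked mechanically. The sketch below is only an illustration: it stores each analysis as a set of (relation, governor, dependent) triples, and a hypothetical helper `skeleton` replaces the head nominal with a placeholder so the structures can be compared.

```python
def skeleton(edges, head_word):
    """Replace the head nominal with the placeholder "N" in every triple."""
    def abstract(word):
        return "N" if word == head_word else word
    return {(rel, abstract(gov), abstract(dep)) for rel, gov, dep in edges}


# take what(ever) you like
free_relative = {
    ("obj", "take", "whatever"),
    ("acl:relcl", "whatever", "like"),
    ("nsubj", "like", "you"),
}

# take the thing that you like
overt_pronoun = {
    ("obj", "take", "thing"),
    ("acl:relcl", "thing", "like"),
    ("nsubj", "like", "you"),
    ("obj", "like", "that"),
}

# take the thing you like
omitted_pronoun = {
    ("obj", "take", "thing"),
    ("acl:relcl", "thing", "like"),
    ("nsubj", "like", "you"),
}

# The free relative and the pronoun-less relative share one skeleton.
print(skeleton(free_relative, "whatever") == skeleton(omitted_pronoun, "thing"))
# The overt-pronoun version differs only by the pronoun's own edge.
print(skeleton(overt_pronoun, "thing") - skeleton(free_relative, "whatever"))
```

Under this representation the wh-word simply occupies the slot of the head nominal, and the only extra edge in the overt-pronoun version is obj(like, that).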