UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Errors in edeps: double subjects/objects; and iobj vs. obj when only the recipient argument of a ditransitive verb is overt #256

Open nschneid opened 3 years ago

nschneid commented 3 years ago
amir-zeldes commented 3 years ago

Thanks for catching the ones in the first and third categories, most of the GUM cases were plain errors. I'm curious what you think about this one though:

For 2 and 4, GUM definitely allows multiple subjects due to the policy you mentioned, and the last 6 are correct IMO (one of them has an error/non-standard syntax so I'm not sure what the best decision is there). Which one do you think is an error?

nschneid commented 3 years ago
amir-zeldes commented 3 years ago

charge/pay me $10: "me" should be iobj

Mm, maybe... But if that's right, then "pay me", and worse(?) "cost me" by themselves should also be iobj, right? My hesitation is probably due to German influence, because some of these are double ACC in German ("das kostete mich die 20 Euro" -> cost + me.acc + 20-euros.acc)

verbs embedded in a relative clause ("things you can offer a person", "songs my mother taught me", "what you gave the neighbor"?) should have an E:obj

Oh, sorry, yes, you meant the edeps, it was right there in the subject! I thought for a second you were saying "me" in "taught me" should be obj. For zero relatives GUM currently doesn't have an edep back to the matrix head, primarily because it can be hard to guess the deprel from the clause by itself (if there is a WDT, then the deprel is whatever it is). I guess we could try guessing that these are object relatives, but how often will that be wrong? For example, if it were "People my mother taught songs", it should be an edep E:iobj right?

"paying people to write and edit articles"

I see, I missed that (I thought again you meant people should be obj). It's a bit borderline, but I agree it should probably be xcomp. I think the policy is, if it's a free purpose adverbial infinitive, it's advcl and not xcomp, because it's not a complement of the matrix verb, and we teach annotators to test this with "in order to". So you can say "I bought it (in order) to have it at home", and that would be advcl instead of xcomp. But here it's a bit ambiguous - you could kind of say "paying people in order to write articles" (i.e. in order to get them written). On the other hand, I think there is a kind of expected argument structure "pay someone to do something", in which case I agree it's more likely this is that than a free purpose adverbial. I can change that one.

nschneid commented 3 years ago

charge/pay me $10: "me" should be iobj

Mm, maybe... But if that's right, then "pay me", and worse(?) "cost me" by themselves should also be iobj, right? My hesitation is probably due to German influence, because some of these are double ACC in German ("das kostete mich die 20 Euro" -> cost + me.acc + 20-euros.acc)

Actually I now see in the obj guidelines:

In general, if there is just one object, it should be labeled obj, regardless of the morphological case or semantic role. For example, in English, teach can take either the subject matter or the recipient as the only object, and in both cases it would be analyzed as the obj

EWT follows this (no exception for communication verbs where the recipient but not the theme is overt).

verbs embedded in a relative clause ("things you can offer a person", "songs my mother taught me", "what you gave the neighbor"?) should have an E:obj

Oh, sorry, yes, you meant the edeps, it was right there in the subject! I thought for a second you were saying "me" in "taught me" should be obj. For zero relatives GUM currently doesn't have an edep back to the matrix head, primarily because it can be hard to guess the deprel from the clause by itself (if there is a WDT, then the deprel is whatever it is). I guess we could try guessing that these are object relatives, but how often will that be wrong? For example, if it were "People my mother taught songs", it should be an edep E:iobj right?

Yep. This ambiguity is part of why we want to have the edeprels, I think.

"paying people to write and edit articles"

I see, I missed that (I thought again you meant people should be obj). It's a bit borderline, but I agree it should probably be xcomp. I think the policy is, if it's a free purpose adverbial infinitive, it's advcl and not xcomp, because it's not a complement of the matrix verb, and we teach annotators to test this with "in order to". So you can say "I bought it (in order) to have it at home", and that would be advcl instead of xcomp. But here it's a bit ambiguous - you could kind of say "paying people in order to write articles" (i.e. in order to get them written). On the other hand, I think there is a kind of expected argument structure "pay someone to do something", in which case I agree it's more likely this is that than a free purpose adverbial. I can change that one.

Agreed

amir-zeldes commented 3 years ago

no exception for communication verbs where the recipient but not the theme is overt

Yes, I think I argued about this with Adam P. a few years ago at length; I think applying this to GUM would be a terrible waste of information, and for languages with overt dative marking I think it's especially bizarre to label dative objects as "obj" and not "iobj" just because the direct object is not realized. That's like saying objects should be labeled nsubj if we have pro drop, so in "got it", there is only one argument, therefore "it" should be "nsubj" regardless of case. I think if EWT really stays this way, then this will have to be a point on which the two corpora differ.

Yep. This ambiguity is part of why we want to have the edeprels, I think.

Agreed, but then I think it might be a bit too dangerous to auto-apply the object analysis. I'm leaving this out of edep generation for GUM for the moment.

Agreed

Fixed, thanks!

nschneid commented 3 years ago

Oh, so GUM edeps are exclusively autogenerated? You don't want to autogenerate them once (when the sentence is added to the corpus) and allow manual corrections after that?

To give a minimal pair illustrating the ambiguity:

nschneid commented 3 years ago

no exception for communication verbs where the recipient but not the theme is overt

Yes, I think I argued about this with Adam P. a few years ago at length; I think applying this to GUM would be a terrible waste of information, and for languages with overt dative marking I think it's especially bizarre to label dative objects as "obj" and not "iobj" just because the direct object is not realized. That's like saying objects should be labeled nsubj if we have pro drop, so in "got it", there is only one argument, therefore "it" should be "nsubj" regardless of case. I think if EWT really stays this way, then this will have to be a point on which the two corpora differ.

For English, if we use synchronic criteria, it seems our two options for interpreting iobj are (a) an object that precedes another overt object, or (b) an object with thematic role of recipient for a verb that licenses both recipient and theme objects (whether or not the theme object is overt). I assume this was debated and resolved in favor of (a). As you point out, option (b) would be weird for "cost", because "me" is not obviously a recipient in "it costs me $5".

amir-zeldes commented 3 years ago

Oh, so GUM edeps are exclusively autogenerated?

No, you can specify edeps and morphology manually in the source files upstream, and the edep generation will always prioritize these over automatically computed ones. This always happens when there is an orphan relation or empty node, and in some other cases that have bothered us over time. This is how it looks upstream. We could try to add these for such relatives as well, but since edeps are still evolving and most are automatable, we haven't been investing too much effort in manually annotating them (if standards change it's easier to adjust the auto-generation ATM).
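As a rough illustration of the priority rule described above (not code from the GUM build), the merge could look like the sketch below. The dictionary representation, the name `merge_edeps`, and the choice to drop all automatic edges for a token that has any manual edge are assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical representation: (token id, head id) -> enhanced deprel.
Edeps = Dict[Tuple[str, str], str]


def merge_edeps(auto: Edeps, manual: Edeps) -> Edeps:
    """Combine automatic and manual edeps, letting manual ones win.

    Assumption: if a token has any manual edep, all of its automatically
    computed edeps are discarded rather than mixed in.
    """
    manual_tokens = {tok for tok, _head in manual}
    merged = {key: rel for key, rel in auto.items() if key[0] not in manual_tokens}
    merged.update(manual)
    return merged


auto = {("3", "2"): "obj", ("5", "2"): "nsubj"}
manual = {("3", "4"): "iobj"}
print(merge_edeps(auto, manual))
```

Here token 3 keeps only its manual edge, while token 5 keeps the automatically generated one.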

iobj are (a) an object that precedes another overt object

That seems problematic, and not followed in practice in many TBs. If that is really what we want, would you say it's also correct under topicalization?

And in varieties that allow accusative pronouns before dative ones, would you say this tree is right?

And this is definitely not followed in languages such as German, which has case marking (but not prepositions in ditransitives, like English):

This is even true in syncretic cases, where dative and accusative look alike. I don't understand why we would want for the labels iobj and obj to code relative order after the verb. Isn't the point of dependencies to express grammatical function?

nschneid commented 3 years ago

Twitter is very much split on this: https://twitter.com/complingy/status/1453012931587751941

I personally don't have a strong opinion other than not wanting to reexamine all of the monotransitive objs in EWT without a really clear consensus that the current guideline is suboptimal.

amir-zeldes commented 3 years ago

The twitter question was about "I will pay you", which is a non-prototypical case IMO and harder to get naive judgments about, so I can understand people would be divided. I think the question can skew the response - if you ask whether "I told a story" and "I told someone" are the same grammatical function you may get different responses...

Either way, I am very reluctant to destroy this information in GUM, and for what it's worth I just checked, and all three Polish corpora have iobj+obj in either order, as well as iobj without obj (some iobj cases are dative, some are actually instrumentals); all 5 Czech corpora: same; 3/4 German corpora: same (one corpus does not have iobj by itself).

I'm also pretty sure that in EWT order is never a factor (in English it's 99% iobj before obj anyway), and the single iobj cases could be recognized with high accuracy by a parser trained on GUM/rules using the verbs in question/NER checking for a human obj of a known ditransitive verb. I can even volunteer to run the prediction and submit a PR, if you agree this is the better analysis.

nschneid commented 3 years ago

I think we need to hear why the current policy was adopted in the first place; it seems a very deliberate choice. @manning @jnivre @dan-zeman all contributed to the iobj page.

amir-zeldes commented 3 years ago

Agreed, let's bring this up next time we meet! I'm definitely willing to put in the work for EWT if we can agree this would be useful.

dan-zeman commented 3 years ago

I think that anything you say about iobj vs. obj has to be language specific. So if iobj is always the first of two objects in English, there is no implication for German or any other language.

Furthermore, if you want to say that dative nominals are iobj, you first have to establish that they are core arguments. The more I've been looking at this, the more I think that most (all?) Indo-European datives that I've seen should be treated as oblique. (Which is not necessarily reflected in current UD data, but I believe it should.) If that's true, then the only chance for an iobj in German and Slavic is in double-accusative constructions.

amir-zeldes commented 3 years ago

Thanks @dan-zeman - if it's language specific then I would like to retain the distinction, and I think for English the test should be the dative alternation (so we won't confuse it with other double objects, which don't participate in the alternation and are usually xcomp, e.g. naming/appointing constructions).

The term iobj also has the advantage of being distinct from both obj and other obliques (which are generally characterized by prepositional marking in English), and it is in line with traditional grammar, which helps when communicating with linguists, philologists and other humanities scholars.

dan-zeman commented 3 years ago

The term iobj also has the advantage of being distinct from both obj and other obliques

Well, but in UD, these are not "other obliques" because iobj is by definition something that is not oblique.

(which are generally characterized by prepositional marking in English), and it is in line with traditional grammar, which helps when communicating with linguists, philologists and other humanities scholars.

Only in some languages, unfortunately. I was never a fan of iobj, and in large part probably because it is not in the traditional grammar I've been exposed to, and it is hard to define in Slavic languages.

Finally, the very term "object" is also defined quite differently in traditional grammar than in UD (there are prepositional objects, for example, and they are not rare in the traditional grammar, but it is quite rare to see a prepositional object in UD because it is relatively rare that a core argument is marked by an adposition).

nschneid commented 3 years ago

For completeness here's the CGEL passage cited in the guidelines:

[image: CGEL passage]

nschneid commented 3 years ago

I guess the crux of the CGEL argument is that people are more widely ready to judge "The subject which she taught the students", "The key which he lent me" (DO extraction) as acceptable than "The students who(m) she taught math", "The one who(m) I lent the key" (IO extraction), but there is no such distinction for "The subject which she taught" vs. "The students who(m) she taught" (both readily acceptable). It is noted (p. 249) that speakers vary widely in their judgments but claimed they are consistent in their relative judgments.

TBH these all seem OK to me. I wonder if the supposedly worse examples are just more difficult to process, and whether that's the same thing as being less acceptable.

amir-zeldes commented 3 years ago

I think who(m) usage varies quite a lot across speakers and varieties, so this doesn't seem compelling to me. I'm always more suspicious of arguments that stipulate "it is uncontroversial" rather than doing a balanced corpus study, but I suppose that's not possible in a reference grammar. FWIW, our own syntactician Ruth Kramer confirmed my suspicion that 'garden variety' theoretical syntax assumes a different base-generation locus for indirect and direct object, regardless of what we do with them later (incl. movement, passivization or whatever). This is admittedly a generative-centric view, but I can easily find hard-core HPSG people talking about indirect objects, e.g. Stefan Müller.

Well, but in UD, these are not "other obliques"

@dan-zeman sorry, I should have phrased that better: iobj lets us distinguish these cases from both obliques and obj without introducing a new label, not to mention staying true to the label which has been in use since Stanford Dependencies, through UD 1 and 2.

There is a pretty extensive discussion of the literature on this subject in Mukherjee's book, see chapter 1 here:

https://www.google.com/books/edition/English_Ditransitive_Verbs/jd0eEAAAQBAJ?hl=en&gbpv=1

He surveys lots of different grammars on this and basically takes the position that if only one ditransitive argument is realized overtly, it's because the other one is inferable, so there is no real difference in their roles together or apart (he is also fairly constructivist, so he sees this as realizing the valency of the ditransitive construction).

As for traditional grammar in other languages, I think calling ditransitive datives "indirect objects" is fairly widespread. Here are some examples:

Here is an early UD (v1) treatment of Russian indirect objects as iobj:

https://www.google.com/books/edition/Advances_in_Natural_Language_Processing/HDJwBAAAQBAJ?hl=en&gbpv=1&dq=Russian+indirect+object+dative&pg=PA159&printsec=frontcover

So while I wouldn't say everyone uses this term, it is quite widespread in my opinion, not limited to the English tradition and even applied outside of Indo-European.

dan-zeman commented 3 years ago

Yes, people use the term (one may question which of the use cases qualify as "traditional grammar", as that term is inherently vague) but my concern is whether it is well defined, and if it is, how it is defined within their theory. I find it quite difficult to come up with a cross-linguistically applicable definition which would also be compatible with UD.

If ditransitive datives (i.e., those in clauses where there is another non-subject core argument which is not in the dative) are core arguments, then what do we do with verbs that take a single dative "object"? It would make sense to keep them as core as well. But they behave differently than proto-patients, and sometimes express semantic roles that are typically associated with adjuncts, such as the beneficiary.

nschneid commented 2 years ago

The interpretation of obj vs. iobj is also discussed in UniversalDependencies/docs#832

nschneid commented 1 year ago

I realized we need to decide the case of verbs like instruct, inform, urge, and advise, which do not license the double object but do license object+clause. Assuming "The teacher taught the students." is iobj, should it be obj or iobj for:

And if the answer is that these are iobj, that is specifically because the verb licenses the ccomp (3rd one), right?* So tutored would fail because you can't say "tutored the students that..."? (We don't want to say that the object of every object control verb is automatically iobj.)

* urge + recipient + that-clause sounds marginal to me but there is an attestation in EWT

nschneid commented 1 year ago

@amir-zeldes

nschneid commented 1 year ago

I note there is an example in the main guidelines of "persuade someone to do something" as a control verb with obj + xcomp. As I understand the policy regarding ccomp taking the place of an obj, this would have to become iobj + xcomp, because you can persuade someone that something is true (iobj + ccomp). Even though "persuade" doesn't participate in the double object construction.

amir-zeldes commented 1 year ago

The "that" case seems clear, and I think there was consensus from the core group to treat ccomp as one of the obj candidates for the double obj prohibition. So yes, for that I would go with iobj.

For the xcomp case I'm not passionate about it one way or the other. This paper suggests that iobj + xcomp should trigger enhanced xsubj, so there are clearly some UD users who would expect iobj to be possible in these constructions.

Yeah, I'm convincible, but I guess I like maximizing the expressiveness of the labels we have, so if nobody objects strongly then I am fine with iobj if the xcomp dependent alternates with a ccomp in English (that's also a fairly tidy formal criterion, and semantics-free). But I don't take xcomp+obj in itself to be a violation of double object, I think it's the canonical analysis of causatives of the 'acc. cum inf.' type in UD.

nschneid commented 1 year ago

Yeah I think it would surprise people to say "want him to leave" should be iobj + xcomp rather than obj + xcomp. So we should allow obj + xcomp for verbs like "tutor" that license neither double-object nor object+ccomp.

nschneid commented 1 year ago

The paper you linked raises a combination I hadn't considered: ccomp taking the place of iobj, in something like "We should give [that you are new here] [due consideration]". So ccomp + obj should be valid, just not obj + ccomp?

amir-zeldes commented 1 year ago

Well, technically that's an icomp, if we were using those. Usually UD distinguishes the same things for nominal and clausal dependents with parallel labels, but this one has been neglected, probably because it is so rare. If that ever turns up in our data I guess we would have to deal with that...

nschneid commented 1 year ago

Honestly I'm wondering if we should just keep it simple and limit iobj to verbs that license the double object construction with actual nominals. Otherwise we are defining iobj not in terms of one alternation, but in terms of an OR of two alternations: there are verbs like persuade which license object+ccomp, verbs like give which license double object, and verbs like teach which license both. And lots of examples floating around like figure 2 of the above paper where persuade, promise etc. have obj rather than iobj. [UPDATE: Though we have a strong argument for promise: it licenses the double object construction.]

amir-zeldes commented 1 year ago

I understand where you're coming from, but treating obj and ccomp differently is a big price to pay IMO. For communication verbs it seems particularly jarring (I think "Kim" is equally iobj in "I told Kim the story" or "that I was hungry"), and we can no longer rely on 'no-double-obj' in validation, or generally the idea that things like pronouns can stand for both nominals and clauses (which they regularly do).

So overall my tendency would be to treat ccomp and obj the same for the purpose of identifying 'double object' constructions. In practice I don't think I've ever seen an 'icomp' in data, but if we saw one I would either make it a one off exception to obj+ccomp, or subtype, or find some other workaround. I wouldn't let the lack of a label like icomp derail the otherwise mostly elegant and useful parity of UD clause and nominal handling.

nschneid commented 1 year ago

For communication verbs it seems particularly jarring (I think "Kim" is equally iobj in "I told Kim the story" or "that I was hungry")

I agree—because "tell" licenses the double object construction, we can say "Kim" is iobj whether it's tell Kim the story, tell Kim that..., tell Kim to..., etc.

My worry about treating all ccomps as object-like is that it will surprise people to see persuade, promise, etc. with an iobj. And it's another potential paraphrase that annotators have to consider as a criterion when determining whether a verb licenses iobj. With instruct and urge, for instance, a ccomp may be possible but not canonical, and this could trip up annotators looking at I instructed them in the topic, for example.

nschneid commented 1 year ago

@jnivre I think you raised the point about obj+ccomp for verbs like tell: would you prefer iobj also for verbs like instruct, urge, and persuade?

amir-zeldes commented 1 year ago

I'm not sure we want to change 'persuade X to VERB' to have iobj based on the behavior of 'persuade that' (we might consider these to be different lexical entries/senses/argument structures), but for "persuade someone that something..." I think we should be consistent and use iobj for 'someone' (since "that something" already occupies the core direct object slot).

nschneid commented 1 year ago

If "someone" in "persuade someone to leave" is obj, would you say that "someone" in "tell someone to leave" should also be obj?

amir-zeldes commented 1 year ago

I guess that could be OK. I mean, can you say "tell to leave to someone"? I think basically our choice boils down to whether we consider that/to clause alternation to belong to 'the same verb'. If it's the same 'tell' (or persuade, etc.) then it makes more sense to want to keep iobj for the same dependent across constructions. But if we think the valency belongs to a different version of that verb, then that argument falls away and there is no longer a justification for iobj there. I agree that persuade and tell should behave the same way (both can take NP + that clause, both can take NP + to infinitive, the xcomp implies argument sharing for both infinitive structures).

nschneid commented 1 year ago

I don't think they're different senses of the verb. Note that you can't say "tell that it's a bad idea to someone" either; the dative alternation only works with two objects vs. object+prepositional object, not object + clause.

nschneid commented 1 year ago

My take at a high level is that there's a good reason for a whole literature on argument structure: which verbs license which complementation/subcategorization patterns (and how these are linked to semantic roles) is pretty complicated, and largely not derivable by transformations—rather, a list of complementation patterns needs to be specified in a lexical entry for each verb.

While it may be intuitive for some verbs to think of a ccomp and a direct object as corresponding to the same valency slot—or the indirect object in a double object construction and the alternating recipient-object in a non-prepositional monotransitive—we cannot treat one as universally alternating with the other: there are verbs that license double-object but not recipient-monotransitive, and vice versa; there are verbs that license double-object but not obj+ccomp, and vice versa.

Double-object vs. prepositional dative construction is a pretty regular alternation, so I think we're on safe ground treating that as the test for iobj when there are two (or potentially two) nominals. We could formulate the test in terms of whether the verb licenses that alternation in general (even if the sentence in question has a complement clause, so as to make it iobj for tell someone that...). But if we want to prohibit all obj+ccomp combinations, and also to avoid changing an object's deprel when another complement is added, it will create a proliferation of iobj uses for verbs like persuade that license neither the double object nor the prepositional dative.

amir-zeldes commented 1 year ago

Well, the simplest solution as you say is to limit iobj to verbs that can take two nominals, and it seems those should be admitted either way. But I'm not sure that means that we should ever allow obj+ccomp on an actually attested case. In other words, I think we can make the infinitive cases all take obj (acc cum inf), but still do iobj+ccomp for that-clauses. I think the 'no double obj' restriction has largely proven itself in UD, and I'd like it to cover obj+ccomp (it's revealed a bunch of annotation errors in the past)
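A minimal sketch of what a 'no double obj' check extended to ccomp might look like. This is not the UD validator's actual code; the `(id, head, deprel)` tuples are a simplified stand-in for CoNLL-U rows, and the function name is made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Simplified stand-in for a CoNLL-U row: (token id, head id, deprel).
Token = Tuple[int, int, str]


def double_object_violations(tokens: List[Token]) -> Dict[int, List[int]]:
    """Map each head to its obj/ccomp dependents when it has more than one."""
    slots: Dict[int, List[int]] = defaultdict(list)
    for tok_id, head, deprel in tokens:
        # Compare on the universal relation, ignoring subtypes like "obj:x".
        if deprel.split(":")[0] in {"obj", "ccomp"}:
            slots[head].append(tok_id)
    return {head: ids for head, ids in slots.items() if len(ids) > 1}


# "I told Kim that I was hungry", with "Kim" labeled obj (flagged) ...
with_obj = [(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obj"), (4, 7, "mark"),
            (5, 7, "nsubj"), (6, 7, "cop"), (7, 2, "ccomp")]
# ... and with "Kim" labeled iobj (passes).
with_iobj = [(i, h, "iobj" if i == 3 else rel) for i, h, rel in with_obj]

print(double_object_violations(with_obj))
print(double_object_violations(with_iobj))
```

Under this check, obj + xcomp is still allowed, since xcomp is not counted as filling the object slot.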

nschneid commented 1 year ago

Then what would you do for "I persuaded someone"? If you say that's obj, but "I persuaded someone that I was right" is iobj, then you lose the invariant that adding another complement shouldn't change the deprel. And the same goes with "I persuaded someone to leave".

amir-zeldes commented 1 year ago

Good point. Then I guess the most consistent thing to do is iobj whenever either a second nominal (with prep. alternation) or a that clause is possible. That only leaves the question of what to do if the infinitive version is attested. Would you want iobj or obj there?

nschneid commented 1 year ago

I think the infinitive xcomp should not force iobj with cases like "I want you to leave": that should be obj. But I think it's fine to have it as iobj with "tell someone to leave", because it alternates with "tell someone that they should leave" and "tell someone the information".

But this will mean that there are a lot of verbs that take iobj as the sole complement (or with an xcomp), perhaps unintuitively: persuade, inform, urge, ....
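To make the proposal concrete, here is a small sketch of the decision rule as stated in this comment. The two verb sets are tiny illustrative samples based on verbs mentioned in this thread, not a vetted lexicon, and the function assumes something else has already identified which nominal is the recipient-like argument.

```python
# Illustrative samples only; a real lexicon would need careful curation.
DATIVE_ALTERNATION = {"give", "tell", "teach", "feed", "promise"}
OBJECT_PLUS_THAT_CLAUSE = {"tell", "teach", "persuade", "inform", "urge"}


def recipient_deprel(lemma: str) -> str:
    """Label for the recipient-like nominal of the verb with this lemma.

    iobj if the verb licenses either a second nominal (with the
    prepositional alternation) or a that-clause alongside the nominal;
    otherwise obj. The presence of an xcomp plays no role.
    """
    if lemma in DATIVE_ALTERNATION or lemma in OBJECT_PLUS_THAT_CLAUSE:
        return "iobj"
    return "obj"


for verb in ("want", "tutor", "tell", "persuade"):
    print(verb, recipient_deprel(verb))
```

With these sample sets, "want" and "tutor" come out as obj, while "tell" and "persuade" come out as iobj regardless of whether the sentence contains a clause.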

jnivre commented 1 year ago

Sorry for being slow to the party. As usual, there seem to be more corner cases to work out than you expect at first. I need to think this through more thoroughly, but it seems that something along the lines you suggest will be a good enough compromise.

I agree that any verb that takes two nominals and also allows the dative alternation should have iobj. I also agree that xcomp should not force iobj in the classic accusative cum infinitive cases. I am less certain about the ccomp cases, especially for verbs like “persuade”, where “persuade someone that p” is roughly equivalent to “persuade someone to believe that p”, whereas “tell someone that p” is not quite the same as “tell someone to believe that p”. This may be too subtle, but it may be worth considering iobj+ccomp for some verbs and obj+ccomp for others, although I realize that this may not be beneficial for annotation consistency.

amir-zeldes commented 1 year ago

Personally I think the benefit of preventing obj+ccomp in English outweighs the rare uncomfortable cases that may result. Cases of iobj + ccomp are extremely common, while "persuade", "urge" or "inform" with obj and no clause are quite rare - there are only two cases in GUM, both with 'inform', and both in a sense that arguably precludes a clause:

EWT has only one case, also with 'inform', in which I think it is reasonable to assume that a 'that'-clause would have been possible:

If the guideline requires that for iobj annotation the 'that' clause needs to be possible, then only the last case would receive iobj, which seems fine to me.

nschneid commented 1 year ago

For persuade, urge, inform, and promise, there are 17 in EWT with an obj and no ccomp—many of these have an xcomp: http://universal.grew.fr/?custom=63d2e7568cf3f [UPDATE: I realized promise isn't a good example here since you can promise somebody something. 11 EWT occurrences of urge, inform, persuade.] If "I persuaded him" should be iobj, then "I persuaded him to eat" should also be iobj I think. So most of the results would have to be changed: obj -> iobj.

Admittedly that's not good evidence of annotators' intuitions about obj vs. iobj because EWT was converted with the no-monotransitive-iobj policy. In GUM, there are 8 matches, and the 6 ones with persuade or urge would have to be changed under this policy (none of the GUM annotators thought persuade or urge without a ccomp should be iobj).

But these are just examples of verbs of this type—there are potentially many others.

At the end of the day, I think we're going to need a verb lexicon in the validator to check which should have iobj and which shouldn't. Are you saying obj+ccomp is important to prevent for theoretical reasons, or for practical reasons with more frequent verbs? Because we can easily list those frequent verbs in the validator. Harder to list every verb that might take an object with a that-complement.
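A sketch of the verb-specific alternative, where obj+ccomp is rejected only for verbs on an explicit list. The list contents, the `(id, head, lemma, deprel)` tuple format, and the function name are assumptions for illustration; this is not how the UD validator is currently implemented.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical list of frequent verbs for which obj+ccomp is rejected.
NO_OBJ_CCOMP_VERBS = {"tell", "teach", "show", "promise"}

# Simplified stand-in for a CoNLL-U row: (token id, head id, lemma, deprel).
Token = Tuple[int, int, str, str]


def lexicon_violations(tokens: List[Token]) -> List[int]:
    """Return ids of listed verbs that govern both an obj and a ccomp."""
    lemma_of = {tok_id: lemma for tok_id, _head, lemma, _rel in tokens}
    rels_by_head: Dict[int, Set[str]] = {}
    for _tok_id, head, _lemma, deprel in tokens:
        # Compare on the universal relation, ignoring subtypes.
        rels_by_head.setdefault(head, set()).add(deprel.split(":")[0])
    return sorted(
        head
        for head, rels in rels_by_head.items()
        if {"obj", "ccomp"} <= rels and lemma_of.get(head) in NO_OBJ_CCOMP_VERBS
    )


tell = [(1, 2, "I", "nsubj"), (2, 0, "tell", "root"),
        (3, 2, "Kim", "obj"), (4, 2, "hungry", "ccomp")]
persuade = [(1, 2, "I", "nsubj"), (2, 0, "persuade", "root"),
            (3, 2, "someone", "obj"), (4, 2, "right", "ccomp")]
print(lexicon_violations(tell), lexicon_violations(persuade))
```

The design trade-off is the one raised in this comment: a list catches the frequent tell-type errors cheaply, but any verb missing from the list passes silently.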

amir-zeldes commented 1 year ago

For persuade, urge, inform, and promise, there are 17 in EWT with an obj and no ccomp—many of these have an xcomp

Yeah, I was only looking for no ccomp AND no xcomp above. I understood the problem to be cases where there's just one nominal. I think we all agree that obj+xcomp is fine and should not be ruled out (regular causatives, etc.)

Are you saying obj+ccomp is important to prevent for theoretical reasons, or for practical reason

Ultimately both: I think an obj pronoun can stand for a clause, so limitations on obj should also apply to ccomp, at least in English.

At the end of the day, I think we're going to need a verb lexicon in the validator to check

We could do this pretty easily if we initially limit it to verbs which are actually attested with ccomp in the data (granted we could miss rare cases). I'm not especially worried about there being 6-ish tricky cases in a corpus the size of GUM, as long as we're getting the much more frequent case of 'tell someone/iobj' covered. If the consequence of that consistency is that 'I persuaded him to eat' will have to have iobj, I can live with that, and maybe it's theoretically more correct as well (iobj can be considered to imply structure sharing, and I think while presenting that paper Dag Haug actually complained that current English edeps don't produce an xsubj edge for iobj cases in GUM)

nschneid commented 1 year ago

Ultimately both: I think an obj pronoun can stand for a clause

A pronoun can be anaphoric to a clause, but not always as obj:

I suspect the more general statement is: WHEN a verb allows the alternation between object+ccomp and double-object, the semantics of the ccomp can be repackaged as a nominal obj. But it doesn't follow that obj+ccomp needs to be banned for verbs like persuade which do not license double-object in the first place. (As I said before, it is easy to prohibit obj+ccomp on a verb-specific basis to address the tell cases.)

Now, moving to non-ditransitives for a moment, it is true that obj often alternates with ccomp:

But we also see cases where this breaks down:

So, my contention is that these alternations vary by verb. I just worry that a blanket constraint forbidding obj+ccomp because ccomp often alternates with obj would be overzealous and cause unnecessary confusion, when we don't really need it to block the tell errors.

nschneid commented 1 year ago

To see if we're on the same page, here's an attempt to lay out the two options we're discussing: https://docs.google.com/document/d/1W3nbpJOh8qdpqvtNJAzotgEdUIQfgUc-W5yJQKphsIE/edit (world-readable doc)

jnivre commented 1 year ago

Thanks, Nathan. Nice summary. Among other things, this makes it clear to me that we will have to allow iobj + xcomp in some cases no matter what, which reduces the resistance I initially had towards Approach A. However, I do think both approaches have some merit, so it would be interesting to hear if there are additional arguments in favor of one or the other.

nschneid commented 1 year ago

Looking at VerbNet classes with Recipients, I suppose we also need to consider that the double object construction is not just for typical transfer/communication events and benefactives but also feed and bill, charge, pay, cost, owe, wager, etc.:

And "bet" is an interesting case: "I bet you $100 that I'm right." Should that be an allowable case of iobj+obj+ccomp?

amir-zeldes commented 1 year ago

Thanks for the great write-up! Overall I prefer approach A, because I think parity of ccomp and obj is a substantial win.

And yes, to be consistent, the guidelines should also cover double object verbs, and 'feed' fits the pattern despite not belonging to the semantic classes of communication verbs or verbs of giving. The alternation behavior is the same, with 'feed oatmeal to the baby' being possible, so iobj applies.

As for bet, I'm less sure there, since I don't think it's normal to say "??I bet $100 to you". I see the interrogative test with what, but I think it might be a different construction all the same based on passivization. In isolation I think the sum can be passivized ("$100 were betted on each horse" is OK right?), but not with the second object ("???$100 were betted me"), which normal ditransitives can do ("$100 were given me for..." is bookish but possible, based on a quick search). So maybe we could say when the sum and recipient of the bet are present, the sum is perhaps obl:npmod.

In any case I think we should not lose sight of the fact that quadrivalent verbs in problematic configurations where it is less obvious what to do are very rare overall. So implementing this proposal, at least for English, will make great improvements to many common constructions, even if some corner cases are still tricky, which might just be inevitable.

nschneid commented 1 year ago

My guess is that bet is licensed with the double object by analogy to other verbs that involve payment, but the prepositional dative doesn't work because you bet against someone, not to them. (Reflecting the antagonistic element of bet, which differs from trade, for example.)

The passive: