UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Spurious VBG xcomps #410

Open nschneid opened 1 year ago

nschneid commented 1 year ago

Many of these should be ccomp or advcl:

Some are borderline—would be good to have examples in the xcomp guidelines.
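For anyone who wants to pull a candidate list like this from a local copy of the treebank, here is a minimal sketch. It assumes standard 10-column CoNLL-U with PTB tags in the XPOS column. The function names and the demo sentence, including its annotation, are illustrative and not taken from EWT.

```python
def parse_conllu(text):
    """Yield (sent_id, tokens) per sentence; tokens maps token ID -> fields."""
    sent_id, tokens = None, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if tokens:
                yield sent_id, tokens
            sent_id, tokens = None, {}
        elif line.startswith("#"):
            if line.startswith("# sent_id"):
                sent_id = line.split("=", 1)[1].strip()
        else:
            cols = line.split("\t")
            # Skip multiword ranges (1-2), empty nodes (1.1), malformed rows.
            if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
                continue
            tokens[int(cols[0])] = {
                "form": cols[1],
                "xpos": cols[4],
                "head": int(cols[6]),
                "deprel": cols[7],
            }
    if tokens:
        yield sent_id, tokens


def find_vbg_xcomps(text):
    """Return (sent_id, form, head_form) for every VBG token attached as xcomp."""
    hits = []
    for sent_id, tokens in parse_conllu(text):
        for tok in tokens.values():
            # Compare the universal part of the relation, so xcomp:* subtypes match too.
            if tok["xpos"] == "VBG" and tok["deprel"].split(":")[0] == "xcomp":
                head = tokens.get(tok["head"])
                hits.append((sent_id, tok["form"], head["form"] if head else "ROOT"))
    return hits


# Hypothetical demo sentence, built with explicit tabs so the columns are unambiguous.
ROWS = [
    ("1", "I", "I", "PRON", "PRP", "_", "2", "nsubj", "_", "_"),
    ("2", "suggest", "suggest", "VERB", "VBP", "_", "0", "root", "_", "_"),
    ("3", "sending", "send", "VERB", "VBG", "_", "2", "xcomp", "_", "_"),
    ("4", "them", "they", "PRON", "PRP", "_", "3", "obj", "_", "_"),
]
DEMO = (
    "# sent_id = demo-1\n# text = I suggest sending them\n"
    + "\n".join("\t".join(row) for row in ROWS)
    + "\n"
)

for hit in find_vbg_xcomps(DEMO):
    print(hit)
```

The output is only a list of candidates: deciding whether each hit should stay xcomp or become ccomp or advcl still needs a human reading of the sentence.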

amir-zeldes commented 1 year ago

Some of these are definitely errors, but I don't know what to fix them to without resolving amir-zeldes/gum#104, so I will hold off on this until the next meeting.

nschneid commented 1 year ago

Today's decision: we like xcomp for the 3 examples in the bullet points above

nschneid commented 1 year ago

@dan-zeman perhaps you could clarify this language from the xcomp page:

Pro-drop languages have clauses where the subject is not present as a separate word, yet it is inherently present (and often deducible from the form of the verb) and it does not depend on arguments from a higher clause. Thus in neither of the following two Czech examples is there any overt subject, yet only the second example contains an xcomp.

This is followed by 2 examples. How does "I-have" attach in these examples? If it's an aux within the subordinate clause in both cases, I guess I'm confused about why the 2nd is xcomp.

If in the 2nd example "I-have" is within the matrix clause, then is the point that the subjecthood is sufficiently concrete to consider it an xcomp control construction, even though the matrix has no nsubj per se? Is any "understood" subject of the matrix clause fair game for controlling an xcomp (as in English imperatives etc. above), or is this limited to pro-drop where it's explicit via inflection of a verb/aux?

dan-zeman commented 1 year ago

> This is followed by 2 examples. How does "I-have" attach in these examples? If it's an aux within the subordinate clause in both cases, I guess I'm confused about why the 2nd is xcomp.

It is aux(slíbil, jsem) in both examples. But in example 11, slíbil jsem is the predicate of the subordinate clause, while in example 12 it is the predicate of the main clause. The subject of slíbil jsem is (Person=1, Number=Sing), and it is also the subject of psát in example 12 and of píšu in example 11. But in 12 the two subjects are mandatorily coreferential (required by the main verb slíbil), while in 11 the coreference is a coincidence. You could say "I write because my wife promised it", but you could not interpret 12 as "I promised that someone else will write". Example 12 could also be modified to the 3rd person, giving Slíbil psát "He promised to write" (= he promised that he will write), or Martin slíbil psát "Martin promised to write".

nschneid commented 1 year ago

(clarified in the docs)

Any preferences regarding English? After thinking it over, my gut feeling is that xcomp should be OK if the matrix subject is implicit, even though it's not reflected morphologically in English like it would be for a pro-drop language. However, I'm hesitant to include implicit recipient arguments like with "recommend"/"suggest"/"advise".

dan-zeman commented 1 year ago

Thanks for the clarification in the docs.

I am hesitant about implicit recipient arguments too, but again that is mainly based on Czech (for English, I think native speakers should voice their preference, especially the Stanford people / LFG people who came up with the concept of xcomp). In Czech, you can have subjects coreferential with dative arguments of the main verb (doporučil jsem mu psát "I recommended him to write"); those constructions would have xcomp. But if the dative argument is omitted (doporučil jsem psát "I recommended to write"), I'm less sure it should stay xcomp. The dative argument is really generic and missing; it's not observable in the verbal morphology like the nominative argument (subject).

amir-zeldes commented 1 year ago

> But if the dative argument is omitted (doporučil jsem psát "I recommended to write"), I'm less sure it should stay xcomp.

I think it should remain xcomp, based on a general principle that ellipsis should not change remaining structures. I think the interpretation and form remain the same as when the dative participant is present. It's true that the person being recommended to is not mentioned, but they are implicit (recommending implies a recommendee), and whoever that is, they are also the implicit subject of writing.

nschneid commented 1 year ago

What if we know the suggestion is about a third party? Would the following be natural:

If B's answer requires that the suggestee be the one going camping, then that would be consistent with xcomp. @amir-zeldes do you think the third-party reading is possible? I think I would prefer "I suggest sending them on a camping trip", but I'm not sure "I suggest going" is ruled out.

amir-zeldes commented 1 year ago

Sorry for the slow reply - yes, I think I would accept B above. Not sure how we can tell if the intention is "I suggest to YOU that THEY should go" or "I suggest to THEM (through you) that THEY should go" - in the second case, it goes back to the elliptical reading I alluded to above. But I think if someone told me that in this exceptional case they went with ccomp because of non-sharing of the participants, I guess I would be OK with that too.

nschneid commented 1 year ago

So that's the tricky thing—xcomp doesn't just mean the two predicates' arguments are interpreted in the sentence to be the same, it means the construction mandates that they be the same. But it's hard to diagnose that sometimes...maybe we should just say, if in this construction ("suggest X-ing") they are normally or by default interpreted as the same (the suggestee = the X-er), then it's xcomp.

amir-zeldes commented 1 year ago

> it means the construction mandates that they be the same

I'm not sure if it's just a nuance, but there are also adverbial purpose infinitives where that is the expected interpretation (I hesitate to use 'mandate' for anything). For example in "I did it to make money", we usually annotate "make" as advcl, but there is pretty much no other reading except for PRO==I.

For me the mandatory part of xcomp is that the clause itself is mandatory, as in "I used to go", where we more or less have to say "to go", otherwise we lose that reading of "used". The fact that the arguments are shared is de facto so, but even if we said there was an exceptional circumstance in which they are not, I'm not sure that would invalidate annotating it as xcomp when (a) the clause is obligatory (so not advcl) and (b) the arguments are shared (so xcomp, not ccomp).

nschneid commented 1 year ago

I don't think we want to define xcomp as any complement clause where the overt or implied subject is understood to be coreferent with a matrix argument. One reason is pro-drop as discussed here. The sharing should be specifically licensed/expected by the matrix predicate (as used in that construction). For English I can't think of a case with a complement clause where the coreference is very clear but not necessarily preferred by the predicate. For "suggest"-type verbs I think there's room for debate about whether implicit oblique arguments are fair game but I'm fine with including them under xcomp.

amir-zeldes commented 1 year ago

I looked at the example but I'm not sure I understand how that relates - that pro-drop case is not a complement clause at all (it's a causal adjunct clause). I think what I wrote above would work in a pro-drop language, at least in something like Polish which I take to be similar to Czech in this respect. In any case, I'm not sure there is a concrete case we disagree on, unless you're saying that the unusual reading of "I suggest going camping" means that "I suggest to you to go camping" should not be xcomp by association.

nschneid commented 1 year ago

> I looked at the example but I'm not sure I understand how that relates - that pro-drop case is not a complement clause at all (it's a causal adjunct clause).

True—@dan-zeman is it worth adding a pro-drop example that is ccomp and not xcomp because the coreference is coincidental?

For English, let's go with "suggest X-ing" and so on as xcomp in their normal interpretations.
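A consistency check for this convention could look roughly like the sketch below. It assumes tokens are already loaded as dicts with `id`, `form`, `lemma`, `xpos`, `head`, and `deprel` keys. The lemma list just reuses the three verbs named earlier in this thread, and the function name is made up for illustration. It only flags cases for a human to look at, since a third-party reading might still justify ccomp.

```python
# Verbs mentioned in this thread as taking an implicit recipient; extend as needed.
SUGGEST_TYPE = {"suggest", "recommend", "advise"}


def flag_non_xcomp_gerunds(tokens):
    """Return (head_lemma, form, deprel) for VBG complements of suggest-type
    verbs that are labeled ccomp rather than xcomp."""
    by_id = {t["id"]: t for t in tokens}
    flagged = []
    for t in tokens:
        head = by_id.get(t["head"])  # None for the root (head == 0)
        if head is None:
            continue
        if (
            t["xpos"] == "VBG"
            and head["lemma"] in SUGGEST_TYPE
            and t["deprel"] == "ccomp"
        ):
            flagged.append((head["lemma"], t["form"], t["deprel"]))
    return flagged


def demo_sentence(deprel):
    """Hypothetical analysis of "I suggest going" with a chosen label on "going"."""
    return [
        {"id": 1, "form": "I", "lemma": "I", "xpos": "PRP", "head": 2, "deprel": "nsubj"},
        {"id": 2, "form": "suggest", "lemma": "suggest", "xpos": "VBP", "head": 0, "deprel": "root"},
        {"id": 3, "form": "going", "lemma": "go", "xpos": "VBG", "head": 2, "deprel": deprel},
    ]


print(flag_non_xcomp_gerunds(demo_sentence("ccomp")))
print(flag_non_xcomp_gerunds(demo_sentence("xcomp")))
```

Restricting the check to ccomp is deliberate: a VBG advcl under one of these verbs can be a genuine adjunct, so flagging it would mostly produce noise.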

dan-zeman commented 1 year ago

> > I looked at the example but I'm not sure I understand how that relates - that pro-drop case is not a complement clause at all (it's a causal adjunct clause).
>
> True - @dan-zeman is it worth adding a pro-drop example that is ccomp and not xcomp because the coreference is coincidental?

Sure! Added in https://github.com/UniversalDependencies/docs/commit/be9d958d914468c732de8210bac8b0333f42af8b. (Context: I think that the first example was added after someone asked in a discussion whether a missing subject is a sufficient condition to warrant xcomp – so it was not primarily about xcomp vs. ccomp, but about xcomp vs. anything.)