UniversalDependencies / UD_English-GUM

headless relative clauses vs indirect interrogative clauses #37

Closed sylvainkahane closed 2 years ago

sylvainkahane commented 3 years ago

There is a lot of confusion between headless relative clauses and indirect interrogative clauses. Most of the following examples are analyzed as interrogative clauses (the wh-word is PronType=Int) but are in fact headless relative clauses: http://match.grew.fr/?corpus=UD_English-GUM@2.8&custom=6123c797a52e9&clustering=e.label

In this example the same construction (what X think we'd hire) is used twice and is even analyzed once as a relative clause and once as an interrogative clause: http://match.grew.fr/?corpus=UD_English-GUM@2.8&custom=6123c8b1be814

nschneid commented 3 years ago

I get confused about this stuff too. Would love some documentation! UniversalDependencies/docs#454

sylvainkahane commented 3 years ago

Headless relative clauses are NPs and have other NPs in their paradigm:

What he did is strange
His behavior is strange
*Whether he will come is strange

Indirect interrogative clauses, by contrast, have their own paradigm:

I wonder what he did
I wonder whether he will come
*I wonder his behavior

But there are indeed cases where it is difficult or impossible to decide:

I don't understand what he did
I don't understand whether he will come
I don't understand his behavior
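The substitution test above can be written down as a small decision rule. This is only a sketch: the function name `classify_wh_clause` and the two boolean judgments are hypothetical, and the judgments themselves still have to come from a speaker, not from the code.

```python
def classify_wh_clause(np_ok: bool, whether_ok: bool) -> str:
    """Classify a wh-clause slot from two acceptability judgments.

    np_ok      -- an ordinary NP ("his behavior") is acceptable in the slot
    whether_ok -- a "whether" clause is acceptable in the slot
    """
    if np_ok and not whether_ok:
        return "headless relative"        # the slot patterns with NPs
    if whether_ok and not np_ok:
        return "indirect interrogative"   # the slot patterns with questions
    if np_ok and whether_ok:
        return "undecidable"              # both paradigms are available
    return "neither"


# Judgments taken from the three example sets above.
frames = {
    "__ is strange": (True, False),
    "I wonder __": (False, True),
    "I don't understand __": (True, True),
}

for frame, (np_ok, whether_ok) in frames.items():
    print(frame, "->", classify_wh_clause(np_ok, whether_ok))
```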

amir-zeldes commented 3 years ago

Thanks for reporting! Yes, most of the errors in the search you linked arise because student annotators also find this tricky and confusing, and some have a non-dependency theoretical syntax background that favors non-free relative analyses à la Bresnan. Because we have so many annotators, the risk of divergence with less intuitive constructions is higher than I'd like it to be...

For me the hallmark of the free relative cases is the idea that they saturate two roles (matrix and subordinate), which could hypothetically be divided into a matrix NP and a correlate WH NP in the subordinate clause (or at least this is how I explain it in class when people get confused):

I think "I don't understand his behavior" doesn't pass this test, but "I don't understand what he did" (=that which he did) does pass it. The "whether" cases are borderline but highly questionable for me (??I understand the fact whether he will come), so by convention "whether" is not supposed to get the free relative analysis in GUM (and likewise ccomp "if").

I'll keep this open until I find some time to wade through the actual search examples, but there are certainly multiple errors in there.

amir-zeldes commented 3 years ago

OK, these trees should be fixed upstream and will propagate for UD 2.9. As for the morphological tagging as "PronType=Int", some of the reported speech ones that are not free relatives will still be "Int", since this seems to match EWT behavior, and they are not really "Rel" if they are not free relatives (dominating acl:relcl). But the ones with subordinate clauses will now be Rel, as with all free relatives.
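For anyone who wants to check this convention locally, here is a rough sketch of a consistency check: it flags wh-words tagged PronType=Int that nevertheless dominate an acl:relcl dependent, i.e. that have the free relative structure but the interrogative feature. It assumes plain 10-column CoNLL-U input; the helper names (`parse_conllu`, `find_int_heading_relcl`, `row`) and the two sample trees are made up for illustration and are not taken from GUM or from any existing tool.

```python
def parse_conllu(text):
    """Yield sentences as lists of token dicts from CoNLL-U text."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword token ranges and empty nodes
        feats = set() if cols[5] == "_" else set(cols[5].split("|"))
        sentence.append({
            "id": int(cols[0]),
            "form": cols[1],
            "feats": feats,
            "head": int(cols[6]) if cols[6].isdigit() else 0,
            "deprel": cols[7],
        })
    if sentence:
        yield sentence


def find_int_heading_relcl(text):
    """Return (sentence index, token id, form) for each PronType=Int
    token that has an acl:relcl dependent."""
    hits = []
    for idx, sent in enumerate(parse_conllu(text)):
        relcl_heads = {t["head"] for t in sent if t["deprel"] == "acl:relcl"}
        for tok in sent:
            if "PronType=Int" in tok["feats"] and tok["id"] in relcl_heads:
                hits.append((idx, tok["id"], tok["form"]))
    return hits


def row(i, form, upos, feats, head, deprel):
    """Build one CoNLL-U line; unused columns are left as '_'."""
    return "\t".join([str(i), form, "_", upos, "_", feats,
                      str(head), deprel, "_", "_"])


# Two invented trees: a free relative mis-tagged as Int, and a true
# indirect question where Int is expected.
sample = "\n".join([
    row(1, "I", "PRON", "_", 2, "nsubj"),
    row(2, "ate", "VERB", "_", 0, "root"),
    row(3, "what", "PRON", "PronType=Int", 2, "obj"),
    row(4, "he", "PRON", "_", 5, "nsubj"),
    row(5, "cooked", "VERB", "_", 3, "acl:relcl"),
    "",
    row(1, "I", "PRON", "_", 2, "nsubj"),
    row(2, "wonder", "VERB", "_", 0, "root"),
    row(3, "what", "PRON", "PronType=Int", 5, "obj"),
    row(4, "he", "PRON", "_", 5, "nsubj"),
    row(5, "did", "VERB", "_", 2, "ccomp"),
])

flagged = find_int_heading_relcl(sample)
print(flagged)
```

Only the first sample sentence is flagged. The check deliberately runs in one direction: a PronType=Rel pronoun with no acl:relcl dependent is normal in ordinary headed relatives, so it is not reported.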

Thanks again for catching these!