UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Comparative correlatives are inconsistent between EWT and GUM #395

Open nschneid opened 1 year ago

nschneid commented 1 year ago

(The X-er, the Y-er.) Are they relative clauses?

amir-zeldes commented 1 year ago

I think that's a standard analysis, since you can drop anything except the comparatives (so they are the heads), and you can insert "that":

nschneid commented 1 year ago

CGEL p. 1135:

[image: excerpt from CGEL p. 1135]

§4.4.2 says that this use of "the" is special, a modifier rather than determiner.

I don't see any mention of relative clauses here. Earlier in the chapter it discusses comparative clauses, most commonly introduced with "than" or "as". In "He becomes more cynical the older he gets", "the older [he gets]" is reminiscent of "as old [as he gets]". There is gapped material (as there is in relative clauses), but I don't think you can use "which" here.

I am thinking:

[The more sanctions bite]adjunct, the worse the violence becomes

root(worse)
det(worse, the)
advcl(worse, becomes) -- complement in an adverb phrase
advcl(worse, more) -- truly adverbial (a condition)
det(more, the)
advcl(more, bite) -- complement in an adverb phrase

Strict adherence to CGEL would argue against det, but advmod seems like a stretch.
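
As a rough illustration, the proposed arcs for "The more sanctions bite, the worse the violence becomes" can be written out per token and checked for tree well-formedness. This is only a sketch: the attachments marked ASSUMED (subjects, the second determiner, punctuation) are not part of the proposal and are filled in just so that every token has a head; `is_tree` and `to_conllu` are hypothetical helper names, not part of any UD tool.

```python
# Tokens of "The more sanctions bite, the worse the violence becomes" (1-based ids).
# Each row: (id, form, head, deprel); head 0 marks the root.
# Rows marked ASSUMED are not in the proposal; they only complete the tree.
ANALYSIS = [
    (1, "The", 2, "det"),          # det(more, the)
    (2, "more", 7, "advcl"),       # advcl(worse, more)
    (3, "sanctions", 4, "nsubj"),  # ASSUMED
    (4, "bite", 2, "advcl"),       # advcl(more, bite)
    (5, ",", 7, "punct"),          # ASSUMED
    (6, "the", 7, "det"),          # det(worse, the)
    (7, "worse", 0, "root"),       # root(worse)
    (8, "the", 9, "det"),          # ASSUMED
    (9, "violence", 10, "nsubj"),  # ASSUMED
    (10, "becomes", 7, "advcl"),   # advcl(worse, becomes)
]


def is_tree(rows):
    """Return True if there is exactly one root and every token reaches it."""
    heads = {tid: head for tid, _, head, _ in rows}
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    for tid in heads:
        seen = set()
        node = tid
        while node != 0:
            # A repeated node means a cycle; an unknown node means a dangling head.
            if node in seen or node not in heads:
                return False
            seen.add(node)
            node = heads[node]
    return True


def to_conllu(rows):
    """Render rows as 10-column CoNLL-U lines, leaving unspecified columns as '_'."""
    return "\n".join(
        "\t".join([str(tid), form, "_", "_", "_", "_", str(head), deprel, "_", "_"])
        for tid, form, head, deprel in rows
    )


if __name__ == "__main__":
    print(to_conllu(ANALYSIS))
```

Under these assumptions the arcs form a single-rooted tree headed by "worse", with "more" (not "bite") heading the fronted adjunct.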

nschneid commented 1 year ago

An argument could be made that that-clauses should always be ccomp rather than advcl. In which case the "complement in adverb phrase" deprels should be ccomp.

However, there are currently that-clauses attaching as advcl in degree/sufficiency/excess constructions (EWT, GUM). We lack guidelines on such constructions: UniversalDependencies/docs#672

nschneid commented 1 year ago

(BTW, PTB guidelines give up on this construction: they have a miscellaneous category X for constituents like "the sooner" and explicitly say that these are too rare and complicated to bother figuring out a real analysis for.)

nschneid commented 1 year ago

Reading CGEL more closely:

"The subordinate clause has the comparative phrase in front position in both versions, whereas the head clause has it fronted only when the whole subordinate clause is fronted." In

the subordinate clause (the condition) is NOT fronted, hence the comparative phrase (worse) in the main clause is not fronted. Later (p. 1136):

"The subordinate clause in both versions of the correlative comparative construction belongs to the class of content clauses."

To take an example

Structurally, CGEL is saying that the subordinate clause is much like the interrogative content clause "what food you devoured" in "I wonder what food you devoured"—only what is fronted is a comparative phrase ("the more food") rather than a WH-phrase ("what food").

The corresponding UD analysis would be

root(hurt)
advmod(hurt, more)
advcl(hurt, devour)
obj(devour, food)
amod(food, more)
det(more, the)? or det(food, the)?

and in the first sentence, "the more (that) sanctions bite":

advmod(bite, more)
det?(more, the)
mark(bite, that)
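
To make the difference between the two candidate analyses of "the more sanctions bite" concrete, here is a small sketch that stores each analysis as a dependent-to-(head, deprel) map and lists the tokens whose arcs differ. Several things here are assumptions rather than claims from the thread: the `nsubj` arcs, the placeholder `"ROOT"` for whatever the clause attaches to, and the attachment of "bite" as `advcl` in the content-clause version (by analogy with advcl(hurt, devour)). Optional "that" is left out, and `diff` is a hypothetical helper name.

```python
# Two candidate analyses of "the more sanctions bite",
# stored as dependent -> (head, deprel).
# "ROOT" is a placeholder for the main-clause head the phrase attaches to.
COMPARATIVE_HEADED = {               # "more" heads the phrase
    "the": ("more", "det"),
    "more": ("ROOT", "advcl"),
    "sanctions": ("bite", "nsubj"),  # ASSUMED
    "bite": ("more", "advcl"),
}
CONTENT_CLAUSE = {                   # the verb heads the clause
    "the": ("more", "det"),          # written "det?" in the thread
    "more": ("bite", "advmod"),
    "sanctions": ("bite", "nsubj"),  # ASSUMED
    "bite": ("ROOT", "advcl"),       # ASSUMED, by analogy with advcl(hurt, devour)
}


def diff(a, b):
    """Return {token: (arc_in_a, arc_in_b)} for shared tokens whose arcs differ."""
    return {tok: (a[tok], b[tok]) for tok in a if tok in b and a[tok] != b[tok]}


if __name__ == "__main__":
    for tok, (old, new) in diff(COMPARATIVE_HEADED, CONTENT_CLAUSE).items():
        print(f"{tok}: {old} -> {new}")
```

On this encoding only "more" and "bite" change: the two analyses swap which of them is the head, while the treatment of "the" stays the same.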

amir-zeldes commented 1 year ago

> says that this use of "the" is special, a modifier rather than determiner.

If we consider the etymology and comparative evidence, this 'the' is not the definite article, but an eroded inflected case form of the demonstrative stem (from back when 'the-' was still a demonstrative), which was grammaticalized to introduce a subordinate clause. This is still visible in the German equivalent 'desto', which contains the Germanic d-/th- stem and is formally distinct from the article. Given all this evidence, I think the proper solution should have been to xpos it IN, and deprel mark, but since PTB tags it DT, maybe it's best to just leave it alone.

> I don't see any mention of relative clauses here

It doesn't say it's not a relative clause either; all it says is that it's subordinate. The construction is really sui generis, but there is definitely a line of work classifying it as most resembling a relative going back at least to Culicover & Jackendoff (1999), with arguments including:

  1. This is the sort of problem which_i the sooner you solve t_i, the more easily you’ll satisfy the folks up at corporate headquarters.
  2. They failed to tell me which problem the sooner I solve t, the quicker the folks up at corporate headquarters will get off my back.

In any case, because the item being modified is adjectival or adverbial, even in a relative analysis you will get advcl as the main relation; I don't think it's an adverbial clause per se in linguistic terms, but in UD terms the best label is probably advcl:relcl IMO. That said, C&J also end up saying it is basically a totally unique construction, which is more or less what PTB says in the end.

nschneid commented 1 year ago

CGEL says the subordinate clause is a content clause, which excludes it from being a relative clause.

Those two C&J examples are totally ungrammatical for me. But speakers may differ.

amir-zeldes commented 1 year ago

I'm not sure content clause makes much sense here (and this is also not what C&J conclude), since, apart from the island issues which are conspicuous, comparatives aren't really supposed to have 'content' at all. They characterize a predication which selects for them, but is optional (this is like the way adjectival modifiers are described in HPSG, where they are adjuncts but have a MOD slot for the thing they modify). Notice also that, unlike content clauses, the clause cannot be pronominalized:

But:

This is not just a correlative issue - no version of "it" (both clauses or either one) works here. It's also not possible to introduce the clause with 'whether' as in a content clause:

> speakers may differ

Yeah, C&J also disagree in the paper itself and note for several examples that one of them rejects and the other accepts them.

At the end of the day I put more stock in their analysis than CGEL's, not only because it is a research paper devoted completely to this construction, but also because I think their conclusion is more honest: ultimately, CCs are a totally idiosyncratic construction. But if we want to use one of the existing labels for it, I think it's probably closest to a relative for the reasons they outline, and also semantically this makes more sense than assigning 'content' as the subordinate category. For me the relative analysis keeps parity with nested acl equivalents like "the faster the speed at which you drive...", so basically if we skip the nested nominal modification, we get a sort of haplology, where advcl:relcl+acl:relcl is replaced with just the first in "the faster you drive".

nschneid commented 1 year ago

Here's a paper from 2011—seems like this construction is still up for debate: https://tcue.repo.nii.ac.jp/?action=repository_action_common_download&item_id=614&item_no=1&attribute_id=21&file_no=1

amir-zeldes commented 1 year ago

Very nice, thanks! It looks like they're mostly interested in the relationship between the two parts, but about the internal structure they concur that:

"we argue (based on Iwasaki (2010b)) that the inner CP is a relative clause"

So it looks like that would be more support for advcl:relcl, right?