UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Clefts: implementation #359

Closed nschneid closed 2 years ago

nschneid commented 2 years ago

New policy on clefts involves advcl:relcl, which needs implementation.

Usually it is clear from reading an "it" + relative clause how to interpret it. However, potentially clefted adverbial phrases are tricky to distinguish from extraposition.

I think these are clefts because the circumstantial phrase seems to be in focus:

I think this one is extraposition:

See also #176, it-extraposition

amir-zeldes commented 2 years ago

My problem with the PP cases is that they don't actually involve a relative clause. I could see doing all sorts of analyses here, but these are simply not interchangeable with a clear "which" clause, which is possible for core arguments like "It was Bob who(m)/that/which I saw". An extraposition analysis for the PP cases (even if the non-extraposed variant would sound quite weird) has the advantage of not dragging in a relative clause (which these really aren't IMO) and maintaining the normal argument structure plus expletive status of 'it'. For the clear relativizer cases, I think relcl is fine.

nschneid commented 2 years ago

My problem with the PP cases is that they don't actually involve a relative clause.

Syntacticians say that while relative clauses typically have an extracted nominal, some relative clauses have an extracted adverbial phrase like an adverb or PP. (One example is in free relatives with an adverb: "where you live".) As I see it, the way to make an it-cleft out of "Indians made conciliatory gestures toward Islamabad on his nudging" that focuses on "on his nudging" is to extract it, leaving a relative clause in the it-cleft: "It was on his nudging that Indians made conciliatory gestures toward Islamabad."

Some of these may be ambiguous for it-cleft vs. extraposition, but I have gone with the one I deem more likely.

nschneid commented 2 years ago

BTW, the cleft sentence could have focused on a different PP: "It was toward Islamabad that Indians made conciliatory gestures on his nudging".

This PP extraction may be specific to clefts, but it meets the extracted-element criterion for defining a relative clause.

nschneid commented 2 years ago

The extraction may be more obvious if it is an obligatory PP complement: "It was under the bridge that the troll lived." "It is to the river that this path leads."

amir-zeldes commented 2 years ago

It was on his nudging that Indians made conciliatory gestures toward Islamabad

Right, so we agree this is extraposition, and indeed, you can't say "It was on his nudging which Indians made...". I think it's just a plain that complementizer (IN, not WDT).

It was toward Islamabad that Indians made conciliatory gestures on his nudging

I don't think it matters which PP it is - either way, the "that" is the indeclinable English that, which resists case marking (i.e. a preposition)

It was under the bridge that the troll lived

Even in this obligatory one, "which" doesn't work. The "that" simply doesn't represent the PP inside the subordinate clause:

If you do advcl:relcl(bridge,lived) then the whole syntactic structure is misrepresented: there is no constituent "the bridge that the troll lived", and the sentence appears to be predicating that "it" was a particular bridge; in reality, the sentence is predicating about where the troll lived, that it was under the bridge.

nschneid commented 2 years ago

I'm saying it's not extraposition, it's extracted from a relative clause. Extraposition would be if the clause was delayed to put a lighter element first, but that's clearly not the case with "It was under the bridge and north of the meadow that he lived."

The it-cleft is definitely a weird construction, but the syntactic literature seems to agree that it builds on relative clauses in peculiar and restricted ways (e.g. no NP constituent is formed; PP extraction is allowed and not always compatible with "which").

I happen to agree in principle that "that" doesn't represent the PP, because I think relativizer "that" is never a pronoun, it is a subordinator/complementizer in all relative clauses (unlike "which", "who", etc.). But that is not UD's position. So we are forced to compromise.

amir-zeldes commented 2 years ago

I'm not sure what you mean by it not being UD's position - I thought it's that position that we're discussing ;)

If "that" is not a relativizer, then I don't see how the subordinate clause can be a relative clause. Maybe in other languages it could work that way, but for English these cases look and work the same way as a postponed clause, both in terms of argument structure (what is being predicated) and in terms of morphosyntax (incompatibility with case marking, impossible to substitute with a relative pronoun). I don't understand the motivation to treat these cases as relative (by contrast to "It was Kim who ate the cookie", where it makes sense).

nschneid commented 2 years ago

In normal adnominal that-relatives it's UD's position that "that" is a PRON. I'm saying that may not be ideal, but we're stuck with it.

For whether the second part of an it-cleft is a relative clause or not, I'm just going by what the English syntacticians say. (Emailed you one textbook explanation.) My understanding is that this is a consensus position. But if there is other literature arguing it's extraposition, I'd be happy to take a look at that.

amir-zeldes commented 2 years ago

In normal adnominal that-relatives it's UD's position that "that" is a PRON

Oh, I totally agree with that position: it's interchangeable with "which" and supports case marking, so why not? It's also historically correct if you compare across Germanic (it is the standard, case-markable relativizer in German)

My understanding is that this is a consensus position

I think it's been debated and is not generally agreed upon, here's an example from a corpus study:

http://icame.uib.no/ij32/ij32_7_34.pdf

Examples from Quirk et al. (1985: 953) such as It was because he was ill (that) we decided to return, and It was in September (that) I first noticed it show clauses (that we decided to return, and that I first noticed it) which do not modify noun phrases, but rather entire clauses (because he was ill, and in September, respectively). Their argument is based on the lack of a noun as antecedent, an argument which Miller (1999) has argued against. As pointed out in Miller, “the lack of a noun antecedent does not automatically disqualify a sequence as a relative clause” (1999: 17), however, the impossibility of replacing that with a WH-word does.

I think the substitutability issue is the one that bothered you about the POS tag too, but the lack of case marking and stranding options also make these look syntactically more like a mandatory extraposition (notwithstanding the fact that, information-structurally, they are still clefts - just not using a relativizer to mark it, which many languages don't, e.g. Polish)

nschneid commented 2 years ago

What do you mean by saying "that" supports case marking? For me at least, it doesn't inflect for genitive: *the book that's cover is torn

amir-zeldes commented 2 years ago

Right, I guess possessive 's is the exception because it's not a preposition. But you can say "The book that I studied with", and the preposition indicates the correct oblique relation in a normal way. BTW in German you can say the equivalent of "that's":

nschneid commented 2 years ago

You can also say "the book I studied with", with no relativizer. So I don't think the preposition is really case-marking or evidence for pronominal status of "that". (Even though we treat it as such in UD because it's sort of semantically convenient.)

Interesting that das does receive case marking in German. But I am firmly in the camp that English is not German. :)

amir-zeldes commented 2 years ago

"the book I studied with", with no relativizer

In structural and generative terms, I think that's often considered a zero relativizer, so still 'normal' (-ish)

English is not German

That's definitely true - but if we can do something one of two ways, then cross-linguistic (and particularly within-family) comparability can be a factor, especially if we want language comparison and multilingual applications to be a strong use case for UD data.

nschneid commented 2 years ago

I think it's been debated and is not generally agreed upon, here's an example from a corpus study:

http://icame.uib.no/ij32/ij32_7_34.pdf

Ah yes, we found that paper in the course of our discussions (not sure if you were at that meeting).

If I'm reading correctly this paper is about tests to distinguish extrapositions vs. clefts. It's not claiming that extraposition is an appropriate analysis of some clefts. p. 13: "the present paper argues that the two constructions can be reliably distinguished from each other"

p. 11:

The exact status of the cleft clause has similarly provoked debate, with opinions ranging from those arguing strongly for its analysis as a relative clause (Hedberg 1990; Huddleston and Pullum 2002) to those still holding notable differences between relative clauses and the nature of cleft clauses (Quirk and Greenbaum 1985; Miller 1996; Miller and Weinert 1996; Miller 1999; Biber et al. 1999). However, it suffices to say that most studies converge on the idea that cleft clauses are at least reminiscent of, even if not identical with, relative clauses.

So, cleft clauses are similar to relative clauses, but views differ as to whether they should be called relative clauses.

This suggests that advcl:cleft might be an alternative to advcl:relcl. (I know some treebanks for other languages use a :cleft subtype.) But there wasn't much appetite for this when we discussed it. (I personally could go either way on the label, but it would create an additional subtype, which I know we generally want to avoid.)

p. 18 addresses PPs:

it is not always the case that clefted constituents function as arguments of the cleft clause which they relate to. That is, in some cases, the cleft clause is not exactly a relative clause and the clefted constituent is rather an adjunct of the cleft clause. In such cases, the gap test does not hold in the same way as we have seen earlier, since the cleft clause is ‘complete’ without the ‘missing’ clefted constituent. Consider example (15a):

(15a) TS and i think i’ll be i’m sure i’ll get maturity onset diabetes KA it’s for the sugar that it has to secrete the insulin

The clefted constituent for the sugar is an adjunct of the cleft clause it has to secrete the insulin since it is optional and the clause is complete without it (admittedly, the cleft clause does allow the PP to be present, but it does not require it). Hence in examples such as (15a), the gap test may be considered a weaker and perhaps not entirely convincing means for establishing the desired cleft classification.

So, the author is saying this is a PP cleft, not an extraposition. But the PP is an adjunct to the cleft clause, so in her terms it is "not exactly a relative clause". She seems to be assuming that a relative clause has to be incomplete on its own, i.e., have an extracted argument (rather than adjunct). That stricter definition would mean that free relatives headed by "where", "how", etc. ("This is where I learned to play the banjo") do not properly contain a relative clause.

nschneid commented 2 years ago

The it-clefts are now easy to find: http://universal.grew.fr/?custom=6341ecaa73d91

Closing for now. We may want to reconsider the approach after comparing across languages, for example. With the it-clefts identified, any change should be straightforward to implement.
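
For anyone working from the CoNLL-U files offline, here is a minimal sketch of listing candidate edges by relation label. The names `find_edges`, `make_row`, and `SAMPLE` are made up for illustration. The only annotation taken from this thread is advcl:relcl(bridge, lived); the relations for "It" and "that" are left as `_` because their analysis is exactly what is debated above, and the remaining labels are placeholders rather than claims about the treebank. Whether every advcl:relcl edge is an it-cleft is not something this thread establishes, so treat the output as candidates to review.

```python
def make_row(tok_id, form, head, rel):
    # The 10 CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
    # Columns this sketch does not read are filled with "_".
    return "\t".join([tok_id, form, "_", "_", "_", "_", head, rel, "_", "_"])


# Hypothetical fragment for "It was under the bridge that the troll lived."
# Only the advcl:relcl edge comes from the discussion; "_" marks relations
# deliberately left unspecified, and the rest are illustrative placeholders.
SAMPLE = "\n".join(
    ["# text = It was under the bridge that the troll lived."]
    + [
        make_row(*row)
        for row in [
            ("1", "It", "5", "_"),
            ("2", "was", "5", "cop"),
            ("3", "under", "5", "case"),
            ("4", "the", "5", "det"),
            ("5", "bridge", "0", "root"),
            ("6", "that", "9", "_"),
            ("7", "the", "8", "det"),
            ("8", "troll", "9", "nsubj"),
            ("9", "lived", "5", "advcl:relcl"),
            ("10", ".", "5", "punct"),
        ]
    ]
)


def find_edges(conllu_text, deprel="advcl:relcl"):
    """Return (head_form, dependent_form) pairs for every edge labelled `deprel`."""
    edges = []
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        forms, rows = {}, []
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            tok_id = cols[0]
            if "-" in tok_id or "." in tok_id:
                continue  # skip multiword-token ranges and empty nodes
            forms[tok_id] = cols[1]
            rows.append((cols[1], cols[6], cols[7]))
        for form, head, rel in rows:
            if rel == deprel:
                edges.append((forms.get(head, "ROOT"), form))
    return edges


print(find_edges(SAMPLE))  # [('bridge', 'lived')]
```

Matching on the label alone keeps the sketch simple. A stricter filter (for example, also requiring an "it" dependent on the same head) would need assumptions about the annotation that I have not checked against the data.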