UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

Markers of the standard of comparison (and other uses of as/how/like-type items) #767

Closed josefkr closed 3 years ago

josefkr commented 3 years ago

Hi all,

I would like to solicit feedback on how certain items in German comparisons should be POS-tagged in accordance with UD. The main item I have in mind is wie when it is used to mark the standard of comparison. (It is the default marker of the standard in German comparisons of equality and a non-standard/regional marker in comparisons of inequality, for which the standard marker is als 'than'.)

I mention some other uses of wie below since, in looking at instances of wie, I also found inconsistencies/peculiarities in these other uses. I am only contrasting German and English treebanks here but would of course be interested in hearing how the relevant cases have been annotated in other languages.

Josef


Comparisons

In the first subset of comparative uses wie marks NPs. Here, wie seems to be used in a preposition-like manner.

  1. Sie wirkten wie frisch Verliebte. 'They looked like new lovers.'
  2. Er behandelt mich wie einen Bettler. 'He treats me like a beggar.'
  3. Die Gemeinde wird wie eine Steuerbehörde verwaltet. 'The city is administered like a tax authority.'
  4. Mein neues Büro ist halb so groß wie mein altes. 'My new office is half as big as my old one.'
  5. Mein “kleiner” Bruder ist 4 Jahre jünger als ich und ist 2 Köpfe größer wie ich. 'My little brother is 4 years younger than I and 2 heads taller than I.'

There are also comparative uses where full clauses are introduced by wie:

  1. Ich rannte so schnell, wie ich konnte. 'I ran as fast as I could.'
  2. Polster: Nicht mehr so groß, wie er früher war. 'Polster: No longer as big as he used to be.'

And cases where what follows wie is just an adjective or PP.

  1. Außerdem sollte die Tätigkeit so vielseitig wie möglich gestaltet werden. 'In addition, the activity should be designed to be as multifaceted as possible.'

Now, the UD version of the HDT treebank tags wie as CCONJ in many/most of the above three sets of comparative cases, which to me seems like adopting an ellipsis analysis for the non-clausal cases. In some cases, wie is marked as ADP. By contrast, in the Tueba D/Z UD treebank one mostly finds the POS tag ADP.

For perspective: in English, GUM and EWT never use CCONJ for as or like in comparable comparisons.
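A tally like the treebank comparison above can be reproduced with a few lines of Python over CoNLL-U text. This is only a minimal sketch: the helper names `row` and `tag_profile` are made up for illustration, and the two toy sentences are invented annotations, not lines copied from HDT or Tueba.

```python
from collections import Counter


def row(idx, form, upos, head, deprel):
    """Build one 10-column CoNLL-U token line; unused columns are '_'."""
    return "\t".join(
        [str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"]
    )


# Invented toy annotations (assumption, NOT real treebank data): the same
# phrase with wie tagged once as CCONJ and once as ADP.
SAMPLE = "\n".join([
    "# text = wie frisch Verliebte",
    row(1, "wie", "CCONJ", 3, "cc"),
    row(2, "frisch", "ADV", 3, "advmod"),
    row(3, "Verliebte", "NOUN", 0, "root"),
    "",
    "# text = wie frisch Verliebte",
    row(1, "wie", "ADP", 3, "case"),
    row(2, "frisch", "ADV", 3, "advmod"),
    row(3, "Verliebte", "NOUN", 0, "root"),
    "",
])


def tag_profile(conllu_text, form):
    """Count (UPOS, DEPREL) pairs for every token whose FORM matches `form`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Plain integer IDs only: skips multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[1].lower() == form.lower():
            counts[(cols[3], cols[7])] += 1
    return counts


profile = tag_profile(SAMPLE, "wie")
```

To run it on a real treebank, one would pass the contents of a `.conllu` file instead of `SAMPLE`; the resulting counter then shows at a glance which UPOS/relation pairs a treebank uses for wie.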

While I'm on the topic of comparisons let us also consider German als (and English than) in comparisons of inequality. The English EWT treebank uses SCONJ for than when it governs a clause as in

  1. I said Yep, had that truck longer than I have had most of my women.

It uses ADP when it governs an NP:

  1. Preserves a rate differential between the Redwood and Baja paths, although somewhat less than the current differential.

In the English GUM treebank things seem largely similar.

In German, CCONJ is used for als 'than' in the HDT treebank for these cases. In the Tueba treebank we find ADP (with NPs) and SCONJ (with non-NPs).

So the main issue here is: what arguments might there be for CCONJ over SCONJ/ADP? Are there treebanks/languages that do things in entirely different ways?

Introducing examples of a category

Apart from the above comparison cases, there are uses where wie introduces an example of a class, sort of like such as or like in English.

HDT uses CCONJ for these cases:

  1. Unter den Spielen befinden sich Titel wie MechWarrior 4 , Combat Flight Simulator II , Crimson Skies und Age of Empires II .
    'The games include titles {such as/like} MechWarrior 4, Combat Flight Simulator II, Crimson Skies, and Age of Empires II.'

In Tueba these instances of wie are tagged ADP.

In the English GUM treebank, the as in such as and like are tagged as ADP:

  1. He then appeared in a string of supporting roles in films like Woody Allen's Match Point (2005) ...

I would be inclined to use ADP for the German cases, too.

Conjunction-like

In some uses, wie is felt to function like a conjunction. Tueba treats these as CCONJ (whereas in comparisons it usually has ADP for wie with a noun phrase). These uses are relatively rare, and I haven't tried to find instances in HDT since that treebank uses CCONJ for wie in comparisons, which far outnumber this type.

  1. Am späteren Gewinn mit dem Wein würden sowohl die Obststifter, die Arbeitslosen wie er als Unternehmer beteiligt. 'The fruit donors, the unemployed and him as the entrepreneur as well would share in the later profit from the wine.'
  2. ICH trage Verantwortung, für das Gute wie das Schlechte. 'I am responsible for both the good and the bad.'
  3. Das ist so wahr wie tragisch zugleich. 'That is as true as (it is) tragic at the same time.'
  4. Insbesondere weiß ich die zwei Einschübe zu schätzen, die ihren Inhalt bei zugeklapptem Portemonnaie so zuverlässig wie simpel schützen. 'In particular I appreciate the two insets which protect their contents as reliably as simply when the wallet is closed.'

The first example of the above set combines wie with sowohl, which is most commonly used as part of "sowohl ... als auch" (both X and Y), which is treated as CCONJ. (Incidentally, in X as well as Y the second as is always treated as ADP, so as well as is different from both ... and in English.) The second example could also be translated to English as 'for both good and bad alike' but of course in German there's only the wie and nothing like an and. For the last two examples I can still perceive a comparative degree semantics and so wouldn't use CCONJ here.

An easy policy would be to stick with a comparison-treatment except for the somewhat funky cases like the first one where wie co-occurs with something that normally is part of two paired CCONJs.

Comparative and relative clauses

The CGEL grammar recognizes a class of comparative adjunct clauses, which are characterized by the fact that in the as-clause there usually is a verbal complement missing. (First three examples from EWT, last two from CGEL grammar.)

  1. We even arrived 10 minutes early as the website suggests _.
  2. Is he on his last legs or is it conceivable that he could be around for another 15 years as _ is claimed by some of the sources ive been looking at? (no subject)
  3. Also, as I mentioned _ to you, we have faxed your mark-up to Melanie Gray at Weil Gotschal in Houston to keep her informed. (no object of mentioned)
  4. [As I have already observed _,] no reason has yet been offered for this change.
  5. He phoned home every day, [as he'd promised to do _].

In EWT these as-clauses are treated as advcl and as is an SCONJ.

German has similar clauses. Below the first example is treated as an advcl with wie as SCONJ. In the second example, the wie-clause is treated as parataxis with wie as SCONJ. In the third and fourth, the wie-clause is treated as acl and wie as SCONJ. (All examples from Tueba).

  1. Es ist nämlich alles kein Problem, wie Leo immer wieder sagt. 'It's not a problem at all, as Leo keeps saying.'
  2. Der Angriff war "ein Versehen", wie sie heute sagt, "ich habe gerade für einen Kollegen die Urlaubsvertretung gemacht".
    "The attack was an accident", as she says today, "I was standing in for a colleague on vacation."
  3. "Wie ein Tier", wie er selber sagt.
    "Like an animal", as he says.

I would lean towards SCONJ as POS and advcl or parataxis for the syntactic dependency.

Finally, German wie also occurs in what one might call relative clauses. Note that no core argument is missing here. Also, in these examples the wie-clauses only modify the preceding nouns; you couldn't move the clauses away from where they are, at least in my opinion, unlike with the wie-clauses from the preceding set.

Tueba treats these instances of wie as SCONJ. In the following example, the wie-clause has an acl-relation to its head.

  1. Die Stadt, wie wir sie kennen, sie stirbt. 'The City as we know it, it is dying'

HDT treats the wie in the following example as an ADV in an acl-clause.

  1. Urheberrecht , wie wir es kennen , gibt es in manchen Ländern gar nicht . 'Copyright as we know it, does not exist at all in some countries.'

I didn't find good counterpart instances in the English treebanks but things like 'Could these changes lead to the inevitable death of the city as we know it?' do of course exist. This structure may be very much tied to the use of 'kennen/know'.

Both English and German have elliptical sentences consisting of a noun and an as/wie-clause where the verbs seem to vary more:

  1. Kirschen, wie wir sie lieben. 'Cherries {as|the way} we like them'
  2. Italy, as we love it
  3. History as We Lived It

I am not sure what to do about these.

Interrogative

Next, there are uses where wie is used as an interrogative adverb (for manner, means or degree). Accordingly I would tag it as ADV in all the uses illustrated in the next few sentences, i.e. in both main and embedded questions.

  1. Wie viele Unternehmen gibt es in Sachsen? 'How many companies are there in Saxony?'
  2. Wie verändert das Internet die Welt? 'How does the internet change the world?'
  3. Man wollte wissen, wie sich Fernsehen auf Kinder auswirkt. 'They wanted to know how television impacts children.'

Note: the ADV analysis for the embedded uses is found in the HDT and GSD UD treebanks for German. In the German Tueba D/Z treebank, the embedded questions were treated as SCONJ. For (minimal) comparative perspective: in the English GUM treebank, the adverbials of embedded questions (e.g. how, why) seem to be treated as SCONJ whereas in EWT they seem to be ADV.

I am inclined to stick with ADV for embedded interrogative adverbs like warum/why and wie/how. If somebody thinks this would be an error, please explain.

Temporal subordinator

Another use of German wie is as a temporal subordinator. These should be SCONJ, I take it. The German treebanks agree on this and I think similar uses of English as are also labeled with SCONJ.

  1. Wie sie die Tür öffnete, stürzten ihr die beiden Katzen entgegen. 'As she opened the door, the two cats rushed towards her.'
  2. Wir hörten, wie nebenan gestritten wurde. 'We heard/listened as/how they were fighting next door'
  3. If you want a wharf berthage then call the local harbour master as you near the Chathams and he will sort you out. (GUM)

dan-zeman commented 3 years ago

I would use SCONJ for all occurrences where wie marks a standard of comparison, regardless of whether the standard of comparison is a clause or a nominal. (These two types will still be distinguished in the relation between the standard of comparison and the compared adjective/adverb, following the UD guidelines: obl for nominals and advcl for clauses.) I would not use ADP for wie with nominals.

I don't think there is a reason to use CCONJ.

Stormur commented 3 years ago

I agree with @dan-zeman, but I would actually always use advcl for coherence, acknowledging systematic ellipsis. I think it is clearly shown that this is the case here (and in other languages), since even when just one nominal follows wie, it always has the same grammatical case as the element with which the comparison is drawn, implying that there is an underlying clausal structure parallel to the main one, reduced to minimum terms so as to avoid repetition as much as possible. A truly nominal comparison would sound something like Ich habe diesem Buch ein besseres gelesen. By the way, I admit that I have always found that part of the guidelines quite confusing...

I agree also with such elements not being CCONJ, this annotation just seems to stem from a semantic interpretation of comparison of equality, but syntactically the comparative block always appears to be subordinate.
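For readers skimming the thread, the two proposals made so far can be put side by side as a tiny decision rule. This is a rough sketch only: the policy labels "obl_or_advcl" and "always_advcl" and the function name are invented here, and the rule covers nothing beyond what the two comments above state (marker POS and the relation of the standard to its head).

```python
# UPOS proposed for wie/als/as/than as markers of the standard in both comments.
MARKER_UPOS = "SCONJ"


def standard_relation(standard_is_nominal, policy):
    """Return the relation between the standard of comparison and its head.

    policy "obl_or_advcl": obl for nominal standards, advcl for clausal ones.
    policy "always_advcl": advcl throughout, assuming systematic ellipsis.
    Both labels are hypothetical names used only in this sketch.
    """
    if policy == "obl_or_advcl":
        return "obl" if standard_is_nominal else "advcl"
    if policy == "always_advcl":
        return "advcl"
    raise ValueError(f"unknown policy: {policy}")


# 'größer wie ich': nominal standard; 'so schnell, wie ich konnte': clausal standard.
for label in ("obl_or_advcl", "always_advcl"):
    print(label, standard_relation(True, label), standard_relation(False, label))
```

The only point where the two policies diverge is the nominal standard, which is exactly the case the rest of the discussion turns on.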

nschneid commented 3 years ago

A reminder that comparative constructions are addressed here: https://universaldependencies.org/workgroups/comparatives.html

josefkr commented 3 years ago

Thank you for the comments and thanks, Nathan, for the pointer.

So, I can see the basic criteria for how to mark the components of a scalar comparison. I am curious though with respect to some aspects of the analysis that the working group's page presents. This relates to Stormur's comment re better always using "advcl" as the relation of the standard to its head.

Right now, the group's report says that when the head of the explicit material in the standard of comparison is a noun, the relation to be used is "obl". E.g. talent has rel "obl" to important in this example:

(i) as important [as SCONJ] as a player's talent

By contrast, in (ii) and (iii) the rel of sober and ever to better and pinker is "advcl".

(ii) He plays better drunk than sober

(iii) Your hair is pinker than ever

Now, in (i) as is treated as "case" whereas than is "mark" in (ii) and (iii). This choice is commented on as follows:

" We err on the side of minimizing the postulation of unobserved structure and opt to treat these cases as just an oblique nominal complement. In consequence, the subordinating conjunction is attached as case rather than mark:"

But doesn't that statement conflict with


Another question I have pertains to how closely the working group follows the CGEL grammar. Huddleston and Pullum say that while for most cases there are arguments for both a reduced clause and a simple complement treatment, there are also other cases where the reduced clause analysis is not plausible:

"One initial point to make is that there are unquestionably some constructions where a single element following than/as is an immediate complement, not a reduced clause." They give examples including:

  1. I saw him as recently as Monday.
  2. It is longer than a foot.
  3. He's inviting more people than just us.
  4. He's poorer than poor.

For UD purposes, should cases like (1)-(4) then be treated as non-clausal following H&P, or should they, for uniformity's sake, nevertheless be assimilated to the examples in (i)--(iii), with "SCONJ" as the POS for as and than and rel "advcl" for (4) and potentially also for (1)--(3)?


Finally, the working group page doesn't discuss non-scalar comparisons. As I read H&P on these cases (§5.3 in Ch 13), they do here distinguish clausal cases from phrasal cases:

"In the majority of cases (unlike those cited for scalar such in [28] of §4.3) it is not possible to add a verb, and the NP following as is best regarded as an immediate complement, not a reduced clause (§2.2). "

(a) Would you yourself follow such advice as you give me _ ? (CGEL)
(b) To a man such as he is, this was hard to stomach.
(c) The Night's Watch was the place where broken men like he is now were once sent to live a life of servitude in exchange for their ills
(d) The choice depends on such factors as costs and projected life expectancy. (CGEL)
(e) The world needs people like you.

Would people be prepared to follow CGEL on this? Or would you want to treat non-scalar comparisons the same as scalar comparisons and thus, e.g., use "SCONJ" as the POS tag for like and as in every case?

Also, what would be the heads of the as/like-clauses/phrases with non-scalar comparisons? (For scalar comparisons, the choice was to treat the modified adjective/adverb as the head, i.e. in "Martin is more intelligent than Donald", Donald depends on intelligent, not than.) If it is the nouns, then the rel should be "acl", presumably?

dan-zeman commented 3 years ago

(e) The world needs people like you.

nmod(people, you) case(you, like)

(b) To a man such as he is, this was hard to stomach.

I am undecided as to whether man or such should be the head. Also such as could be a fixed multi-word expression, and it could be the predicate of a copular adnominal clause:

acl(man, such) fixed(such, as) nsubj(such, he) cop(such, is)
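The rel(head, dependent) shorthand used above is easy to turn into a lookup table when comparing alternative analyses. The following is a minimal sketch, assuming every word form occurs only once in the sentence (true for the two examples here, not in general); `parse_arcs` is a name made up for illustration.

```python
import re

# Matches one arc written as rel(head, dependent).
ARC = re.compile(r"(\w+)\(([^,()]+),\s*([^,()]+)\)")


def parse_arcs(notation):
    """Map each dependent to (head, relation); reject words with two heads."""
    arcs = {}
    for rel, head, dep in ARC.findall(notation):
        head, dep = head.strip(), dep.strip()
        if dep in arcs:
            raise ValueError(f"{dep!r} has more than one head")
        arcs[dep] = (head, rel)
    return arcs


# The two analyses given in this comment.
arcs_e = parse_arcs("nmod(people, you) case(you, like)")
arcs_b = parse_arcs("acl(man, such) fixed(such, as) nsubj(such, he) cop(such, is)")
```

With such a table, two competing analyses of the same sentence can be diffed word by word, for example to see that only the head of such changes if such rather than man is taken as the head.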

Stormur commented 3 years ago

Given the occasion, I will try to summarize my two cents about the issue here, something which I have been planning to do for a while... please pardon the usual length.


(e) The world needs people like you.

nmod(people, you) case(you, like)

I admit that I still cannot wrap my head around this solution. As @josefkr noticed, it is true that the official treatment of comparatives shows some contradictions as regards the alternation case/advcl and the choice of a head.

In this case in particular, could we not consider the ellipsis of a copula?

(e) The world needs people like you [are].

This might be substantiated by some languages where that you keeps a nominative (or equivalent) case. The same goes for the other examples, where the "unquestionability" that "some constructions where a single element following than/as is an immediate complement, not a reduced clause" sounds a bit hasty to me:

  1. I saw him as recently as Monday [is].
  2. It is longer than a foot [is long].
  3. He's inviting more people than just us. (-> than just we are)
  4. He's poorer than poor [is].

The third one is more problematic because of the pronoun form, but in general, in these constructions the omitted element might be seen either as a copula or as the repeated predicate ("is long"), and both are easily "cut". More importantly, there is always the potential for a full clause without changing syntax, as in I saw him as recently as Monday can be (I hope this is grammatical, but I think the point stands, as many similar examples can be found). A similar alternation with as and like:

(By the way, in sentence 1 can we see a mismatch between the adverbial form recently and Monday? This would confirm the clausal nature of the comparative, i.e., does as recently as on Monday work the same way?)

So, I see this as a point to always annotate such comparatives as clauses. In some cases we will have advcl, in others (like (e), 1, 3) acl tied to a non-verbal element (people, recent[ly], people), but the treatment would be uniform as clauses and clear overall, with no hardly understandable distinctions such as

  1. He plays better drunk than sober -> advcl
  2. I like red wine more than white -> acl??

where the second one is an advcl too, tied to like, since its underlying form is clearly I like red wine more than I like white wine (and in Czech, white keeps the accusative case of red in the main clause). We can compare this to truly nominal comparatives, to which I can add Latin

where melle '(than) honey' is in the ablative, the Latin oblique case "par excellence" and different from the nominative of oratio 'speech', and is not introduced by any other element (by the way, this is an unusual construction in Latin. Incidentally, also Mongolian makes use of this strategy, but without explicitly marking the degree: you tall from me = you are taller than me).

On the other hand, if one decides to treat some comparatives as nominals, by means of case, why not simply consider as/than/like ADPs in those contexts? Keeping SCONJ still does not remove the disparity of treatment between advcl and obl.


(b) To a man such as he is, this was hard to stomach.

I am undecided as to whether man or such should be the head. Also such as could be a fixed multi-word expression, and it could be the predicate of a copular adnominal clause:

acl(man, such) fixed(such, as) nsubj(such, he) cop(such, is)

I think, independently from the status of such as, that the better head here is man.

In general, contrary to what the guidelines seem to suggest (they are not so clear on this point), I think that the head of a comparative is better seen to be the head of the graded/contrasted/compared/... phrase, not the grading element itself.

For example, starting from the consideration that in

flour is the head of its phrase X, which will then be part of an as X as Y construction, it then seems quite awkward to choose (as) much as the head of the as-comparative in

I think this can be solved simply by seeing that as the recipe... is an acl to flour, which is graded by as much. This is also much more consistent with all the other cases where the head is an adjective. Also, in

I would suggest as I previously... to be advcl of hear, not as often, which "grades" the predicate.

Such choices would also avoid systematic discontinuities or non-projectivities in comparative constructions, which do not seem really justified, seeing how commonly they occur with an annotation based on the adverbial element.
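The non-projectivity point can be checked mechanically. Below is a minimal sketch of a standard projectivity test (an arc is projective if every word between head and dependent is dominated by the head). The sentence "Add as much flour as the recipe requires" and both head assignments are hypothetical illustrations written for this sketch, not annotations taken from a treebank or from the guidelines.

```python
def dominates(heads, ancestor, node):
    """True if `ancestor` is reached by following head links up from `node`."""
    cur = node
    while cur != 0:
        cur = heads[cur]
        if cur == ancestor:
            return True
    return False


def nonprojective_arcs(heads):
    """Return (head, dependent) arcs spanning a word the head does not dominate."""
    bad = []
    for dep, head in sorted(heads.items()):
        if head == 0:
            continue  # the root attachment has no span to check
        lo, hi = sorted((head, dep))
        if any(not dominates(heads, head, k) for k in range(lo + 1, hi)):
            bad.append((head, dep))
    return bad


# Hypothetical sentence, 1-based token positions:
# 1 Add  2 as  3 much  4 flour  5 as  6 the  7 recipe  8 requires
base = {1: 0, 2: 3, 3: 4, 4: 1, 5: 8, 6: 7, 7: 8}

# Analysis A (assumed): the as-clause attaches to the grading element 'much'.
heads_grading = dict(base)
heads_grading[8] = 3
# Analysis B (assumed): the as-clause attaches to the graded noun 'flour'.
heads_noun = dict(base)
heads_noun[8] = 4

print(nonprojective_arcs(heads_grading))
print(nonprojective_arcs(heads_noun))
```

Under these assumed head assignments, attaching the clause to much makes the arc cross over flour, which much does not dominate, while attaching it to flour keeps the tree projective.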



TL;DR - In my opinion, I think that: