Closed martinpopel closed 9 years ago
Not sure if @mcdm et al. DepLing'13 is current, but guidelines for this should likely go to http://universaldependencies.github.io/docs/u/overview/specific-syntax.html .
I don't know about other languages, but for Finnish we have the comparatives described here, if it helps (2.11 and 2.12): http://tucs.fi/publications/attachment.php?fname=tHaverinen_Katri13a.full.pdf
@mavela : thanks! Those sections have been converted into part of the UD-Fi material on compar and comparator, but the status of these types in UD remains open (https://github.com/universaldependencies/docs/issues/73#issuecomment-60207017)
I wrote up a treatment of comparatives here:
http://universaldependencies.github.io/docs/u/overview/specific-syntax.html#comparatives
based on what was in the DepLing 2013 paper, but updated towards UD, and making the discussion a fraction less English-specific, while not actually including any examples from other languages. :(
I presume it should be taken as tentative at this moment, and open for discussion and expansion.
I suspect that comparatives are infrequent enough that we don't want dedicated relations for comparatives in UD (relations for just one construction seem unfortunate). Besides, we're not meant to be able to change the relations in UD now for at least 12 months, Joakim says. :) However, it seems to me like the Finnish TDT analysis could be mapped onto the analysis here in a language-specific way by having compar as a specialization of advcl, and comparator as a specialization of mark.
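The label mapping suggested above can be sketched as a small lookup table. This is only an illustration of the proposal as worded in this comment, not an adopted conversion; the names `TDT_TO_UD`, `map_relation`, and `universal_part` are hypothetical, and the sketch changes labels only, not attachments.

```python
# Proposed (not adopted) mapping from TDT relation names to
# language-specific UD subtypes, as floated in this thread.
TDT_TO_UD = {
    "compar": "advcl:compar",
    "comparator": "mark:comparator",
}


def map_relation(tdt_label: str) -> str:
    """Return the proposed UD label, or the input unchanged if unmapped."""
    return TDT_TO_UD.get(tdt_label, tdt_label)


def universal_part(ud_label: str) -> str:
    """Strip a language-specific subtype, e.g. 'advcl:compar' -> 'advcl'."""
    return ud_label.split(":", 1)[0]
```

With the subtype stripped, a tool that only knows the universal relations would still see plain advcl and mark.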
Thank you for the detailed proposal!
I'll defer to the rest of the UD Finnish team regarding the mapping of TDT compar and comparator: @fginter , @jmnybl , @mavela , @ammiss , how do you feel about advcl:compar and mark:comparator?
From a quick check with @mavela, comparator -> mark:comparator looks unproblematic, and the comparator dependencies in TDT comparative structures appear to match up with how mark is used in analogous constructions in http://universaldependencies.github.io/docs/u/overview/specific-syntax.html#comparatives . However, compar is somewhat more challenging, and the TDT structures don't fully line up with those proposed for UD. (More to follow.)
I fully support Chris's proposal on this point. For languages where comparative constructions are sufficiently different from other adverbial clauses, language-specific subtypes may be used. (We may consider doing this for Persian, for example.) I am looking forward to hearing what the problems are with "compar" in Finnish.
Incidentally, the difficulty of deciding whether a bare nominal modifier should still be regarded as an elliptic clause or simply as a noun phrase is reflected in the discussion about which pronoun form to use. I think this is a classic case in prescriptive grammar for both English and Swedish, where grammarians would traditionally advocate "taller than I", with the argument that it is elliptic for "taller than I am", while popular usage tends to prefer "taller than me", suggesting that "than" works as a preposition here. Moreover, there are examples like "Peter is taller than his brother", which seems to be compatible with both a reflexive reading of "his" (taller than his own brother) and a general anaphoric reading (taller than someone else's brother). According to binding theory, if "than his brother" is a clause, the reflexive reading should not be possible.
Interesting. In Czech, the reflexive pronoun is a distinct word, so you would have either Petr je vyšší než jeho bratr (irreflexive) or *Petr je vyšší než svůj bratr (reflexive). Since you cannot say the latter (even if you actually mean Peter's brother), we could say that the comparative part is an elliptic clause in Czech.
(But of course, this does not necessarily mean that it is a clause also in English. One could argue that this is one of the many syntactic differences between the two languages.)
In TDT, both the yhtä hyvä kuin Y (as good as Y) and the parempi kuin Y (better than Y) structures are annotated so that the adjective (hyvä, parempi) is the head of the compar dependency rather than the first adverb (yhtä, as), which is quite commonly absent. Since it is also possible to drop the adverb from the positive-form comparison (e.g. kova kuin kivi is used in place of yhtä kova kuin kivi), we annotated the compulsory part as the head and the optional adverb as its dependent.
So if we map the TDT compar onto a subtype of advcl, our analysis still does not fully line up with the UD definition. It does not feel optimal to make the part which in most cases is not present the primary head.
@jmnybl : thank you! I believe this is the current TDT structure
There seems to be a difference even in English in that it is easier to view X as the head in the "as X as Y" construction than in the "more X than Y" construction. Whereas the first "as" seems optional, the "more" seems obligatory. However, the fact that "more" can also be realized morphologically does suggest that it should perhaps be treated as a function word, thus attaching the "compar/advcl" to its head instead. This would be compatible with our general treatment of function word modifiers, it would capture the essence of Chris's proposal, and it would facilitate the conversion of the Finnish treebank. These are obvious advantages. What are the disadvantages?
+1 for X as the head of compar/advcl instead of as/more in as X as Y and more X than Y.
Also, wouldn't X as the head also improve parallelism between some English constructions, e.g.
vs.
On the surface, yes, but you could argue that it attaches to -er (not hard-) in the second case, which makes the first pair more parallel. So it has to be coupled with the argument that "more" vs. "-er" is the type of "function-word-morphology-alternation" that we capture by never attaching anything to the function word.
Isn't "never attach anything to the function word" violated in these analyses?
We don't really have a principle that says "never attach anything to the function word". The principle is rather "whenever reasonable, attach to the content word head instead of to its function word dependent". But we have already defined a number of legitimate exceptions, one of which is adverbial modifiers (things like "almost every linguist", where "almost" attaches as "advmod" to "every"). So the question is whether this is another legitimate exception or not.
Fair enough, I'll accept a weaker position: the minor variant that Turku is proposing (content word as head, see above) has the advantage of not requiring the addition of a further exception to the list of cases where things attach to the function word.
(I'm afraid I don't seem to be able to decide between legitimate and other exceptions; could some of the factors going into the decision maybe be documented along with the rule and its exceptions?)
Sorry, I've been off busy with other stuff for a while....
I could be persuaded to take X as the head of the dependency to the comparative clause. It is certainly the minimal change that would make things consistent with Finnish. :)
If I'm honest, my reasons for being sympathetic to this analysis are that it would make things such as dependency conversion of Penn Treebank trees easier, and I suspect it would make parsing easier. Thus this was the way @mcdm and I had it in traditional SD....
Why we went with the opposite analysis is that @ngiordani and Tim Dozat felt fairly strongly that this was syntactically/semantically wrong, since (at least in English) the comparative clause is licensed by either more or the -er morpheme. This suggested extending the same analysis to as, but, as @jnivre notes, it's not really so clear for as, since you easily get that path is sharp as a razor's edge (though to my mind that does still sound kind of elliptical for as sharp as ...). At any rate, you don't get #She is intelligent than me. Well, you don't in standard English, but I googled around and it seems like you do get that sometimes in Indian English: Mahesh is Intelligent than Charan http://www.cinejosh.com/telugu-news-gossip/30145/mahesh-is-intelligent-than-charan.html; So don't worry if you think that you have a girl-friend, who is intelligent than you. http://www.forum.chillzee.in/lifestyle/fun-menu/3372-just-for-fun-an-intelligent-girl .
Nevertheless, I do basically think this argument is a sound point. It is the point of view of Huddleston and Pullum (2002), which discusses our choice of head as the "comparative governor" of the comparative phrase (p.1104). As they note and as discussed on the page http://universaldependencies.github.io/docs/u/overview/specific-syntax.html#comparatives, this governor may not even be a modifier of the head X but a modifier of a modifier of the head, as in constructions like This may be a more serious problem than you think. If we make X the head, we do lose this link between the comparative governor and the comparative dependent.
But I do think there are arguments both ways, the other side being discussed above, and to the extent that the so-called comparative governor is often optional, it is clearly less compelling to take it as the head of the comparative dependent.
More discussion/comments and then make a decision?
I think it is clear that the more/-er element is the head, and I don’t think we should give this up just to facilitate conversion (whether for English or for Finnish). However, the fact that there is an alternation between function words and morphology here, even in English, suggests that we could treat it in a way that is analogous to what we do with “right down the street”, where we attach “right” to “street” because it modifies “down the street”. You could argue that we have a similar structure here:
((more difficult) than you think)
And since “difficult” is the head of “more difficult”, it should also be the head of the larger phrase.
However, this implies that in “a more difficult problem than you think”, we attach “than” to “difficult” (not to “problem”), because there “more” modifies “difficult” and not “problem”.
Let me see if I understand Joakim's point:
@jnivre, you're saying that the 'than/as...' phrase is dependent on 'more', but since 'more' is a functional element (as evidenced by the alternation with a bound morpheme), it makes sense to attach it to the head of 'more'. This is parallel to what we do in 'right down the street'. Sound right?
I think that's reasonable. One question, though: what would we do with
Wheat raises blood sugar even more than sugar.
Yes, that is exactly my point. For the new example, we should do the obvious thing and attach "than" to "more". This can be motivated in two different ways. Either we view "more than sugar" as elliptic for "more (rapidly) than sugar" (or something like that), in which case the attachment to "more" is a case of "function word promotion by head elision" (see general principles in the guidelines). Or we view "more" as a content word meaning "to a higher extent", in which case this case is parallel to "faster than sugar".
More discussion/comments and then make a decision?
I think both alternatives have been fairly presented and would welcome a decision. How to proceed?
If you ask me, my proposal reconciles the differences between the two proposals. Both end up attaching "than" to "fun" in "more fun than I expected", but with different theoretical motivations. :)
The only potential discrepancy is in cases like "a more difficult problem than I expected". Here the promotion is from "more" to "difficult" (the content word head), not to "problem" (the head of the noun phrase). What does TDT do in this case?
In TDT "difficult" is the head also in cases where it modifies a noun.
I'm personally happy to support @jnivre's motivation of the content-head proposal. (Would it perhaps work to first decide between the "as/more-head" and content-head alternatives and discuss possible remaining variants afterward?)
Great! Then we just need to know that the Stanford/Ohio group is okay with attaching to the content word in analogy with what we do for modifiers of prepositions.
Note to myself: In PDT, we treat "more" and "less" as any other adverbial modifier of the adjective (so it is not a function word) and we allow it to have dependents if necessary. It is extremely difficult to find examples of comparative constructions using them, because comparatives are mostly morphological in Czech. But I found one [cmpr9406_005.a.gz]:
tatra byla méně kvalitní vůz než jiné vozy lit. "tatra was less good car than other cars"
Adv(kvalitní, méně) AuxC(méně, než) ExD(než, vozy)
The ExD relation says that this is an elliptical construction and that a verb is missing but anticipated.
If my understanding of the example is correct, this is in line with the original proposal from Stanford, but it could be changed to the new proposal by reattaching to the content head (in this case an adjective).
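A minimal sketch of that reattachment step, assuming the three PDT edges quoted above are represented as (label, head, dependent) triples keyed by word form. The `Edge` type and the function `reattach_to_content_head` are hypothetical names for illustration; the sketch keeps the PDT labels unchanged and only moves dependents of the function word up to its head.

```python
from typing import List, Tuple

# (label, head, dependent); word forms stand in for node ids, which
# assumes each form occurs only once among the edges (an assumption
# made for this illustration only).
Edge = Tuple[str, str, str]


def reattach_to_content_head(edges: List[Edge], function_word: str) -> List[Edge]:
    """Move every dependent of `function_word` up to that word's own head."""
    heads = [head for (_, head, dep) in edges if dep == function_word]
    if len(heads) != 1:
        raise ValueError(f"expected exactly one head for {function_word!r}")
    content_head = heads[0]
    return [
        (label, content_head if head == function_word else head, dep)
        for (label, head, dep) in edges
    ]


# The PDT edges from the example above.
pdt_edges = [
    ("Adv", "kvalitní", "méně"),
    ("AuxC", "méně", "než"),
    ("ExD", "než", "vozy"),
]

converted = reattach_to_content_head(pdt_edges, "méně")
```

Here než ends up under the adjective kvalitní instead of under méně, while the other two edges are left as they were.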
Well, the transformation could be done, yes. We could also say that "less" is the content word. In fact it is itself a comparative form of the adverb málo = "little" (so we actually still have morphological gradation, though irregular in the case of málo).
Since it is so rare, I am not going to argue against the proposed solution. I just wanted to save the example for the time when I will be writing the corresponding description of the Czech data.
Hi everyone,
I'm sorry, I failed to report the discussion we had at Stanford a few days ago -- we actually agree that Joakim's proposal works across languages. We're happy to adopt it for English! But we do think that the comparative clause should attach to 'difficult', not 'problem'.
I think we're ready to close this issue, right?
@ngiordani : Great, happy that we agree on this!
I think we're ready to close this issue, right?
One bit remains: the docs at http://universaldependencies.github.io/docs/u/overview/specific-syntax.html#comparatives should be updated to reflect the decision. (Can you at Stanford take care of this?)
I'll draft an update of the docs.
I made a minimal update and revised the examples to content-head form. Cross-check would be appreciated. Also, we should probably add @jnivre's motivation of the content-head alternative to the docs.
Thanks, Sampo! I can finish the documentation later today.
N.
@ngiordani : are you likely to modify this part of the documentation anytime soon? I thought I might add some examples from Finnish if not.
Sorry @spyysalo, dropped the ball on this! It's done now.
Great, thanks, I'll close this and add the Finnish examples to the revised docs.
Hm, reopening because I just thought of something else... shouldn't the dependent clause now be acl? Thoughts, @spyysalo, @manning, @mcdm, @jnivre, @tdozat?
(Note: what I meant is, should it be acl when the head is a noun. As in "more flour than necessary".)
Do you mean that in "more sausages than you bought last week", we would get "acl", but in "more important than you thought last week" it would be "advcl"?
Yeah, that's what I mean. Since acl is supposed to modify nominals...
(reopening in tracker)
@ngiordani @mcdm : are there still open questions here or can this be closed?
Well, I'm assuming we'll differentiate between acl and advcl in comparatives. So far no one's complained. Maybe @manning or @jnivre can give a nod here?
Nod
J
I'm okay with this. But I do just want to raise a possible alternative analysis, now that I've thought about it for a little.
It's not clear that the 'more sausages' case is parallel to 'more important', because in the former case there is no bound-morpheme comparative alternative, and I doubt there are languages that don't have a word for "more" or "large amount" in this case. So should we possibly, in this case, have the dependent comparative clause be a dependent of "more"? Then it would still (consistently) be an advcl. It would be like seeing "more" as elliptical for "more numerous sausages".
Actually, I think this new proposal makes a lot of sense. @jnivre, are you on board?
Sure. If I understand correctly then, the comparative clause always attaches as advcl to an adjective or adverb, never to a noun, and it attaches to "more" if there is no explicit adjective or adverb. Thus:
more difficult than you think: advcl(difficult, think)
harder than you think: advcl(harder, think)
more rapidly than you think: advcl(rapidly, think)
a more difficult problem than you think: advcl(difficult, think)
more problems than you think: advcl(more, think)
Is that correct?
Yes correct. I'll put that in the documentation. And call it closed.
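The head choice agreed on above can be sketched as a small rule, assuming we already know the comparative word and the word it modifies (with its UPOS tag). The function `comparative_clause_head` and the set `DEGREE_MARKERS` are hypothetical names, and treating exactly "more", "less", and "as" as free-standing degree markers is an assumption made for this illustration, not something fixed in the thread.

```python
from typing import Optional, Tuple

# Assumption for illustration: free-standing degree words that alternate
# with comparative morphology or are otherwise function-like.
DEGREE_MARKERS = {"more", "less", "as"}
# The clause is promoted only onto adjectives and adverbs, never nouns.
PROMOTION_TARGETS = {"ADJ", "ADV"}


def comparative_clause_head(
    comparative_word: str,
    modified: Optional[Tuple[str, str]] = None,
) -> str:
    """Pick the governor of the advcl for a comparative clause.

    comparative_word: the word carrying the comparison ("more", "harder").
    modified: (form, upos) of the word it modifies, or None if there is none.
    """
    if (
        comparative_word.lower() in DEGREE_MARKERS
        and modified is not None
        and modified[1] in PROMOTION_TARGETS
    ):
        # "more difficult than ...": promote to the content word.
        return modified[0]
    # "harder than ...", "more problems than ...": stay on the word itself.
    return comparative_word


examples = {
    "more difficult than you think": comparative_clause_head("more", ("difficult", "ADJ")),
    "harder than you think": comparative_clause_head("harder"),
    "more rapidly than you think": comparative_clause_head("more", ("rapidly", "ADV")),
    "more problems than you think": comparative_clause_head("more", ("problems", "NOUN")),
}
```

For "a more difficult problem than you think" the call is the same as for the first example, since "more" modifies "difficult" there and not "problem".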
Thanks for the comprehensive documentation (http://universaldependencies.github.io/docs/u/overview/specific-syntax.html#comparatives) and sorry if it turns out that I did not read it attentively enough but I could not find a solution (or analogy) to this:
[1] Home prices have more than doubled in the past decade.
Attaching doubled to more does not seem quite right to me, because doubled contains both the action modified by quantity/degree, and the base quantity for the comparison. A paraphrase easier to analyze would be
[2] Home prices have increased more than twice in the past decade.
where I guess we would want advmod(increased, more) advmod(more, twice) [or advcl???] mark(twice, than)
In order to make the two above examples somewhat parallel, I am inclined to analyze the former as advmod(doubled, more) mark(more, than)
with the assumption that the quantity compared to has been elided (although it actually has been incorporated into the verb).
The actual example from the data that led me to think about this was a bit different:
[3] more than thirty-years-lasting experience (cs: více než třicetileté zkušenosti)
(It was in Czech and thirty-years-lasting is one word, and it is an adjective.) I am not sure whether [3] will have the same solution as [1] though. If we paraphrase it as
[4] older than thirty-years-lasting experience
then we have amod(experience, older/more) advcl(older/more, thirty-years-lasting) mark(thirty-years-lasting, than)
What do people think?
I don't see any guidelines (with examples) on annotating comparative constructions. See http://ufal.mff.cuni.cz/project/depling13/proceedings/pdf/W13-3721.pdf for a possible (pre-UD) solution.