Hope, heat -- share your opinion!

cbonial commented 10 years ago

Ulf, Kevin and I have been having an email discussion about whether or not "hopeful" should be represented as hope-01, and whether or not the adjective "hot" (as in temperature) should be unified with the heat-v, heated-j. Please read and share your opinion.

"Hopeful" Argument 1: the meanings of hope-01 and hopeful are at least half an inch apart.

hopeful: "feeling or inspiring optimism about a future event" (OED)

"I hope that the Democrats will win a majority in both houses of Congress, but I am not hopeful that it will actually happen."

And a constantly worried little old lady who hopes that there are no monsters under the bed and that nobody will break in and kill her is not necessarily a hopeful person.

This means we should probably have a frame hopeful-41 or so since it typically comes with two arguments.

"Hopeful" Argument 2: Despite these pragmatic differences, "hope" and "hopeful" would have essentially the same argument structure -- arg0: hoper / hopeful person; arg1: thing hoped for / thing one is hopeful about. Thus, we should stem to "hope" when we see the adjective "hopeful." This is no greater stretch than stemming "aware" to "realize," as these words have significant pragmatic and aspectual differences (but have very similar arguments/roles).

"Hot" Argument 1: heat-01 and hot don't pass the 1/8-inch test either. If a swimming pool is heated from 33 to 34 degrees F, it's still not hot.

"Hot" Argument 2: Again, the arguments (although implicit for some parts of speech) are the same for "hot" and "heated," (cause of heat, thing heated) despite the fact that the pragmatics of these concepts, like hope/hopeful, can be very different. It's much more important to capture semantic differences like "a heated conversation" or "a hot girl," (which would be distinct rolesets) rather than the difference between these two temperature-related concepts.

Meta-comments on lumping vs. splitting:

for a machine translation vision, we tend to want to split (retain relevant info)
for a machine reading vision, we want to lump (fewer primitives, fewer inference rules)

Please leave a comment on what you think about linking "hopeful" to hope-01 and "hot" to heat-01.

amrisi commented 10 years ago

At a more general level, I hope that we can identify underlying principles on unifying frames.

We currently preserve very fine distinctions between verbs, e.g.

purchase.01 ("to buy")
buy.01 ("to buy")
buy.02 ("to buy up")
buy.03 ("to buy out")

All of the above have the same frame structure and very similar meanings, yet we use different frames.

Even the following have an identical frame structure and a similar meaning (murder entails kill):

murder.01
kill.01

We also currently annotate the following pairs differently:

my father vs. my dad
person vs. boy

But there is a proposal to unify hopeful and hope, apparently because their frame structures are the same, they look kind of similar, and have a somewhat similar meaning (hopeful entails hope).

Does that mean that our decision whether or not to lump depends largely on part of speech (discriminating against adjectives, compared to verbs) and on English spelling (hopeful = hope, but purchase ≠ buy)?

Are we going for some sort of part-of-speech-based and spelling-based semantics?

Heavy lumping might give us a little head start for inferencing, but I am hopeful (and not merely hope) that there are better ways to capture entailment and other similarity relations between concepts such as "murder" and "kill" or "heat" and "hot".

kevincrawfordknight commented 10 years ago

i think these meanings fall within 1/8 inch (or within "0.1 radians", as they say in vector semantics. uh oh, i made a deeply bad joke there).

if our concepts were synsets instead of words (as in a forerunner of AMR), "purchase" and "buy" might indeed have the same AMR, and should. synsets don't merge words with different parts of speech, so where we draw the line is up to us. i like unifying hope and hopeful, heat and hot.

cbonial commented 10 years ago

I think that we should be moving towards an ontology of concepts with AMR, where we could explicitly mark the similarity between "purchase" and "buy," as these concepts are nearly interchangeable in semantics, and to me, differ primarily in register.

I think it's a bit misleading to say that we are unifying concepts based on "English spelling": these are etymologically related concepts that share a great deal of semantics because they share a root. We do, of course, separate out concepts that have different semantics and distinct argument structures even when they share spelling, as the examples of a "heated conversation" and a "hot girl" show. This has always been the practice of PropBank: add new rolesets only when the syntax (argument structure, roles generally) and semantics are distinct. This makes for coarse-grained senses that have been shown to be good for machine learning.

Although we have given a few examples demonstrating that usages of "hope" and "hopeful" can be distinct, I think that these are quite rare, and that in a survey of average sentences either one could easily be replaced by the other. I hope/am hopeful that this makes sense.

nschneid commented 10 years ago

I am hopeful (and not merely hope) that there are better ways to capture entailment and other similarity relations between concepts such as "murder" and "kill" or "heat" and "hot".

As a splitter, I am sympathetic with @amrisi's suggestion, at least in general (I am not sure about "hopeful" in particular). Rather than urge annotators to make potentially uncomfortable leaps between related meanings, why not just make these relations explicit at a type level in the lexicon? That way the finer-grained distinctions will be available should they become important (say, for translation). For a single word, having many senses places a strain on annotators because they have to read all the definitions to choose the best one; but if derivational morphology already distinguishes the senses, and the difference in meaning is clear to annotators, then I see nothing to lose by keeping the distinction in the annotation itself and marking the commonality in the lexicon.

Machine learning algorithms, BTW, really ought to be able to use type-level relatedness information to obtain more robust generalizations, and if evaluation measures account for relatedness then I see no advantage to lumping unless it really leads to cheaper or more reliable annotation.
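
To make the "link, don't merge" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `RELATED` table, the relation labels, the un-numbered concept names `hopeful` and `hot`, and the 0.5 partial-credit value are hypothetical and are not part of PropBank, AMR, or any existing evaluation metric. It keeps the concepts distinct in annotation, records the commonality at the type level, and lets a scorer give partial credit for related concepts.

```python
# Hypothetical type-level links between concepts that stay distinct in annotation.
# The pairs come from examples in this thread; the relation labels are invented.
RELATED = {
    frozenset({"hope-01", "hopeful"}): "derivation",
    frozenset({"heat-01", "hot"}): "change-of-state/attribute",
    frozenset({"murder-01", "kill-01"}): "entailment",
}


def relation(a, b):
    """Return the lexicon-level relation between two concepts, or None."""
    return RELATED.get(frozenset({a, b}))


def match_score(gold, predicted, partial_credit=0.5):
    """Score a predicted concept against the gold concept.

    Exact matches score 1.0, concepts linked in the lexicon score
    `partial_credit` (an arbitrary illustrative value), anything else 0.0.
    """
    if gold == predicted:
        return 1.0
    return partial_credit if relation(gold, predicted) else 0.0
```

With a scorer of this shape, a system that outputs `hot` where the gold annotation has `heat-01` is penalized less than one that outputs an unrelated concept, without the annotation itself having to collapse the two.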

kevincrawfordknight commented 10 years ago

In cases like these, I believe the annotation can be easy & consistent, if these frames are merged in PropBank and we move to them on unification day.

nschneid commented 10 years ago

But for cases where the meanings are related but different, what does merging buy us that we could not get by linking between the frames?

kevincrawfordknight commented 10 years ago

We try to lump when meanings are essentially the same, and when annotators can do it. This reduces the AMR vocabulary. Down the road, if others additionally build axioms or link frames, that will of course be great.

Hoping but not being hopeful is a fine distinction for me... BTW, we didn't talk about "hopefully", but I like how annotators have turned this into an arg0-less "hope-01". They totally get that "hopefully, he will go" doesn't mean ":manner (h / hopeful)"!

nschneid commented 10 years ago

Merging hope/hopeful doesn't bother me quite as much as the idea of merging heat with hot. The difference between an attribute and a change of state seems awfully fundamental and worth preserving. But by that standard, it seems we'd have to start using whiten for white, worsen for bad, heighten for high, embolden for bold, etc. Unless it's possible for those to have separate entries/concept names but share a common roleset.

uhermjakob commented 10 years ago

A challenge for lumpers: a hopeful sign

nschneid commented 10 years ago

JULIET: Yea, noise? then I'll be brief. O happy dagger! This is thy sheath; there rest, and let me die.

kevincrawfordknight commented 10 years ago

(s / sign :cause (h / hope-01))

uhermjakob commented 10 years ago

happy dagger

I can see a touch of this in a "hopeful sign". However, dictionaries such as the Oxford American Dictionary list "hopeful sign" as a first example for "hopeful", implying that it is typical.

(s / sign :cause (h / hope-01))

I don't think that a "hopeful sign" is one that creates a (new) hope, but instead-of-91 one that increases the degree of optimism that a pre-existing hope will be fulfilled.

OED: hope·ful

feeling or inspiring optimism about a future event.
"a hopeful sign"

cbonial commented 10 years ago

In these cases, and in all cases of unification that are worrisome, it is always possible for us to add new rolesets/senses. So although the unified frame file is, by convention, a listing that combines both hope-v and hopeful-j, we can always have unique rolesets that are only associated with one lemma/part of speech or another.

I mention this because I think what we're seeing here is that arguably there are two senses for "hopeful": one that is strongly related to "hope" and even takes the same syntactic types of arguments: "I am hopeful that I will recover from this disease," or "I am hopeful that you like it" (both web examples). In both of these cases, we cannot be certain if the speaker is merely hoping X, or is expressing a degree of optimism/likelihood that X will happen, but the former seems the safer assumption to me. In other cases that Ulf has put forward, it is quite clear from context that the speaker is expressing their degree of optimism/likelihood: "I hope that the Democrats will win a majority in both houses of Congress, but I am not hopeful that it will actually happen."

These two senses are also reflected in WordNet:

S: (adj) hopeful#1 (having or manifesting hope) "a line of people hopeful of obtaining tickets"; "found a hopeful way of attacking the problem"
S: (adj) bright#10, hopeful#2, promising#2 (likely to turn out well in the future) "had a bright future in publishing"; "the scandal threatened an abrupt end to a promising political career"; "a hopeful new singer on Broadway"

If you would like us to add a new roleset for the second sense of "hopeful," we can certainly do that. I feel, and have amply expressed, that the first sense can be covered by the unified hope-v/hopeful roleset. I personally do not think that most people will be able to tell these senses apart in annotation, and it would be a rather fine-grained distinction for normal PropBank roleset practices. Nonetheless, we can add the sense and see if annotators are able to make the distinction in practice.
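
As a rough illustration of how a unified frame file could still carry a roleset tied to only one lemma/part of speech, here is a minimal Python sketch. The `Roleset` class, the glosses, and especially the roleset id `hopeful-02` are hypothetical names made up for this example; this is not the actual PropBank frame file format, and no such roleset has been added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roleset:
    """One sense in a unified frame file (hypothetical structure)."""
    id: str
    gloss: str
    aliases: frozenset  # (lemma, part-of-speech) pairs that may evoke this sense


# A unified hope-v / hopeful-j frame file, sketched as a list of rolesets.
HOPE_FRAME_FILE = [
    # Shared sense: available to both the verb and the adjective.
    Roleset("hope-01", "hope, be hopeful that",
            frozenset({("hope", "v"), ("hopeful", "j")})),
    # Hypothetical extra sense that only the adjective can evoke.
    Roleset("hopeful-02", "likely to turn out well (illustrative only)",
            frozenset({("hopeful", "j")})),
]


def candidate_rolesets(lemma, pos, frame_file=HOPE_FRAME_FILE):
    """List the roleset ids an annotator would choose from for this lemma/POS."""
    return [r.id for r in frame_file if (lemma, pos) in r.aliases]
```

Under these assumptions, `candidate_rolesets("hope", "v")` offers only the shared sense, while `candidate_rolesets("hopeful", "j")` offers both, so the verb's annotators never see the adjective-only sense.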

amrisi / amr-guidelines

Hope, heat -- share your opinion! #118