UniversalConceptualCognitiveAnnotation / docs

UCCA Documentation
https://universalconceptualcognitiveannotation.github.io/

Approximator, distance-AWAY #35

Open nschneid opened 5 years ago

nschneid commented 5 years ago

"John lives about 3 miles away"

John_A lives_S [[about_E 3_E miles_C]_Q away_R]_A ?

omriabnd commented 5 years ago

John_A lives_S [about_E 3_E miles_C west of here]_A


nschneid commented 5 years ago

The original example I encountered actually had "to": something like "They moved to about 3 miles away".

omriabnd commented 5 years ago

Dotan, could you add this to the interesting examples section?

nschneid commented 5 years ago

"quantity modifiers, distances, and directions"

dotdv commented 5 years ago

Sure, added. A small question regarding these two examples:
1) appeared_P [from_R behind_R the_E couch_C]_A
2) cities_C [north_R of_R DC_C]_E
In the 'directions' section we haven't mentioned this option of marking two Rs, only the option of a single UNA R in the case of a multiword preposition. Do we want to add these solutions to the main section? If so, how should we explain the difference between the options? For example, "north of" seems to pass the test we use to define a multiword R: 'north' and 'of' can't be used separately in this phrase. Marking it as two Rs can make sense to me, but I just want to understand whether it was marked differently on purpose.

dotdv commented 5 years ago

A different question regarding all the non-scene examples: In the guidelines we say that "by convention, we place the Rs in non-Scene units as siblings of the Es, Qs and Cs they relate", so for example instead of "cities_C [north_R of_R DC_C]_E" according to the guidelines it should be:
cities_C north_R of_R DC_E
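
To make the difference between the two bracketings concrete, here is a minimal sketch that reads the informal notation used in this thread (word_CATEGORY for leaves, [...]_CATEGORY for units) into nested tuples. The function name parse and the tuple layout are assumptions made for illustration; this is not part of any UCCA tooling, and it only handles leaves that carry an explicit category.

```python
import re

# "[" opens a unit, "]_X" closes a unit of category X, "word_X" is a labeled leaf.
TOKEN = re.compile(r"\[|\]_([A-Z])|([^\s\[\]_]+)_([A-Z])")

def parse(annotation):
    """Return a list of (category, content) pairs; content is a word or a child list."""
    stack = [[]]
    for match in TOKEN.finditer(annotation):
        text = match.group(0)
        if text == "[":
            stack.append([])
        elif text.startswith("]"):
            if len(stack) == 1:
                raise ValueError("closing bracket without an opening bracket")
            children = stack.pop()
            stack[-1].append((match.group(1), children))
        else:
            stack[-1].append((match.group(3), match.group(2)))
    if len(stack) != 1:
        raise ValueError("opening bracket without a closing bracket")
    return stack[0]

nested = parse("cities_C [north_R of_R DC_C]_E")   # Rs inside the Elaborator unit
flat = parse("cities_C north_R of_R DC_E")         # Rs as siblings of C and E
```

In the nested reading the two Rs are children of the E unit; in the flat reading they sit at the same level as "cities" and "DC".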

omriabnd commented 5 years ago

Good point.


nschneid commented 5 years ago

by convention, we place the Rs in non-Scene units as siblings of the Es, Qs and Cs they relate

Should there be an exception for coordinated modifiers?

"cities north of DC and south of Boston": cities_C [[north_R of_R DC_C]_C and_N [south_R of_R Boston_C]_C]_E ?

Regarding multiword prepositions, it is clear that some are fixed expressions and therefore UNA (e.g. "in front of", "next to"). By contrast, "from" can be stacked with a locative preposition in a motion scene: "the cat emerged from behind/next to/under/... the couch". I could go either way on cardinal direction + OF, e.g. "north of".

nschneid commented 5 years ago

And what if there are multiple adnominal PPs? "the party on Saturday from 8 to midnight"

omriabnd commented 5 years ago

Why is that an issue? Isn't that just multiple As?


nschneid commented 5 years ago

Oops, I forgot "party" was scene-evoking. Let's make it: "the house on Main Street between the school and the church"

omriabnd commented 5 years ago

I think it should be two Es.


nschneid commented 5 years ago

You mean: the_E house_C on_R [Main Street]_E between_R [the_E school_C]_E and_N [the_E church_C]_E?

This leaves it unclear which R's go with which E's, which could be a problem for supersense alignment.

omriabnd commented 5 years ago

I mean: the_E house_C on_R [Main Street]_E between_R [[the_E school_C]_C and_N [the_E church_C]_C]_E


nschneid commented 5 years ago

So "on" and "between" are at the same level even though they pertain to different elaborators. Is that going to cause problems?

omriabnd commented 5 years ago

We can use a syntactic parser to do the PSS alignment.


nschneid commented 5 years ago

But from a theoretical perspective, why associate the R with the "case-marked" expression under a scene, but not in a non-scene unit?

omriabnd commented 5 years ago

I think it's because then it becomes confusing when there's more than one C, or when the C is second, such as in "[two barrels of]_Q smoked_E herring_C". If we were to put "of" with its prepositional object, we would end up with "[two barrels]_Q of_{C-} smoked_E herring_{-C}", which is confusing.


nschneid commented 5 years ago

Oh, and not [two barrels]_Q [of_R smoked_E herring_C]_C because that would be a C within a C?

I wonder if the policy should be limited to "of" (and equivalent genitive/quantity markers in other languages). It feels weird for "the party on the street" and "the house on the street" to be so structurally different.

nschneid commented 5 years ago

To put it another way: I don't think you'd ever have a preposition other than "of" followed by a C. So maybe the policy should be limited to C-relators as opposed to E/A/etc.-relators.

jakpra commented 5 years ago

For Quantities, I see why we don't necessarily want to group the R with the center. I think it should be "[two barrels]_Q of_R smoked_E herring_C" right?

But if the pobj is an Elaborator, I agree with Nathan that it would be more intuitive and consistent and less ambiguous if it was the_E house_C [on_R Main Street]_E [between_R [the_E school_C]_C and_N [the_E church_C]_C]_E

omriabnd commented 5 years ago

What about "two types of smoked herring"? where "types" is not a C, and maybe the next language would have some other rationale. My point is that where Rs should go is just a matter of convention. We chose the simplest option for annotators, but if you don't like it, we can easily post-process the corpus to put the Rs wherever we want with close to 100% accuracy.
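
As a rough illustration of what such post-processing could look like, the sketch below takes the flat children of a non-Scene unit and wraps each run of sibling Rs together with the E that immediately follows. Everything here is an assumption for illustration: the (category, content) tuple layout, the function name nest_relators, and the choice to relabel a leaf E as a C inside the new E unit. It is a purely positional heuristic for English-like word order, not the project's actual tooling.

```python
def nest_relators(children):
    """Group each run of sibling Rs with the Elaborator that directly follows it."""
    result, pending = [], []
    for category, content in children:
        if category == "R":
            pending.append((category, content))
            continue
        if category == "E" and pending:
            # A leaf E becomes the Center of the new unit; an E that is
            # already a unit simply receives the Rs as its first children.
            inner = content if isinstance(content, list) else [("C", content)]
            result.append(("E", pending + inner))
        else:
            result.extend(pending)  # Rs not followed by an E stay siblings
            result.append((category, content))
        pending = []
    result.extend(pending)
    return result

school = ("C", [("E", "the"), ("C", "school")])
church = ("C", [("E", "the"), ("C", "church")])
flat_house = [
    ("E", "the"), ("C", "house"),
    ("R", "on"), ("E", "Main Street"),
    ("R", "between"), ("E", [school, ("N", "and"), church]),
]
nested_house = nest_relators(flat_house)
```

A positional rule like this would mis-attach "of" in "two types of smoked herring", where the next E is "smoked", so a realistic version would need attachment information from a syntactic parse.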


nschneid commented 5 years ago

What about "two types of smoked herring"? where "types" is not a C

But "herring" is a C, right? The policy would be that an R that marks a C (syntactically, "of" applies to "herring") should not form a nested unit, but R's marking E's, A's, etc. should.

I suspect this policy would actually be easier for annotators to remember.
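
A minimal sketch of the proposed policy, assuming we already know which sibling each R marks (for example from a syntactic parse or a manual decision). The names apply_policy and marks, and the (category, content) tuple layout, are hypothetical and chosen only for this example; the policy itself is still under discussion in this thread.

```python
def apply_policy(children, marks):
    """Nest an R with the unit it marks unless that unit is a Center.

    children: list of (category, content) siblings in a non-Scene unit.
    marks: dict mapping the index of each R to the index of the unit it marks.
    """
    absorbed = {}  # index of marked unit -> indices of the Rs it takes in
    for r_index, target in marks.items():
        if children[target][0] != "C":
            absorbed.setdefault(target, []).append(r_index)
    moved = {r for rs in absorbed.values() for r in rs}

    result = []
    for i, (category, content) in enumerate(children):
        if i in moved:
            continue  # this R is re-emitted inside the unit it marks
        if i in absorbed:
            relators = [children[r] for r in sorted(absorbed[i])]
            inner = content if isinstance(content, list) else [("C", content)]
            result.append((category, relators + inner))
        else:
            result.append((category, content))
    return result

# "of" marks the Center "herring", so it stays a sibling.
herring = [("Q", "two barrels"), ("R", "of"), ("E", "smoked"), ("C", "herring")]
herring_out = apply_policy(herring, {1: 3})

# "on" marks the Elaborator "Main Street", so the two form a nested unit.
house = [("E", "the"), ("C", "house"), ("R", "on"), ("E", "Main Street")]
house_out = apply_policy(house, {2: 3})
```

Under these assumptions the herring example is returned unchanged, while the house example becomes the_E house_C [on_R Main Street]_E, with "Main Street" relabeled as the Center of the new unit.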

dotdv commented 5 years ago

I went over the list of issues Jakob sent and it looks like the main open issue is how to deal with Rs. Apart from non-scene units, I think it will also help if we clarify how Rs should be marked when they pertain to a P/S (at the moment it's not mentioned at all in the guidelines).