amrisi / amr-guidelines


RED+AMR annotation discussion on whether to annotate number #180

Closed timjogorman closed 8 years ago

timjogorman commented 8 years ago

The Argument for encoding number

Sorry for the deluge of posts here. Here's a set of arguments for why we want to be annotating some kind of number or quantification in the multi-sentence AMR pass.

General claim:

- "Entity Coreference" as a single kind of chain. This is the same thing we did in the last pass, but you have three different kinds of "mentions".

X / entity
    :instance-of "gang"
    :consist-of (X2 / entity
        :instance-of thug
        :instance-of person
        :instance-of yob
        :quant 4)

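
For readers who prefer code, the structure above could be sketched roughly as a nested record. This is only an illustration under stated assumptions: the `Entity` class and its field names are hypothetical and are not part of any AMR or RED tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """Hypothetical stand-in for one entity node in the multi-sentence pass."""
    var: str
    instance_of: List[str] = field(default_factory=list)  # mentions in the chain
    quant: Optional[int] = None              # number, when the text supplies one
    consist_of: Optional["Entity"] = None    # members of a group-like entity


# One "gang" entity whose members are mentioned as thug / person / yob, 4 in total.
members = Entity("X2", instance_of=["thug", "person", "yob"], quant=4)
gang = Entity("X", instance_of=["gang"], consist_of=members)

print(gang.consist_of.quant)
```

In this sketch the count lives on the member entity rather than on the gang itself, mirroring where `:quant` sits in the structure above.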
nschneid commented 8 years ago

I think this is an important discussion to have; thanks for getting it started!

One case that comes to mind is a sentence in a paper draft I saw recently: "This table shows the performances of the systems." I understand why "performances" was pluralized here (each system has a different score), but "performance"—aggregating over all systems—sounds more natural to me (I guess I prefer it to be a mass noun). Would annotators have to decide whether "performance(s)" is unique or multiple? Or would that not count as a discourse referent?

nschneid commented 8 years ago

Another question: As I understand it, Unique as described above would include collective nouns (like "gang" in "gang of thugs") as well as non-collective individual referents (like "president"). Should these be separated out? It might be more natural to think of two dimensions:

| | unique | multiple |
|---|---|---|
| individual | president | gang(s) of thugs |
| collective | gang of thugs | gangs of thugs |

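
As a rough sketch of what the two-dimensional version might look like as labels, and of what is lost if only unique/multiple is kept: the names `Number`, `Kind`, and `Referent` below are hypothetical and chosen only for illustration, and the entries simply restate the four cells of the table above.

```python
from enum import Enum
from typing import NamedTuple


class Number(Enum):
    UNIQUE = "unique"
    MULTIPLE = "multiple"


class Kind(Enum):
    INDIVIDUAL = "individual"
    COLLECTIVE = "collective"


class Referent(NamedTuple):
    text: str
    number: Number
    kind: Kind


# One entry per cell of the table above.
cells = [
    Referent("president", Number.UNIQUE, Kind.INDIVIDUAL),
    Referent("gang(s) of thugs", Number.MULTIPLE, Kind.INDIVIDUAL),
    Referent("gang of thugs", Number.UNIQUE, Kind.COLLECTIVE),
    Referent("gangs of thugs", Number.MULTIPLE, Kind.COLLECTIVE),
]

# Keeping only the unique/multiple dimension groups "president"
# together with the collective "gang of thugs".
unique_only = sorted(c.text for c in cells if c.number is Number.UNIQUE)
print(unique_only)
```

With a single dimension, the individual and collective readings fall into the same bucket, which is the separation being asked about here.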
timjogorman commented 8 years ago

> "This table shows the performances of the systems."

I'd tend to want "performance" to be an event, which will require different distinctions. I'll try to get that Events pass up this week.

> Should these be separated out? It might be more natural to think of two dimensions:

My own instinct is that having them together gives us a slightly more robust approach (we don't need to classify things into "collective nouns" or anything like that), but it might be worth discussing on the call.

nschneid commented 8 years ago

> My own instinct is that having them together gives us a slightly more robust approach (we don't need to classify things into "collective nouns" or anything like that), but it might be worth discussing on the call.

[Sorry I couldn't join the call (was on an airplane).]

What would be the treatment of "gangs of thugs" if only the unique/multiple distinction is available? Wouldn't 'gangs' and 'thugs' both be multiple? Or, more precisely: multiple gangs, each consisting of multiple thugs?