Closed by timjogorman 8 years ago
I think this is an important discussion to have; thanks for getting it started!
One case that comes to mind is a sentence in a paper draft I saw recently: "This table shows the performances of the systems." I understand why "performances" was pluralized here (each system has a different score), but "performance"—aggregating over all systems—sounds more natural to me (I guess I prefer it to be a mass noun). Would annotators have to decide whether "performance(s)" is unique or multiple? Or would that not count as a discourse referent?
Another question: As I understand it, Unique as described above would include collective nouns (like "gang" in "gang of thugs") as well as non-collective individual referents (like "president"). Should these be separated out? It might be more natural to think of two dimensions:
| | unique | multiple |
|---|---|---|
| individual | president | gang(s) of thugs |
| collective | gang of thugs | gangs of thugs |
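To make the two-dimensional idea concrete, here is a minimal sketch that treats the two dimensions as independent features on a mention. Everything in it is an assumption for illustration: the names `Cardinality`, `Composition`, `Mention`, and `unique_multiple_only` are hypothetical and not part of any AMR tooling, and the cell assignments are copied directly from the table.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    UNIQUE = "unique"
    MULTIPLE = "multiple"


class Composition(Enum):
    INDIVIDUAL = "individual"
    COLLECTIVE = "collective"


@dataclass(frozen=True)
class Mention:
    """Hypothetical record: a mention text plus the two proposed features."""
    text: str
    cardinality: Cardinality
    composition: Composition


# The four cells of the table, copied as given.
TABLE_CELLS = [
    Mention("president", Cardinality.UNIQUE, Composition.INDIVIDUAL),
    Mention("gang(s) of thugs", Cardinality.MULTIPLE, Composition.INDIVIDUAL),
    Mention("gang of thugs", Cardinality.UNIQUE, Composition.COLLECTIVE),
    Mention("gangs of thugs", Cardinality.MULTIPLE, Composition.COLLECTIVE),
]


def unique_multiple_only(mention: Mention) -> str:
    """Label a mention when only the unique/multiple dimension is available."""
    return mention.cardinality.value


labels = {m.text: unique_multiple_only(m) for m in TABLE_CELLS}
for text, label in labels.items():
    print(f"{text}: {label}")
```

With only the unique/multiple dimension, "president" and "gang of thugs" receive the same label, which is the merging the question above is asking about.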
"This table shows the performances of the systems."
I'd tend to want "performance" to be an event, which will require different distinctions. I'll try to get that Events pass up this week.
Should these be separated out? It might be more natural to think of two dimensions:
My own instinct is that keeping them together gives us a slightly more robust approach (we don't need to classify things as "collective nouns" or anything like that), but it might be worth discussing on the call.
[Sorry I couldn't join the call (was on an airplane).]
What would be the treatment of "gangs of thugs" if only the unique/multiple distinction is available? Wouldn't 'gangs' and 'thugs' both be multiple? Or, more precisely: multiple gangs, each consisting of multiple thugs?
The Argument for encoding number
Sorry for the deluge of posts here. Here's a set of arguments for why we want to be annotating some kind of number or quantification in the multi-sentence AMR pass.
General claim:
Implementation proposal
- "Entity Coreference" as a single kind of chain. This is the same thing we did in the last pass, but with three different kinds of "mentions".
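As a rough picture of what "one kind of chain, three kinds of mentions" could look like as data, here is a minimal sketch. It is not the actual annotation format: `MentionKind`, `Mention`, `EntityCoreferenceChain`, and the variable ids are hypothetical, and the sketch deliberately encodes no semantics for the three labels beyond their names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class MentionKind(Enum):
    # The three label names come from the proposal; their meaning is not modeled here.
    UNIQUE = "Unique"
    MULTIPLE = "Multiple"
    FULL_DEFINITION = "FullDefinition"


@dataclass
class Mention:
    variable: str       # hypothetical id for an AMR variable, e.g. "s1.g"
    kind: MentionKind


@dataclass
class EntityCoreferenceChain:
    """A single chain type whose members each carry one of the three kinds."""
    mentions: List[Mention] = field(default_factory=list)

    def add(self, variable: str, kind: MentionKind) -> None:
        self.mentions.append(Mention(variable, kind))

    def by_kind(self, kind: MentionKind) -> List[str]:
        return [m.variable for m in self.mentions if m.kind is kind]


# Placeholder chain: the ids and kind assignments are arbitrary examples.
chain = EntityCoreferenceChain()
chain.add("s1.g", MentionKind.FULL_DEFINITION)
chain.add("s2.t", MentionKind.MULTIPLE)
chain.add("s3.h", MentionKind.UNIQUE)
print([m.kind.value for m in chain.mentions])
```

The point of the design is that the chain itself stays a single type, as in the last pass, and the number distinction lives on each mention.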
What would "Unique" mean?
What would "Multiple" mean?
What would "FullDefinition" mean?
Revisiting the coreference chain we've discussed.