UniversalDependencies / UD_English-GUMReddit

Plural nouns not using singular lemma, possible pluralia tantum #15

Closed rhdunn closed 11 months ago

rhdunn commented 11 months ago

These are instances of common nouns (NNS) and proper nouns (NNPS) marked as plural (Number=Plur) where the lemma is the plural form. Each of these should, on a case-by-case basis, either:

  1. use the singular lemma form;
  2. use Number=Ptan to mark them as plurale tantum -- see also https://github.com/UniversalDependencies/UD_English-EWT/issues/374.

ERROR: Sentence GUM_reddit_macroeconomics-4 token 10 -- NNS/Number=Plur lemma 'thanks' does not match plural-common-noun applied to form 'thanks', expected 'thank'
ERROR: Sentence GUM_reddit_pandas-34 token 3 -- NNS/Number=Plur lemma 'species' does not match lemma-exception applied to form 'species', expected 'specie'
ERROR: Sentence GUM_reddit_pandas-40 token 12 -- NNS/Number=Plur lemma 'species' does not match lemma-exception applied to form 'species', expected 'specie'
ERROR: Sentence GUM_reddit_pandas-46 token 2 -- NNS/Number=Plur lemma 'species' does not match lemma-exception applied to form 'species', expected 'specie'
ERROR: Sentence GUM_reddit_bobby-49 token 4 -- NNS/Number=Plur lemma 'competitor' does not match plural-common-noun applied to form 'competetors', expected 'competetor'
ERROR: Sentence GUM_reddit_conspiracy-7 token 2 -- NNS/Number=Plur lemma 'thanks' does not match plural-common-noun applied to form 'Thanks', expected 'thank'
ERROR: Sentence GUM_reddit_conspiracy-26 token 22 -- NNS/Number=Plur lemma 'means' does not match plural-common-noun applied to form 'means', expected 'mean'
ERROR: Sentence GUM_reddit_conspiracy-55 token 9 -- NNS/Number=Plur lemma 'dinosaur' does not match plural-common-noun applied to form 'dinasaurs', expected 'dinasaur'
ERROR: Sentence GUM_reddit_racial-9 token 42 -- NNS/Number=Plur lemma '1920s' does not match plural-common-noun applied to form '1920s', expected '1920'
ERROR: Sentence GUM_reddit_racial-9 token 58 -- NNS/Number=Plur lemma '1960s' does not match plural-common-noun applied to form '1960s', expected '1960'
ERROR: Sentence GUM_reddit_racial-13 token 38 -- NNS/Number=Plur lemma '1960s' does not match plural-common-noun applied to form '1960s', expected '1960'
ERROR: Sentence GUM_reddit_racial-15 token 6 -- NNS/Number=Plur lemma 'American' does not match plural-common-noun applied to form 'Americans', expected 'american'
ERROR: Sentence GUM_reddit_social-3 token 44 -- NNS/Number=Plur lemma 'barbecue' does not match plural-common-noun applied to form 'BBQs', expected 'bbq'
ERROR: Sentence GUM_reddit_social-24 token 8 -- NNS/Number=Plur lemma 'barbecue' does not match plural-common-noun applied to form 'BBQs', expected 'bbq'
ERROR: Sentence GUM_reddit_social-33 token 2 -- NNS/Number=Plur lemma 'thanks' does not match plural-common-noun applied to form 'Thanks', expected 'thank'
ERROR: Sentence GUM_reddit_stroke-3 token 7 -- NNS/Number=Plur lemma 'surroundings' does not match plural-common-noun applied to form 'surroundings', expected 'surrounding'
ERROR: Sentence GUM_reddit_stroke-6 token 39 -- NNS/Number=Plur lemma 'surroundings' does not match plural-common-noun applied to form 'surroundings', expected 'surrounding'
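
A minimal sketch of how candidates like those above could be collected from a CoNLL-U file, assuming the standard ten tab-separated columns. The function names and the sample sentence are hypothetical and are not taken from the checker that produced this log. Unlike that checker, the sketch only catches tokens whose lemma is identical to the form (e.g. 'thanks', 'species'), not misspellings such as 'competetors'.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Number=Plur' into a dict."""
    if feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def plural_lemma_candidates(conllu_text):
    """Yield (sent_id, token_id, form, lemma) for plural nouns whose lemma equals the form."""
    sent_id = None
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id = "):
            sent_id = line[len("# sent_id = "):].strip()
            continue
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tok_id, form, lemma, _upos, xpos, feats = cols[:6]
        if xpos not in ("NNS", "NNPS"):
            continue
        if parse_feats(feats).get("Number") != "Plur":
            continue
        # Compare case-insensitively so that 'Thanks'/'thanks' is caught.
        if lemma.lower() == form.lower():
            yield sent_id, int(tok_id), form, lemma


def _row(*cols):
    """Join column values with tabs (keeps the sample readable)."""
    return "\t".join(cols)


# Made-up sample sentence, for illustration only.
SAMPLE = "\n".join([
    "# sent_id = example-1",
    _row("1", "Thanks", "thanks", "NOUN", "NNS", "Number=Plur", "0", "root", "_", "_"),
    _row("2", "for", "for", "ADP", "IN", "_", "4", "case", "_", "_"),
    _row("3", "the", "the", "DET", "DT", "Definite=Def|PronType=Art", "4", "det", "_", "_"),
    _row("4", "pandas", "panda", "NOUN", "NNS", "Number=Plur", "1", "nmod", "_", "_"),
])

candidates = list(plural_lemma_candidates(SAMPLE))
# candidates == [("example-1", 1, "Thanks", "thanks")]
```

Each candidate still needs the case-by-case decision described above: a singular lemma or Number=Ptan.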

The following is semantically plural but syntactically singular, i.e. the plurality is only derivable from context.

ERROR: Sentence GUM_reddit_polygraph-14 token 8 -- NNS/Number=Plur lemma 'DDD' does not match plural-common-noun applied to form 'DDD', expected 'ddd'

amir-zeldes commented 11 months ago

For the first part, let's discuss in the Ptan issue; that should resolve these. For the second, I don't know what's syntactically singular about that: it's an acronym headed by a plural, so it's plural, no? There doesn't need to be an overt -s if the acronym is an initialism.

nschneid commented 11 months ago

I can't see the "DDD" sentence but I would expect number on an acronym to be defined by agreement with the acronym, which in practice is probably determined by how the head of the acronym would be expanded (taking into account any suffix that may or may not be explicit). So singular for "the POS is..." and plural for "the POS are" (I don't say this but I know people who do) as well as "the POSes are".

rhdunn commented 11 months ago

The "DDD" sentence is equivalent to something like "These Automated Teller Machines (ATM) work with other banks." Given that, I see why it is plural in this instance. My lemma checker is indicating it needs Abbr=Yes, though.

amir-zeldes commented 11 months ago

Sure, will add Abbr
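
For anyone making the same kind of fix, here is a minimal sketch of setting a feature in a CoNLL-U FEATS value. It assumes the usual UD conventions: features separated by `|`, sorted alphabetically by name without regard to case, and `_` for an empty value. The helper name `add_feature` is hypothetical and is not part of any UD tool.

```python
def add_feature(feats, name, value):
    """Return a CoNLL-U FEATS string with name=value set, sorted by feature name."""
    if feats == "_":
        pairs = {}
    else:
        pairs = dict(item.split("=", 1) for item in feats.split("|"))
    pairs[name] = value  # overwrites an existing value for the same feature
    ordered = sorted(pairs.items(), key=lambda kv: kv[0].lower())
    return "|".join(f"{k}={v}" for k, v in ordered)


# Marking a plural initialism as an abbreviation:
abbr_feats = add_feature("Number=Plur", "Abbr", "Yes")    # "Abbr=Yes|Number=Plur"
# Switching a token to plurale tantum:
ptan_feats = add_feature("Number=Plur", "Number", "Ptan")  # "Number=Ptan"
```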