UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Incorrect lemmas #484

Closed rhdunn closed 6 months ago

rhdunn commented 7 months ago

incorrect verb morphology

ERROR: Sentence weblog-juancole.com_juancole_20030911085700_ENG_20030911_085700-0022 token 6 -- VBN lemma 'defange' does not match past-participle-verb applied to form 'defanged', expected 'defang'
ERROR: Sentence weblog-blogspot.com_thelameduck_20041119192207_ENG_20041119_192207-0011 token 26 -- VBP lemma 'poss' does not match lowercase-form applied to form 'posses', expected 'posses'
ERROR: Sentence email-enronsent09_01-0015 token 6 -- VBG lemma 'xferr' does not match present-verb applied to form 'xferring', expected 'xfer'
ERROR: Sentence email-enronsent09_01-0071 token 6 -- VBG lemma 'xferr' does not match present-verb applied to form 'xferring', expected 'xfer'
ERROR: Sentence newsgroup-groups.google.com_IndiaNewsWindow_1405662573fa84ed_ENG_20050831_110400-0021 token 1 -- VBG lemma 'blogg' does not match present-verb applied to form 'Blogging', expected 'blog'
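
The `past-participle-verb` and `present-verb` checks above seem to strip a regular suffix and undo consonant doubling. A minimal sketch of that idea follows. It is not the actual checker: `candidate_lemmas`, `pick_lemma`, and the `KNOWN` vocabulary are hypothetical names introduced here for illustration.

```python
VOWELS = "aeiou"

def candidate_lemmas(form, suffix):
    """Return possible lemmas for a regular verb form ending in `suffix`."""
    word = form.lower()
    if not word.endswith(suffix):
        return []
    stem = word[: -len(suffix)]
    candidates = []
    # Undo consonant doubling: "blogg" -> "blog", "xferr" -> "xfer".
    if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in VOWELS:
        candidates.append(stem[:-1])
    candidates.append(stem)        # "defang" from "defanged"
    candidates.append(stem + "e")  # "bake" from "baked" or "baking"
    return candidates

def pick_lemma(form, suffix, known):
    """Return the first candidate found in `known`, or None if none is."""
    for candidate in candidate_lemmas(form, suffix):
        if candidate in known:
            return candidate
    return None

# Hypothetical vocabulary; a real checker would need a proper lexicon.
KNOWN = {"defang", "xfer", "blog", "bake"}
```

With this vocabulary, `pick_lemma("xferring", "ing", KNOWN)` returns `"xfer"` rather than `"xferr"`. Stripping alone cannot tell "blogging" (doubled consonant) from "falling" (double consonant in the stem), which is why the sketch ranks candidates against a word list instead of returning a single answer.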

less -> little

ERROR: Sentence weblog-blogspot.com_alaindewitt_20060827093500_ENG_20060827_093500-0010 token 29 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-juancole.com_juancole_20030911085700_ENG_20030911_085700-0021 token 10 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-juancole.com_juancole_20030911085700_ENG_20030911_085700-0033 token 15 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-juancole.com_juancole_20040604210986_ENG_20040604_210986-0010 token 16 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-blogspot.com_rigorousintuition_20050518101500_ENG_20050518_101500-0005 token 6 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-blogspot.com_dakbangla_20050210141134_ENG_20050210_141134-0060 token 27 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-blogspot.com_rigorousintuition_20060511134300_ENG_20060511_134300-0158 token 10 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-blogspot.com_alaindewitt_20060924104100_ENG_20060924_104100-0051 token 6 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence weblog-blogspot.com_dakbangla_20050311135387_ENG_20050311_135387-0071 token 16 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence email-enronsent24_01-0059 token 3 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence newsgroup-groups.google.com_alt.animals_0e65f540816d780c_ENG_20041116_124800-0057 token 22 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence newsgroup-groups.google.com_humanities.lit.authors.shakespeare_0018a7697318f71f_ENG_20031006_163200-0027 token 5 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence newsgroup-groups.google.com_humanities.lit.authors.shakespeare_0018a7697318f71f_ENG_20031006_163200-0083 token 5 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence newsgroup-groups.google.com_alt.animals_0084bdc731bfc8d8_ENG_20040905_212000-0184 token 43 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111106153454AAgT9Df_ans-0011 token 5 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111108105400AASqPIh_ans-0013 token 16 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111108110044AA4rs9f_ans-0010 token 10 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111108111312AAq4ETn_ans-0023 token 14 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111108110329AAxl1pb_ans-0030 token 12 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111108110329AAxl1pb_ans-0036 token 11 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence answers-20111108065616AAKtL2c_ans-0069 token 25 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence reviews-333672-0006 token 9 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'
ERROR: Sentence reviews-360937-0005 token 48 -- RBR lemma 'less' does not match lemma-exception applied to form 'less', expected 'little'

least -> little

ERROR: Sentence reviews-357217-0005 token 3 -- RBS lemma 'least' does not match lemma-exception applied to form 'least', expected 'little'
ERROR: Sentence weblog-blogspot.com_dakbangla_20041028153019_ENG_20041028_153019-0027 token 18 -- RBS lemma 'least' does not match lemma-exception applied to form 'least', expected 'little'
ERROR: Sentence weblog-blogspot.com_marketview_20040611132900_ENG_20040611_132900-0008 token 14 -- RBS lemma 'least' does not match lemma-exception applied to form 'least', expected 'little'
ERROR: Sentence reviews-061768-0003 token 18 -- RBS lemma 'least' does not match lemma-exception applied to form 'least', expected 'little'

AngledLuffa commented 7 months ago

Where do we stand on lemmatizing "less" and "least" to "little"? It sounded like there wasn't much interest in making that change.

rueter commented 7 months ago

Maybe, to avoid the little--less vs. little--littler dilemma, you should remind us that it is the ADV "little" that is the lemma of the ADV "less" and the ADV "least", not the ADJ "little". ;)

rhdunn commented 7 months ago

Does that apply to other adverbial adjectives as well (worst, farthest, furthest, best), or is "little" the only exception to lemmatizing the comparative/superlative degree to the positive degree? I ask so that I can update my lemma checker to determine the correct lemma.

nschneid commented 7 months ago

I see that "farther" is JJR or RBR and has "far" as its lemma. It's just that the relationship of "little" to "less"/"least" is a bit iffy or perhaps context-dependent, so GUM and EWT are keeping "less" and "least" as lemmas despite their comparative/superlative status.
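
For a lemma checker, the convention described here could be captured as a small exception table keyed on form and XPOS. This is only a sketch under that reading: `DEGREE_LEMMAS` and `expected_degree_lemma` are hypothetical names, and the table lists only the forms mentioned in this thread.

```python
# (lowercased form, XPOS) -> expected lemma for irregular degree forms.
# Only the cases discussed in this thread are listed.
DEGREE_LEMMAS = {
    ("farther", "JJR"): "far",   # lemmatized to the positive degree
    ("farther", "RBR"): "far",
    ("less", "RBR"): "less",     # kept as its own lemma in GUM and EWT
    ("least", "RBS"): "least",   # kept as its own lemma in GUM and EWT
}

def expected_degree_lemma(form, xpos):
    """Return the expected lemma, or None if the form is not in the table."""
    return DEGREE_LEMMAS.get((form.lower(), xpos))
```

Returning `None` for unlisted forms lets the caller fall back to whatever regular comparative/superlative rule it already applies.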

rhdunn commented 7 months ago

The CorrectForm for xferring should be transferring, not transfer.
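
For reference, CorrectForm lives in the MISC column (the tenth field) of a CoNLL-U token line, so a checker can read it as sketched below. `misc_attributes` is a hypothetical helper, and `SAMPLE` is an illustrative line written for this example, not copied from the treebank; fields not discussed here are left as `_`.

```python
def misc_attributes(conllu_line):
    """Parse the MISC column of a CoNLL-U token line into a dict."""
    fields = conllu_line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError("expected 10 tab-separated CoNLL-U fields")
    misc = fields[9]
    if misc == "_":  # "_" marks an empty column
        return {}
    attrs = {}
    for item in misc.split("|"):
        key, _, value = item.partition("=")
        attrs[key] = value
    return attrs

# Illustrative token line only (not taken from EWT).
SAMPLE = "6\txferring\txfer\tVERB\tVBG\t_\t_\t_\t_\tCorrectForm=transferring"
```

Here `misc_attributes(SAMPLE).get("CorrectForm")` gives `"transferring"`, the value proposed above.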

AngledLuffa commented 7 months ago

Thanks, I just updated that.