UniversalDependencies / UD_English-GUM

Mismatch in entity type #45

Open martinpopel opened 2 years ago

martinpopel commented 2 years ago

I thought each entity should have the same etype in all its mentions. However, when loading the newest data from the dev branch into Udapi (udapy corefud.Load < en_gum-ud-train.conllu), I get the following warnings:

load_coref_from_misc - etype mismatch in <GUM_conversation_blacksmithing-33#19, lab>: abstract != event
load_coref_from_misc - etype mismatch in <GUM_conversation_blacksmithing-34#8, it>: abstract != event
load_coref_from_misc - etype mismatch in <GUM_conversation_christmas-11#2, they>: person != object
load_coref_from_misc - etype mismatch in <GUM_conversation_erasmus-69#11, his>: abstract != event
load_coref_from_misc - etype mismatch in <GUM_interview_cocktail-25#3, the>: event != abstract
load_coref_from_misc - etype mismatch in <GUM_news_warming-9#19, climate>: abstract != event
load_coref_from_misc - etype mismatch in <GUM_news_warming-13#41, climate>: abstract != event
load_coref_from_misc - etype mismatch in <GUM_news_warming-20#1, Climate>: abstract != event
load_coref_from_misc - etype mismatch in <GUM_voyage_lodz-4#16, the>: organization != abstract
load_coref_from_misc - etype mismatch in <GUM_voyage_lodz-14#30, the>: organization != abstract
load_coref_from_misc - etype mismatch in <GUM_voyage_lodz-32#37, its>: organization != abstract
load_coref_from_misc - etype mismatch in <GUM_whow_packing-33#23, there>: object != place
load_coref_from_misc - etype mismatch in <GUM_whow_packing-46#13, your>: object != place
load_coref_from_misc - etype mismatch in <GUM_whow_packing-48#6, the>: object != place
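
For triage, the warnings above can be grouped by document and by the pair of conflicting types. The following is a minimal sketch that assumes only the line format shown in the log. The regular expression, the group names, and the `summarize` helper are illustrative and are not part of Udapi. Because the log does not say which side of `!=` belongs to the mention and which to the entity, each pair is treated as unordered.

```python
import re
from collections import Counter

# Matches the warning lines quoted above; group names are illustrative.
WARNING_RE = re.compile(
    r"etype mismatch in <(?P<sent>[^#\s]+)#(?P<tok>\d+), (?P<form>[^>]+)>: "
    r"(?P<left>\w+) != (?P<right>\w+)"
)

def summarize(lines):
    """Count warnings per document and per unordered pair of etypes."""
    by_doc, by_pair = Counter(), Counter()
    for line in lines:
        m = WARNING_RE.search(line)
        if m is None:
            continue  # skip anything that is not an etype warning
        # "GUM_news_warming-9" -> "GUM_news_warming" (drop the sentence number)
        doc = m["sent"].rsplit("-", 1)[0]
        by_doc[doc] += 1
        by_pair[frozenset((m["left"], m["right"]))] += 1
    return by_doc, by_pair

# A few lines copied from the log above.
sample = [
    "load_coref_from_misc - etype mismatch in <GUM_news_warming-9#19, climate>: abstract != event",
    "load_coref_from_misc - etype mismatch in <GUM_news_warming-13#41, climate>: abstract != event",
    "load_coref_from_misc - etype mismatch in <GUM_interview_cocktail-25#3, the>: event != abstract",
    "load_coref_from_misc - etype mismatch in <GUM_whow_packing-33#23, there>: object != place",
]
by_doc, by_pair = summarize(sample)
for doc, count in sorted(by_doc.items()):
    print(doc, count)
```

Run over the full log, a summary like this makes it easier to see that the warnings cluster in a handful of documents and that most of them involve the same pair of types.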

amir-zeldes commented 2 years ago

Thanks for reporting! That's odd... The GUM build bot validator should have caught these. @yilunzhu - can you take a break from the tokenizer module and debug how these cases got past the validator? The warning should have been triggered during building here:

https://github.com/amir-zeldes/gum/blob/dev/_build/utils/validate.py#L614
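
For reference while debugging, the invariant in question (every mention of an entity carries the same etype) can be checked along these lines. This is only a sketch of the idea, not the code in the linked `validate.py`. The `find_etype_mismatches` function and the `(entity_id, etype)` input pairs are hypothetical, and extracting those pairs from the actual annotations is left out.

```python
from collections import defaultdict

def find_etype_mismatches(mentions):
    """Return {entity_id: sorted etypes} for entities whose mentions disagree.

    `mentions` is an iterable of (entity_id, etype) pairs, one per mention.
    This input shape is an assumption made for illustration.
    """
    etypes = defaultdict(set)
    for entity_id, etype in mentions:
        etypes[entity_id].add(etype)
    # An entity is consistent only if all of its mentions share one etype.
    return {eid: sorted(types) for eid, types in etypes.items() if len(types) > 1}

# Hypothetical mentions: e1 is inconsistent, e2 is fine.
mentions = [
    ("e1", "abstract"),
    ("e1", "event"),
    ("e2", "person"),
    ("e2", "person"),
]
mismatches = find_etype_mismatches(mentions)
print(mismatches)
```

Collecting every etype per entity before reporting, instead of comparing each mention to the first one seen, means the result does not depend on the order in which mentions are read.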