UniversalDependencies / UD_English-GUM

Mismatch in identity #46

Open martinpopel opened 2 years ago

martinpopel commented 2 years ago

I thought each entity should have the same identity (aka wikification) in all of its mentions. This is now checked by validate.py --coref. GUM files in the current dev branch contain many errors of this type (entity-identity-mismatch). For example, the entity with eid (aka GRP) 6 appears in sentence GUM_academic_librarians-3 without any identity annotated, but in GUM_academic_librarians-8 it has identity=Vrije_Universiteit_Amsterdam.
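
The consistency rule described above can be illustrated with a minimal sketch. This is not the actual validate.py code: the function name find_identity_mismatches and the (sent_id, eid, identity) tuple layout are hypothetical, and it assumes the mentions have already been extracted from the CoNLL-U files, with None standing for a mention that has no identity annotated.

```python
from collections import defaultdict


def find_identity_mismatches(mentions):
    """Return {eid: [sent_id, ...]} for entities whose mentions disagree on identity.

    `mentions` is an iterable of (sent_id, eid, identity) tuples; this layout is
    an assumption made for illustration. `identity` is None when the mention
    carries no identity annotation.
    """
    identities = defaultdict(set)   # eid -> distinct identity values seen
    sentences = defaultdict(list)   # eid -> sentence ids, in order of appearance
    for sent_id, eid, identity in mentions:
        identities[eid].add(identity)
        sentences[eid].append(sent_id)
    # An entity is inconsistent if more than one distinct value was seen,
    # counting "no identity" (None) as a value of its own.
    return {eid: sentences[eid] for eid, seen in identities.items() if len(seen) > 1}


# The case reported in this issue, plus one consistent entity (hypothetical eid 7).
mentions = [
    ("GUM_academic_librarians-3", "6", None),
    ("GUM_academic_librarians-8", "6", "Vrije_Universiteit_Amsterdam"),
    ("GUM_academic_librarians-8", "7", None),
]
print(find_identity_mismatches(mentions))
```

Treating a missing identity as a distinct value is what makes the reported case count as a mismatch; a checker that ignored unannotated mentions would not flag it.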

amir-zeldes commented 2 years ago

Thanks for catching this! You are right, and there is in fact the exact same validation idea in the GUM build bot, but it appears there is a bug in its implementation. This will be fixed for the next UD release.