CODAIT / Identifying-Incorrect-Labels-In-CoNLL-2003

Research into identifying and correcting incorrect labels in the CoNLL-2003 corpus.
Apache License 2.0

Changed label of national team names from LOC -> ORG incompatible with MUC guidelines #43

Open andreasgrv opened 1 year ago

andreasgrv commented 1 year ago

Hi,

Thank you for identifying these errors and releasing them. The explanations file justifying the corrections has been particularly helpful!

I am curious about the changes that affect national team mentions, whose labels have been changed from LOC to ORG. While this change makes sense to me, it conflicts with the MUC guidelines, which state:

A.2.2 Miscellaneous ORG-type Entity-Expressions

Miscellaneous types of proper names that are to be tagged as ORGANIZATION include stock exchanges, multinational organizations, political parties, orchestras, unions, non-generic governmental entity names such as "Congress" or "Chamber of Deputies", sports teams and armies (unless designated only by country names, which are tagged as LOCATION),

They also have an example:

"In hockey action, Russia defeated France by a score of 7 to 3."
...<ENAMEX TYPE="LOCATION">Russia</ENAMEX> defeated <ENAMEX TYPE="LOCATION">France</ENAMEX>
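
As a minimal sketch of the rule quoted above, the decision can be written as a lookup: a team mention designated only by a country name stays LOC, and any other team name becomes ORG. The `COUNTRY_NAMES` set and the `muc_team_label` function are hypothetical names used only for illustration; they come from neither the MUC guidelines nor this repository.

```python
# Hypothetical illustration of MUC rule A.2.2 as quoted in this issue.
# COUNTRY_NAMES is a tiny stand-in gazetteer, not a real resource.
COUNTRY_NAMES = {"Russia", "France", "Japan"}


def muc_team_label(mention: str) -> str:
    """Return the label the quoted MUC rule would give a sports-team mention."""
    # A team designated only by a country name is tagged LOCATION;
    # any other team name is tagged ORGANIZATION.
    return "LOC" if mention.strip() in COUNTRY_NAMES else "ORG"


for team in ["Russia", "France", "Chicago Bulls"]:
    print(team, "->", muc_team_label(team))
```

Under this reading, a national team mention such as "Russia" keeps LOC, whereas the corrections discussed here relabel such mentions as ORG.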

Were these changes made in agreement with an updated guideline / agreement between annotators?

Thanks!