juand-r / entity-recognition-datasets

A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
MIT License

Worldwide English NER Dataset -- EMNLP 2023 #26

Closed by SecroLoL 4 months ago

SecroLoL commented 4 months ago

https://arxiv.org/abs/2404.13465

This paper introduces the Worldwide NER dataset for English, which covers English-language contexts from every continent.

The dataset can be found here: https://github.com/stanfordnlp/en-worldwide-newswire

The dataset can be processed using the Stanford NLP Stanza library.
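
For anyone who wants a feel for what "processing" NER data usually involves, here is a minimal sketch in plain Python that groups BIO-tagged tokens into entity spans. It assumes the data has already been converted into parallel lists of tokens and BIO tags; the dataset's actual file format and Stanza's own conversion scripts are not shown here. The function name `bio_to_spans` and the sample sentence are hypothetical, made up for illustration only.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper, not part of Stanza or the dataset repository.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins: flush any entity that is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue an open entity:
            # close whatever is open and treat this token as outside.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Made-up sample sentence, not taken from the dataset.
sample_tokens = ["Nairobi", "hosted", "the", "African", "Union", "summit", "."]
sample_tags = ["B-LOC", "O", "O", "B-ORG", "I-ORG", "O", "O"]
print(bio_to_spans(sample_tokens, sample_tags))
```

Note that this sketch silently drops an `I-` tag that does not continue an open entity. Other tools may choose to start a new entity there instead.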

AngledLuffa commented 4 months ago

how embarrassing of me!

https://github.com/juand-r/entity-recognition-datasets/commit/f37ee10aa262bb91cb907d4c6b479d07f1345044