juand-r / entity-recognition-datasets

A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
MIT License
1.51k stars 247 forks

A Knowledge Graph resource of NER datasets #22

Open jd-coderepos opened 2 years ago

jd-coderepos commented 2 years ago

Dear authors, this repository is such a great resource! Many thanks for creating it. I would like to suggest that the Open Research Knowledge Graph (https://orkg.org/) could be leveraged to list such resources for persistence and knowledge sharing. Below are some resources I created from the information in this repository.

Named Entity Recognition Tasks in the MUC series

https://orkg.org/comparison/R162797/

NER in the Automatic Content Extraction (ACE) Series

https://orkg.org/comparison/R162851/

Named Entity Recognition in the CoNLL Series and the OntoNotes corpus as a related resource

https://orkg.org/comparison/R166315/

Named Entity Recognition Based on Wikipedia

https://orkg.org/comparison/R166240/

A comparison of the annotated resources of software mentions in scholarly articles

https://orkg.org/comparison/R166560/

NLP Datasets for Named Entity Recognition and Relation Extraction from Biomedicine Scholarly Articles

https://orkg.org/comparison/R163265/

Comparisons and Visualizations of the CrossNER Benchmark Corpus for its Source and Target Domains

https://orkg.org/comparison/R163843/

Surveying BioNLP Shared Tasks Corpora for Named Entity Recognition

https://orkg.org/comparison/R165702/

Surveying BioCreAtIvE Shared Tasks Corpora for Named Entity Recognition

https://orkg.org/comparison/R172155/


One benefit of such machine-encoded data is that reviews can be generated from it automatically.

Surveying the BioCreAtIvE Shared Task Series

https://orkg.org/review/R172166

Surveying the BioNLP Shared Task Series

https://orkg.org/review/R165924

I would be very happy to offer support in this direction. :)

AngledLuffa commented 2 years ago

@juand-r would you like to look at this? i don't feel that level of ownership over this repo yet :)

jd-coderepos commented 2 years ago

Just as a trailing thought... Please note that knowledge graph representations of such rich data enable querying and persistence. They also accord with the FAIR guiding principles for improving the Findability, Accessibility, Interoperability, and Reuse of digital assets: https://www.go-fair.org/fair-principles/ :)
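
To illustrate what "enable querying" can mean in practice, here is a minimal sketch of a toy triple store in plain Python. It is not the ORKG API or its data model: the predicate names (`type`, `title`) and the helper functions are hypothetical, and only the titles and URLs come from the lists earlier in this thread.

```python
# Toy knowledge graph: a list of (subject, predicate, object) triples.
# The predicates "type" and "title" are illustrative assumptions,
# not ORKG's actual schema.
TRIPLES = [
    ("https://orkg.org/comparison/R165702/", "type", "comparison"),
    ("https://orkg.org/comparison/R165702/", "title",
     "Surveying BioNLP Shared Tasks Corpora for Named Entity Recognition"),
    ("https://orkg.org/comparison/R172155/", "type", "comparison"),
    ("https://orkg.org/comparison/R172155/", "title",
     "Surveying BioCreAtIvE Shared Tasks Corpora for Named Entity Recognition"),
    ("https://orkg.org/review/R165924", "type", "review"),
    ("https://orkg.org/review/R165924", "title",
     "Surveying the BioNLP Shared Task Series"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def titles_of_type(triples, kind):
    """Return the sorted titles of every resource whose type is `kind`."""
    subjects = {s for s, _, _ in match(triples, p="type", o=kind)}
    return sorted(o for s, _, o in match(triples, p="title") if s in subjects)


if __name__ == "__main__":
    for title in titles_of_type(TRIPLES, "comparison"):
        print(title)
```

Because every fact is stored as a uniform triple, the same `match` helper answers questions by subject, predicate, or object without any per-dataset parsing code, which is the property the comment above attributes to knowledge graph representations.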

juand-r commented 2 years ago

This looks very useful! Feel free to add any of the datasets listed here to https://orkg.org/