dpriskorn / odsc

Project that aims to split all the open data of Riksdagen and other sources into sentences, creating an easily linkable dataset of sentences that can be referred to from Wikidata lexemes and other resources
GNU General Public License v3.0

Store unique NER entities per sentence #13

Closed dpriskorn closed 8 months ago

dpriskorn commented 8 months ago

Knowing what a document or sentence is about is valuable. If the NER entities can be linked to e.g. Wikidata in a later step, it lets us find documents that mention a specific name or country.

The data could also be used to improve aliases in Wikidata.
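
A rough sketch of what "unique entities per sentence" could look like, in plain Python. Everything here is an assumption for illustration, not the project's actual code: the `Entity` class, the `unique_entities` and `index_by_entity` helpers, and the choice to treat entities as duplicates when their case-folded text and label match. The entity list itself would come from whatever NER model is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical container for one NER hit: surface text plus label."""
    text: str
    label: str


def entity_key(ent):
    # Assumption: two mentions are the same entity if their case-folded
    # text and their label are equal.
    return (ent.text.casefold(), ent.label)


def unique_entities(entities):
    """Drop duplicate entities, keeping the first occurrence of each."""
    seen = set()
    result = []
    for ent in entities:
        key = entity_key(ent)
        if key not in seen:
            seen.add(key)
            result.append(ent)
    return result


def index_by_entity(sentences):
    """Map each entity key to the ids of the sentences that mention it.

    `sentences` is a dict of sentence id -> list of Entity objects.
    """
    index = {}
    for sentence_id, entities in sentences.items():
        for ent in unique_entities(entities):
            index.setdefault(entity_key(ent), []).append(sentence_id)
    return index
```

With an index like this, finding every sentence that mentions a given name or country becomes a dictionary lookup, and the stored surface forms could later be compared against Wikidata labels and aliases.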