A project that aims to sentenize (split into sentences) all the open data of Riksdagen and other sources, creating an easily linkable dataset of sentences that can be referred to from Wikidata lexemes and other resources.
GNU General Public License v3.0
Guard against mismatch between token language id and sentence language id #24
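The language id of the token should be compared with the language id of the sentence before a link is inserted into the sentence_rawtoken_linking table. If the two differ, the link should not be inserted.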
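A minimal sketch of such a guard, using Python's built-in `sqlite3` module so it runs on its own. Only the table name `sentence_rawtoken_linking` comes from this issue. The `sentence` and `rawtoken` tables, every column name, the function names, and the `LanguageMismatchError` class are assumptions made for illustration and may not match the project's actual schema or database backend.

```python
import sqlite3


class LanguageMismatchError(ValueError):
    """Raised when a token and a sentence have different language ids."""


def create_demo_schema(conn):
    """Create a hypothetical schema; the real project's tables may differ."""
    conn.executescript(
        """
        CREATE TABLE sentence (id INTEGER PRIMARY KEY, language_id INTEGER NOT NULL);
        CREATE TABLE rawtoken (id INTEGER PRIMARY KEY, language_id INTEGER NOT NULL);
        CREATE TABLE sentence_rawtoken_linking (
            sentence_id INTEGER NOT NULL REFERENCES sentence(id),
            rawtoken_id INTEGER NOT NULL REFERENCES rawtoken(id),
            PRIMARY KEY (sentence_id, rawtoken_id)
        );
        """
    )


def link_token_to_sentence(conn, sentence_id, rawtoken_id):
    """Insert a link only if the sentence and the token share a language id."""
    sentence = conn.execute(
        "SELECT language_id FROM sentence WHERE id = ?", (sentence_id,)
    ).fetchone()
    token = conn.execute(
        "SELECT language_id FROM rawtoken WHERE id = ?", (rawtoken_id,)
    ).fetchone()

    # Refuse to link rows that do not exist.
    if sentence is None or token is None:
        raise LookupError("unknown sentence id or rawtoken id")

    # The guard itself: compare language ids before touching the linking table.
    if sentence[0] != token[0]:
        raise LanguageMismatchError(
            f"sentence {sentence_id} has language id {sentence[0]}, "
            f"but rawtoken {rawtoken_id} has language id {token[0]}"
        )

    conn.execute(
        "INSERT INTO sentence_rawtoken_linking (sentence_id, rawtoken_id) "
        "VALUES (?, ?)",
        (sentence_id, rawtoken_id),
    )
```

Raising an exception rather than silently skipping the insert keeps the mismatch visible to the caller, which can then decide whether to log it or abort the import. Whether that is the right behavior for this project is an open design choice.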