dpriskorn / odsc

Project that aims to sentenize all the open data of Riksdagen and other sources to create an easily linkable dataset of sentences that can be referred to from Wikidata lexemes and other resources.
GNU General Public License v3.0

Count all accepted tokens per sentence and store it #29

Open dpriskorn opened 10 months ago

dpriskorn commented 10 months ago

This enables a total token count similar to KORP.

- Add a new column for the number of tokens in each sentence
- Add a new SQL count script that sums this count for a given language
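A minimal sketch of how the two tasks could fit together, using Python's built-in `sqlite3`. Everything named here is an assumption for illustration and not taken from the odsc code base: the table `sentence`, its columns `text`, `language` and `tokens`, and in particular the rule for what counts as an "accepted" token (here: any whitespace-separated token containing at least one letter or digit). The project's real schema and acceptance rule may well differ.

```python
import sqlite3


def count_accepted_tokens(sentence: str) -> int:
    """Count whitespace-separated tokens that contain a letter or digit.

    Assumption: this stands in for the project's real acceptance rule,
    so pure punctuation tokens such as "." are rejected.
    """
    return sum(
        1 for token in sentence.split() if any(ch.isalnum() for ch in token)
    )


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical sentence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sentence (
            id INTEGER PRIMARY KEY,
            text TEXT NOT NULL,
            language TEXT NOT NULL,
            tokens INTEGER NOT NULL  -- the new per-sentence token count
        )
        """
    )
    return conn


def add_sentence(conn: sqlite3.Connection, text: str, language: str) -> None:
    """Store a sentence together with its accepted-token count."""
    conn.execute(
        "INSERT INTO sentence (text, language, tokens) VALUES (?, ?, ?)",
        (text, language, count_accepted_tokens(text)),
    )


def total_tokens(conn: sqlite3.Connection, language: str) -> int:
    """Sum the stored token counts for one language (0 if none exist)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(tokens), 0) FROM sentence WHERE language = ?",
        (language,),
    ).fetchone()
    return row[0]
```

Storing the count at insert time keeps the per-language total a plain `SUM` over one column, so the count script does not need to re-tokenize any text.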