eklem / stopword-sami

Sami stopword lists for natural language processing. Example uses include search engines, machine learning and chatbots.
MIT License

set up project - apply for money #5

Closed eklem closed 2 years ago

eklem commented 3 years ago

Figure out what's needed

Bigger picture

Quality

Northern Sami will reach a good level first, since it has around 4000 articles a year. Southern Sami has 350 articles from mid-2017 until late 2021, and Lule Sami has 500 articles from mid-2017.

eklem commented 2 years ago

Apply to Sametinget for money: https://sametinget.no/stipend-og-tilskudd/tilskudd/sprak/tilskudd-til-samiske-sprakprosjekter/

eklem commented 2 years ago

The application / project plan should include a step to assess whether the stopword list is good enough after it has actually been generated.
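
One possible way to make that assessment step concrete is sketched below. It is only an illustration under assumptions: the function names `stopword_coverage` and `missing_frequent_words` are hypothetical, the tokenizer is a naive word-character split, and the sample text is an English placeholder rather than real Sami data. The idea is to check what share of a corpus the list covers, and which very frequent words it still misses.

```python
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on runs of word characters.
    return re.findall(r"\w+", text.lower())


def stopword_coverage(text, stopwords):
    """Return the share of tokens in `text` that appear in `stopwords` (0.0 to 1.0)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in stopwords)
    return hits / len(tokens)


def missing_frequent_words(text, stopwords, top_n=10):
    """Return the words among the `top_n` most frequent that are not in `stopwords`."""
    counts = Counter(tokenize(text))
    return [word for word, _ in counts.most_common(top_n) if word not in stopwords]


# Placeholder corpus and list, not real Sami data.
sample_text = "the cat and the dog and the bird"
print(stopword_coverage(sample_text, {"the", "and"}))
print(missing_frequent_words(sample_text, {"the"}, top_n=2))
```

Neither number says on its own whether a list is "good enough". They would only give a native speaker reviewing the list something to start from.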

eklem commented 2 years ago

Specify that the intention is for Sami-speaking people to create solutions based on Sami languages. It's not to translate Sami content into other languages.

eklem commented 2 years ago

Markdown to Word: https://github.com/xoofx/markdig

eklem commented 2 years ago

Application draft ready. Will make a Word version later.