eklem / stopword-sami

Sami stopword lists for natural language processing. Example uses include search engines, machine learning, and chatbots.
MIT License

Skip images in data-sets #23

Closed by eklem 2 years ago

eklem commented 2 years ago

Images add too much noise to the data sets: it's hard to identify the correct photo, or even to tell whether a photo is connected at all.

eklem commented 2 years ago

Need to adjust

eklem commented 2 years ago

Updated nrk-sapmi-crawler so it no longer scrapes img-info, and removed the image info from datasets/content-files.
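
For anyone with older content files that still carry image info, a cleanup along these lines might work. This is only a rough sketch, not the crawler's actual code: the key names in `IMAGE_KEYS`, the item layout, and the helper names `strip_image_fields` and `clean_dataset` are all assumptions made for illustration, and the sample data is invented.

```python
from __future__ import annotations

from typing import Any

# Hypothetical key names; the real content files may use different ones.
IMAGE_KEYS = frozenset({"img", "image", "imgInfo"})


def strip_image_fields(item: dict[str, Any],
                       image_keys: frozenset[str] = IMAGE_KEYS) -> dict[str, Any]:
    """Return a copy of one crawled item without its image-related keys."""
    return {key: value for key, value in item.items() if key not in image_keys}


def clean_dataset(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Apply strip_image_fields to every item, leaving the input untouched."""
    return [strip_image_fields(item) for item in items]


# Invented sample items, only to show the shape of the transformation.
sample = [
    {"title": "First article", "body": "Some text", "img": {"alt": "a photo"}},
    {"title": "Second article", "body": "More text"},
]
cleaned = clean_dataset(sample)
```

The functions return new dictionaries instead of deleting keys in place, so the original data stays available for comparison if the cleanup turns out to drop too much.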

eklem commented 2 years ago

https://github.com/eklem/stopword-sami/releases/tag/v0.5.0