Hello,
Currently, it seems that the stop words used in the keyword extraction process are hard-coded for English. As a result, when archiving content in another language, the selected keywords are very often stop words of that language (I mainly archive content in French).
Maybe the list of stop words could be selected dynamically, based on automatic language detection? (See https://github.com/Mimino666/langdetect for example.)
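To make the idea concrete, here is a rough sketch of what I have in mind. It is not based on the project's actual code: `STOP_WORDS`, `detect_language` and `extract_keywords` are hypothetical names, the stop-word lists are heavily truncated placeholders, and the frequency count is just a stand-in for whatever the real extraction does. It assumes `langdetect` is installed and falls back to English when it is missing or cannot detect a language.

```python
import re
from collections import Counter

# Hypothetical, heavily truncated lists; a real setup would load complete ones.
STOP_WORDS = {
    "en": {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"},
    "fr": {"le", "la", "les", "un", "une", "des", "et", "de", "du",
           "en", "est", "que", "qui", "dans", "pour"},
}
DEFAULT_LANG = "en"


def detect_language(text):
    """Return a language code for text, or DEFAULT_LANG if detection is unavailable."""
    try:
        # Third-party package: pip install langdetect
        from langdetect import detect
        from langdetect.lang_detect_exception import LangDetectException
    except ImportError:
        return DEFAULT_LANG
    try:
        return detect(text)
    except LangDetectException:
        # Raised, for example, when the text contains no usable features.
        return DEFAULT_LANG


def extract_keywords(text, lang=None, top_n=5):
    """Return the top_n most frequent words that are not stop words for lang."""
    if lang is None:
        lang = detect_language(text)
    # Unknown languages fall back to the default list.
    stop_words = STOP_WORDS.get(lang, STOP_WORDS[DEFAULT_LANG])
    # Runs of Unicode letters only, so accented French words stay intact.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return [word for word, _ in counts.most_common(top_n)]
```

With the language passed explicitly, `extract_keywords("le chat et le chien dans la maison du chat", lang="fr", top_n=1)` keeps "chat", whereas the English list lets "le" through as a keyword, which is the behaviour I am seeing today.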
Thanks for a great product :)