Closed edvinmolla closed 4 years ago
Hi, and thanks for this. However, we don't support Albanian right now. While this file is useful, it won't be used unless we support Albanian at the UI level. Actually, I'm not even sure these static stop word files are being used; I'd have to do more code tracing to remind myself.
Relevant code references:
* we send in that language to NLTK in [`databasic.logic.wordhandler._count_words`](https://github.com/mitmedialab/DataBasic/blob/deb4dde5fbb7b02f2ed5b91aef36fce454a5e672/databasic/logic/wordhandler.py#L42)
* the lookup from language code to NLTK stop words language key is in [`databasic.NLTK_STOPWORDS_BY_LANGUAGE`](https://github.com/mitmedialab/DataBasic/blob/d37adc5735c42e72434f501d51180a64700f74d2/databasic/__init__.py#L18)
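For illustration only, here is a rough sketch of how a lookup like that could feed a word count. The mapping contents and the helper names `stopwords_key`, `load_stopwords`, and `count_words` are assumptions made up for this example, not the actual DataBasic code; the only real library call is `nltk.corpus.stopwords.words`, which needs NLTK and its `stopwords` corpus installed.

```python
from collections import Counter
import re

# Hypothetical contents; the real table lives in databasic/__init__.py and may differ.
NLTK_STOPWORDS_BY_LANGUAGE = {"en": "english", "es": "spanish", "pt": "portuguese"}


def stopwords_key(language_code):
    """Return the NLTK stop words language key, or None if the code is unmapped."""
    return NLTK_STOPWORDS_BY_LANGUAGE.get(language_code)


def load_stopwords(language_code):
    """Load stop words from NLTK for a mapped language; empty set otherwise."""
    key = stopwords_key(language_code)
    if key is None:
        return set()
    # Imported lazily so the rest of the sketch runs without NLTK installed.
    from nltk.corpus import stopwords
    return set(stopwords.words(key))


def count_words(text, stop_words):
    """Count lowercase word tokens in text, skipping anything in stop_words."""
    words = re.findall(r"\w+", text.lower())
    return Counter(w for w in words if w not in stop_words)
```

In this sketch an unmapped code such as `sq` yields no stop words key, so a contributed word list would not be picked up until the language is also added to the mapping and offered in the UI.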
Thanks for the links.
Closing because we don't have plans to support Albanian through the interface. Happy to reopen if we identify a workshop partner.
Added Albanian stopwords.