dataculturegroup / DataBasic

A suite of focused and simple tools and activities for journalists, data journalism classrooms and community advocacy groups
http://www.databasic.io/
MIT License

Create albanian #383

Closed edvinmolla closed 4 years ago

edvinmolla commented 4 years ago

Added Albanian stopwords.

rahulbot commented 4 years ago

Hi, and thanks for this. However, we don't support Albanian right now. While this file is useful, it won't be used unless we support Albanian at the UI level. Actually, I'm not even sure these static stop word files are being used - I'd have to do more code tracing to remind myself.

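Relevant code references:

* we send in that language to NLTK in [`databasic.logic.wordhandler._count_words`](https://github.com/mitmedialab/DataBasic/blob/deb4dde5fbb7b02f2ed5b91aef36fce454a5e672/databasic/logic/wordhandler.py#L42)

* the lookup from language code to NLTK stop words language key is in [`databasic.NLTK_STOPWORDS_BY_LANGUAGE`](https://github.com/mitmedialab/DataBasic/blob/d37adc5735c42e72434f501d51180a64700f74d2/databasic/__init__.py#L18)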

edvinmolla commented 4 years ago

> Hi, and thanks for this. However, we don't support Albanian right now. While this file is useful, it won't be used unless we support Albanian at the UI level. Actually, I'm not even sure these static stop word files are being used - I'd have to do more code tracing to remind myself.
>
> Relevant code references:
>
> * we send in that language to NLTK in [`databasic.logic.wordhandler._count_words`](https://github.com/mitmedialab/DataBasic/blob/deb4dde5fbb7b02f2ed5b91aef36fce454a5e672/databasic/logic/wordhandler.py#L42)
>
> * the lookup from language code to NLTK stop words language key is in [`databasic.NLTK_STOPWORDS_BY_LANGUAGE`](https://github.com/mitmedialab/DataBasic/blob/d37adc5735c42e72434f501d51180a64700f74d2/databasic/__init__.py#L18)

Thanks for the links.
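For anyone following the thread, here is a rough sketch of how a lookup from language code to NLTK stop words key can work, and why a standalone stop words file for an unmapped language would never be reached. This is not the DataBasic code: the mapping contents, `stopwords_language_key`, `load_stopwords`, and `count_words` are hypothetical names and values chosen for illustration, and the real `NLTK_STOPWORDS_BY_LANGUAGE` and `_count_words` may look quite different.

```python
from collections import Counter

# Hypothetical mapping from UI language code to NLTK stop words key.
# The real dictionary lives in databasic/__init__.py and may differ.
NLTK_STOPWORDS_BY_LANGUAGE = {
    "en": "english",
    "es": "spanish",
    "pt": "portuguese",
}


def stopwords_language_key(language_code):
    """Return the NLTK stop words key for a language code, or None if unmapped."""
    return NLTK_STOPWORDS_BY_LANGUAGE.get(language_code)


def load_stopwords(language_code):
    """Load stop words from NLTK for a mapped language; empty set otherwise."""
    key = stopwords_language_key(language_code)
    if key is None:
        # Unmapped languages (e.g. "sq" in this sketch) never reach NLTK.
        return set()
    # Imported lazily: needs nltk installed and the 'stopwords' corpus downloaded.
    from nltk.corpus import stopwords
    return set(stopwords.words(key))


def count_words(text, stop_words):
    """Count whitespace-separated, lowercased words, skipping stop words."""
    words = [w for w in text.lower().split() if w not in stop_words]
    return Counter(words)
```

In this sketch, `load_stopwords("sq")` returns an empty set because `"sq"` has no entry in the mapping, so supporting a new language would mean adding a mapping entry (and UI support) rather than only adding a file.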

rahulbot commented 4 years ago

Closing because we don't have plans to support Albanian through the interface. Happy to reopen if we identify a workshop partner.