GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License

Adding an Option for Removing Turkish Stop Words During Preprocessing #280

Closed: irmakyucel closed this issue 3 years ago

irmakyucel commented 3 years ago

Adding a text file of Turkish stop words, together with an option to remove them during text preprocessing, would be useful. It would also bring sadedegel closer to state-of-the-art NLP libraries for English, which typically ship with a stop word list.

I found existing work on Turkish stop words at https://github.com/ahmetax/trstop. We can use the Turkish stop word text file from that repository. With that list in place, we can change the code to give the user an option to remove stop words during the preprocessing stage.
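The proposal above could be sketched roughly as follows. This is only an illustration, not sadedegel's actual API: the function names (`tr_lower`, `load_stop_words`, `remove_stop_words`) and the `enabled` flag are hypothetical, and the stop word source is assumed to be plain text with one word per line (blank lines and lines starting with `#` are skipped as a precaution). Turkish lowercasing is handled explicitly because Python's default `str.lower()` maps `I` to `i` rather than to the dotless `ı`.

```python
from typing import Iterable, List, Set


def tr_lower(text: str) -> str:
    """Lowercase using Turkish casing rules (I -> ı, İ -> i)."""
    # Replace the two Turkish-specific capitals first; str.lower() alone
    # would turn "I" into "i" and "İ" into "i" plus a combining dot.
    return text.replace("I", "ı").replace("İ", "i").lower()


def load_stop_words(lines: Iterable[str]) -> Set[str]:
    """Build a stop word set from an iterable of lines (assumed one word per line)."""
    words: Set[str] = set()
    for line in lines:
        word = line.strip()
        # Skip blank lines and comment lines (an assumption about the file format).
        if word and not word.startswith("#"):
            words.add(tr_lower(word))
    return words


def remove_stop_words(tokens: Iterable[str], stop_words: Set[str],
                      enabled: bool = True) -> List[str]:
    """Return tokens with stop words dropped; pass enabled=False to keep them all."""
    if not enabled:
        return list(tokens)
    return [t for t in tokens if tr_lower(t) not in stop_words]


# A tiny in-memory list stands in for the real stop word file here.
stop_words = load_stop_words(["ve", "bir", "bu", "için", "", "# comment"])
tokens = ["Bu", "kitap", "ve", "defter", "İçin", "ışık"]
filtered = remove_stop_words(tokens, stop_words)
unfiltered = remove_stop_words(tokens, stop_words, enabled=False)
```

In practice the list would be read from the stop word file, for example by passing an open file handle (opened with `encoding="utf-8"`) to `load_stop_words`. Keeping the removal behind a flag leaves the current preprocessing behavior unchanged for users who do not opt in.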

irmakyucel commented 3 years ago

Closing this issue as it was already implemented by: