csurfer / rake-nltk

Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
https://csurfer.github.io/rake-nltk
MIT License
1.06k stars, 150 forks

fix: Set default tokenizer language to the actual language parameter. #70

Closed Peaverin closed 2 years ago

Peaverin commented 2 years ago

Currently, the default tokenizer (nltk.tokenize.sent_tokenize) does not use the language set in the constructor. Instead it falls back to the default language of the nltk.tokenize.sent_tokenize method, which is English (see https://www.nltk.org/api/nltk.tokenize.html#nltk.tokenize.sent_tokenize). This is a simple fix that keeps the NLTK tokenize function as the default tokenizer but sets its language parameter to the one the user passed to the Rake constructor.
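To illustrate the idea, here is a minimal sketch of binding the constructor's language into the default tokenizer with functools.partial. It is not the actual diff from this PR: `MiniRake` and `fake_sent_tokenize` are hypothetical stand-ins, and the stand-in tokenizer only mimics the `(text, language="english")` shape of nltk.tokenize.sent_tokenize so the example runs without NLTK data.

```python
from functools import partial
from typing import Callable, List, Optional


def fake_sent_tokenize(text: str, language: str = "english") -> List[str]:
    """Hypothetical stand-in shaped like nltk.tokenize.sent_tokenize.

    It tags each sentence with the language it was called with, so the
    effect of the fix is visible without downloading NLTK data.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [f"[{language}] {s}" for s in sentences]


class MiniRake:
    """Hypothetical, stripped-down stand-in for the Rake constructor."""

    def __init__(
        self,
        language: str = "english",
        sentence_tokenizer: Optional[Callable[[str], List[str]]] = None,
    ) -> None:
        self.language = language
        if sentence_tokenizer is None:
            # The idea of the fix: bind the user's language into the default
            # tokenizer instead of relying on its own "english" default.
            sentence_tokenizer = partial(fake_sent_tokenize, language=language)
        self.sentence_tokenizer = sentence_tokenizer


rake = MiniRake(language="spanish")
sentences = rake.sentence_tokenizer("Hola mundo. Esto es una prueba.")
print(sentences)  # ['[spanish] Hola mundo', '[spanish] Esto es una prueba']
```

A tokenizer passed explicitly by the user is left untouched in this sketch; only the default picks up the constructor's language.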