csurfer / rake-nltk

Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
https://csurfer.github.io/rake-nltk
MIT License
1.06k stars, 150 forks

Add Brazilian Portuguese language support #6

Closed johnidm closed 7 years ago

johnidm commented 7 years ago

Hi @csurfer

What do you think of including support for Brazilian Portuguese?

I can do this :-)

csurfer commented 7 years ago

@johnidm: As long as nltk provides stopwords, punctuation information, and a tokenizer for that language, using it should work the same way for any language. You can raise a pull request to ensure the program handles different languages gracefully.

Making everyone download data for any language other than en (English) or es (Spanish) seems heavy-handed. We could add a parameter to the class constructor that takes the language and let users download whatever language pickle they want. Just my two cents.
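
For readers following the thread, here is a minimal sketch of what a language parameter in the constructor could look like. It is not the rake-nltk implementation: the class name `LanguageAwareExtractor`, the `stopword_loader` argument, and the `candidate_phrases` helper are all hypothetical names chosen for illustration. The only NLTK call used is `nltk.corpus.stopwords.words(language)`, which expects full language names such as "english", "spanish", or "portuguese", and it assumes the user has already downloaded the stopwords corpus.

```python
import string
from typing import Callable, Iterable, List, Optional, Set, Tuple


def nltk_stopwords(language: str) -> Set[str]:
    """Load stopwords from NLTK for the given language.

    Assumes the user has already fetched the corpus, for example with
    nltk.download("stopwords"). Nothing is downloaded here.
    """
    # Imported lazily so the sketch can be loaded without NLTK installed.
    from nltk.corpus import stopwords

    return set(stopwords.words(language))


class LanguageAwareExtractor:
    """Hypothetical sketch of a constructor that takes the language."""

    def __init__(
        self,
        language: str = "english",
        stopword_loader: Callable[[str], Iterable[str]] = nltk_stopwords,
        punctuation: Optional[Iterable[str]] = None,
    ) -> None:
        self.language = language
        self.stopwords = set(stopword_loader(language))
        # Fall back to ASCII punctuation when none is supplied.
        self.punctuation = set(
            string.punctuation if punctuation is None else punctuation
        )
        self.to_ignore = self.stopwords | self.punctuation

    def candidate_phrases(self, words: Iterable[str]) -> List[Tuple[str, ...]]:
        """Split already-tokenized words into phrases at stopwords and punctuation."""
        phrases: List[Tuple[str, ...]] = []
        current: List[str] = []
        for word in words:
            token = word.lower()
            if token in self.to_ignore:
                if current:
                    phrases.append(tuple(current))
                    current = []
            else:
                current.append(token)
        if current:
            phrases.append(tuple(current))
        return phrases
```

With the stopwords corpus downloaded, `LanguageAwareExtractor(language="portuguese")` would pick up the Portuguese list from NLTK. Passing a custom `stopword_loader` keeps the choice of language data in the user's hands, which is the point made above.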

johnidm commented 7 years ago

Thanks for the information @cgratie

I will start working on it.

johnidm commented 7 years ago

Examples for Portuguese Processing - http://www.nltk.org/howto/portuguese_en.html