summanlp / textrank

TextRank implementation for Python 3.
https://pypi.org/project/summa/
MIT License
1.25k stars, 260 forks

Fix ignoring non-Cyrillic words in RussianStemmer #63

Open KMiNT21 opened 5 years ago

KMiNT21 commented 5 years ago

This fixes a bug in how English words are processed in Russian (Cyrillic) text.

For example, if the text contains the word "life", the stemmer currently returns "lif", which is wrong.

So the best approach is simply to skip all non-Cyrillic words and leave them unchanged (for example, they may be brand names).
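
To illustrate the idea, here is a minimal sketch rather than the actual patch. The names `stem_if_cyrillic` and `toy_stem` are hypothetical and not part of summa. `toy_stem` is only a stand-in that strips one trailing vowel so the example is self-contained, and treating the Unicode range U+0400 to U+04FF as "Cyrillic" is an assumption.

```python
import re

# Assumption: a word counts as Cyrillic if it contains at least one
# character from the basic Unicode Cyrillic block (U+0400..U+04FF).
_CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def toy_stem(word):
    """Hypothetical stand-in for a real Russian stemmer.

    It strips a single trailing vowel (Latin "e" or a few Cyrillic
    vowels), which is enough to reproduce the "life" -> "lif" problem.
    """
    if word.endswith(("e", "а", "я", "ы", "и")):
        return word[:-1]
    return word


def stem_if_cyrillic(word, stem_fn):
    """Apply stem_fn only to words that contain Cyrillic letters."""
    if not _CYRILLIC.search(word):
        # No Cyrillic characters: return the word untouched.
        return word
    return stem_fn(word)


print(toy_stem("life"))                    # lif  (the bug)
print(stem_if_cyrillic("life", toy_stem))  # life (skipped)
print(stem_if_cyrillic("книга", toy_stem)) # книг (still stemmed)
```

With a guard like this, Latin-script tokens pass through unchanged while Russian words are still stemmed as before.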