Thanks for your lib, I really love it. But there is an error when using "add_keywords_from_dict" or "add_keywords_from_list": the matching result is wrong, as you can see below.
from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor()
keyword_processor.add_keywords_from_list(["hydro", "fran",'cam'])
keyword_processor.extract_keywords('Le groupe français va concevoir, construire et exploiter une centrale hydroélectrique au Cameroun')
Output: ['fran', 'hydro']
As you can see, it's as if it truncates the word at an accented character. This can be worked around by removing accents (using deaccent from gensim, for example) and then using "span_info" to recover the original word from the text at the end.
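Here is a rough sketch of that workaround, in case it helps. It uses the standard library's unicodedata instead of gensim's deaccent, which is my substitution. The helper names strip_accents and extract_with_original are made up for this example and are not part of flashtext. It assumes the text is in composed (NFC) form, so that every accented letter is a single character and the spans returned with span_info=True line up with the original text.

```python
import unicodedata


def strip_accents(text):
    """Replace each accented character with its unaccented base character.

    The result has the same length as the input, so spans computed on the
    stripped text can be used directly as indices into the original text.
    """
    out = []
    for ch in text:
        # NFD splits e.g. "ç" into "c" plus a combining cedilla;
        # drop the combining marks and keep the base character.
        base = "".join(
            c for c in unicodedata.normalize("NFD", ch)
            if not unicodedata.combining(c)
        )
        # If stripping would change the length, keep the original character.
        out.append(base if len(base) == 1 else ch)
    return "".join(out)


def extract_with_original(keyword_processor, text):
    """Match on the accent-stripped text, then slice the original text.

    keyword_processor is expected to be a flashtext KeywordProcessor
    (anything with extract_keywords(sentence, span_info=True) works).
    Returns a list of (keyword, original_substring) pairs.
    """
    matches = keyword_processor.extract_keywords(
        strip_accents(text), span_info=True
    )
    return [(keyword, text[start:end]) for keyword, start, end in matches]
```

With the keyword_processor from the snippet above, extract_with_original(keyword_processor, text) would run the matching on "francais", "hydroelectrique" and so on. My guess is that the partial matches then go away because each keyword is followed by a plain ASCII letter instead of an accented one, but I have not checked this against every flashtext version.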