vi3k6i5 / flashtext

Extract Keywords from sentence or Replace keywords in sentences.
MIT License
5.58k stars 598 forks

Fix bug when extracting multiple adjacent words from a string without word boundaries #142

Open lishukan opened 1 year ago

lishukan commented 1 year ago

Dear developers: flashtext is an excellent string-matching tool, and I have used it on many occasions. Recently, however, I found a problem with strings that have no word boundaries (such as a Chinese sentence): if two words that need to be extracted happen to be adjacent, only the first word is extracted.

So I made a modification: after a word is matched, the next iteration starts at the index where the matched word ends.
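To illustrate the idea, here is a minimal, self-contained sketch. It is not flashtext's code: `extract_adjacent` and its `resume_at_match_end` flag are hypothetical names for a toy longest-match scanner, and skipping one extra character after a match is only an assumed stand-in for the reported behaviour. It shows why resuming exactly at the end of the previous match lets an adjacent keyword be found.

```python
def extract_adjacent(text, keywords, resume_at_match_end=True):
    """Toy greedy longest-match scan with no word-boundary checks.

    Hypothetical helper for illustration only; not part of flashtext.
    """
    found = []
    max_len = max((len(k) for k in keywords), default=0)
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first so longer keywords win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in keywords:
                match = candidate
                break
        if match is None:
            i += 1
            continue
        found.append(match)
        end = i + len(match)
        # Resuming at `end` keeps the first character of an adjacent keyword.
        # Resuming at `end + 1` (assumed model of the reported behaviour)
        # swallows that character, so the adjacent keyword is missed.
        i = end if resume_at_match_end else end + 1
    return found


keywords = {"北京", "欢迎"}
print(extract_adjacent("北京欢迎你", keywords))
# ['北京', '欢迎']
print(extract_adjacent("北京欢迎你", keywords, resume_at_match_end=False))
# ['北京']
```

With the index resumed at the end of the match, both adjacent words are returned; in the assumed "skip one character" variant, only the first one is.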

I have added a new use case, and it passes all unit tests.

abulice commented 1 year ago

Email received.

lishukan commented 1 year ago

@vi3k6i5 Hello, dear owner. Is this repo still maintained? I see that its code hasn't been updated for a long time. If it is no longer maintained, I will stop waiting for this MR to be merged.