vi3k6i5 / flashtext

Extract Keywords from sentence or Replace keywords in sentences.
MIT License
5.6k stars, 600 forks

Lack of reference to fuzzy matching. #114

Open mrkkollo opened 4 years ago

mrkkollo commented 4 years ago

There has been a commit to add support for fuzzy matching using the "max_cost" argument in extract_keywords; however, there seems to be no reference to it in the README or the documentation. Currently it feels like many people don't know such a feature is available.

olgnaydn commented 4 years ago

It's not a good idea to use flashtext with the max_cost argument. We have tested it and it is much slower than fuzzywuzzy. For fuzzy matching, I would recommend using fuzzywuzzy.

remiadon commented 4 years ago

Hi, I implemented the "fuzziness" feature for flashtext. Benchmarks are not included, and I agree it's lacking documentation.

Amongst other things, there is a need to make it "smarter", and, perhaps, faster.

@olgnaydn do you have an example that supports your argument that fuzzywuzzy is more suitable when performance matters? From what I know, fuzzywuzzy is not designed for multi-word matching, but I may be wrong.

shivampuri20 commented 3 years ago

Hi, where can I find the max_cost argument?
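
For anyone landing here with the same question: per the first comment in this thread, the argument is "max_cost" and it is passed to extract_keywords. The sketch below is not flashtext's implementation and does not import flashtext. It is a minimal, standard-library illustration of what an edit-distance budget means for multi-word keyword matching. The names `levenshtein` and `fuzzy_extract` are hypothetical helpers made up for this example, and the word-window approach is an assumption chosen for clarity, not for speed.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character from a
                curr[j - 1] + 1,           # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def fuzzy_extract(sentence, keywords, max_cost=0):
    """Hypothetical helper: return the keywords that some window of words
    in the sentence matches within max_cost edits."""
    words = sentence.lower().split()
    found = []
    for keyword in keywords:
        target = keyword.lower()
        n = len(target.split())  # compare windows with the same word count
        for start in range(len(words) - n + 1):
            window = " ".join(words[start:start + n])
            if levenshtein(window, target) <= max_cost:
                found.append(keyword)
                break
    return found


sentence = "I love machne learning and data scence"
keywords = ["machine learning", "data science"]

exact = fuzzy_extract(sentence, keywords, max_cost=0)  # typos are not tolerated
fuzzy = fuzzy_extract(sentence, keywords, max_cost=1)  # one edit per keyword allowed
print(exact)
print(fuzzy)
```

With a budget of 0 the misspelled phrases are missed, while a budget of 1 recovers both multi-word keywords. This brute-force version rescans the sentence for every keyword, so it says nothing about the relative performance discussed above.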