Open mrkkollo opened 4 years ago
It's not a good idea to use flashtext with the max_cost argument. We have tested it, and it is much slower than fuzzywuzzy. For fuzzy matching, I would recommend using fuzzywuzzy.
On Thu, 25 Jun 2020 at 14:47, Marko Kollo notifications@github.com wrote:
There has been a commit to add support for fuzzy matching using the "max_cost" argument in extract_keywords, however there seems to be no reference to it in the README and the documentation. Currently it feels like many people don't know such a feature is available.
Hi, I implemented the "fuzziness" feature for flashtext. Benchmarks are not included, and I agree that it lacks documentation.
Amongst other things, there is a need to make it "smarter", and, perhaps, faster.
@olgnaydn do you have an example that supports your claim that fuzzywuzzy is more suitable when performance matters? From what I know, fuzzywuzzy is not designed for multi-word matching, but I may be wrong.
Hi, where can I find the max_cost argument?
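Since the feature is undocumented, here is a rough, standard-library-only sketch of what a "max_cost" budget means for fuzzy keyword matching: a keyword counts as found if some window of the sentence is within `max_cost` edits of it. This is not flashtext's implementation (flashtext walks a trie rather than sliding a window), and `edit_distance` and `fuzzy_extract` are hypothetical names made up for this illustration.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def fuzzy_extract(
    sentence: str, keywords: List[str], max_cost: int = 0
) -> List[Tuple[str, str, int]]:
    """Return (keyword, matched_text, cost) for keywords found within max_cost edits.

    Hypothetical helper: slides a window of as many tokens as the keyword has
    words and keeps the cheapest window whose cost fits the budget.
    """
    tokens = sentence.split()
    matches = []
    for keyword in keywords:
        width = len(keyword.split())
        best = None
        for start in range(len(tokens) - width + 1):
            candidate = " ".join(tokens[start:start + width])
            cost = edit_distance(candidate.lower(), keyword.lower())
            if cost <= max_cost and (best is None or cost < best[2]):
                best = (keyword, candidate, cost)
        if best is not None:
            matches.append(best)
    return matches


if __name__ == "__main__":
    text = "I love machne learning and pythn"
    print(fuzzy_extract(text, ["machine learning", "python"], max_cost=1))
```

In flashtext itself, going by the description in this thread, the budget is passed as `max_cost` to `extract_keywords` on a `KeywordProcessor`. Whether a given installed release includes that argument is something to check against the source, since the commit is not mentioned in the README.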