MycroftAI / adapt

Adapt Intent Parser
Apache License 2.0

Approximate string matching support #17

Open baudren opened 8 years ago

baudren commented 8 years ago

Is there a built-in mechanism to support fuzzy string matching (within a certain range)? For instance, given a keyword "from", could "form" be detected as a match if no exact match is found?

clusterfudge commented 8 years ago

I was going to say yes, but sadly I have to say "not right now." The Trie implementation supports querying with edit distance, but I have not plumbed that all the way out to the public API.

https://github.com/MycroftAI/adapt/blob/master/adapt/tools/text/trie.py#L112

I'd be happy to review a PR that makes the changes necessary to expose edit distance, and a test or two flexing it. It should be noted that fuzzy matching in this way can dramatically affect performance with a large number of entities.
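To illustrate the idea of querying a trie with an edit distance, here is a minimal sketch. It is not adapt's Trie: the `FuzzyTrie` class, its `search` method, and the `max_edits` parameter are hypothetical names chosen for this example, and the sketch assumes plain Levenshtein distance (insert, delete, substitute). Under that assumption "form" is 2 edits away from "from", because a transposition counts as two substitutions.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False


class FuzzyTrie:
    """Hypothetical trie with edit-distance lookup (not adapt's implementation)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, query, max_edits=0):
        """Return {word: distance} for stored words within max_edits of query."""
        results = {}
        # Row for the empty prefix: distance to query[:i] is i insertions.
        first_row = list(range(len(query) + 1))
        for ch, child in self.root.children.items():
            self._walk(child, ch, ch, query, first_row, max_edits, results)
        return results

    def _walk(self, node, ch, prefix, query, prev_row, max_edits, results):
        # Build one row of the Levenshtein table for the current trie prefix.
        row = [prev_row[0] + 1]
        for i in range(1, len(query) + 1):
            cost = 0 if query[i - 1] == ch else 1
            row.append(min(row[i - 1] + 1,          # insertion
                           prev_row[i] + 1,         # deletion
                           prev_row[i - 1] + cost))  # substitution or match
        if node.is_word and row[-1] <= max_edits:
            results[prefix] = row[-1]
        # Prune: if every cell exceeds the budget, no descendant can match.
        if min(row) <= max_edits:
            for next_ch, child in node.children.items():
                self._walk(child, next_ch, prefix + next_ch, query,
                           row, max_edits, results)


trie = FuzzyTrie()
for keyword in ("from", "to", "forum"):
    trie.insert(keyword)

exact = trie.search("form", max_edits=0)
fuzzy = trie.search("form", max_edits=2)
```

Every visited node costs work proportional to the query length, and a larger `max_edits` prunes fewer branches, which is one way to see why fuzzy lookups get more expensive as the number of entities grows.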

wolfv commented 8 years ago

Maybe you'd be interested in the Android Keyboard Fuzzy String matching? I've ported it over to a standalone C++ implementation (based on the work from the Chromium Mojo team).

I think it offers great string matching (for typed input): https://github.com/wolfv/dbus_type_correction

Sudo-Kid commented 8 years ago

To add to what wolfv suggested, there is also a Python fuzzy string matching library that may be worth looking at:

https://github.com/seatgeek/fuzzywuzzy
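As a rough sketch of the "exact match first, fuzzy match as a fallback" behaviour asked about at the top of the thread, the snippet below uses Python's standard-library `difflib` as a stand-in rather than fuzzywuzzy or adapt itself. The `match_keyword` helper and its `cutoff` default are assumptions made for illustration only.

```python
from difflib import get_close_matches


def match_keyword(token, keywords, cutoff=0.7):
    """Return the matching keyword, preferring an exact hit over a fuzzy one.

    Hypothetical helper: `cutoff` is a similarity ratio in [0, 1] and would
    need tuning for a real vocabulary.
    """
    if token in keywords:
        return token
    # Fall back to the single closest keyword at or above the cutoff, if any.
    close = get_close_matches(token, keywords, n=1, cutoff=cutoff)
    return close[0] if close else None


keywords = ["from", "to", "at"]
typo_result = match_keyword("form", keywords)
exact_result = match_keyword("from", keywords)
miss_result = match_keyword("xyz", keywords)
```

Checking for the exact match first keeps the common case cheap and avoids a fuzzy candidate shadowing a keyword that was typed correctly.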