Open spooknik opened 4 years ago
In my project I use spaCy to classify named entities in my data. To recognize "custom" entities properly, I would like to match against a dictionary using fuzzywuzzy. If the matching result included the matched phrase, I could use it to create an entity directly. As it stands, I have to build custom logic to recover the matched phrase, which is not very efficient.
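For readers wondering what that kind of custom logic might look like, here is a rough sketch using only the standard library. It is not the commenter's code and not part of fuzzywuzzy: `best_matching_phrase` is a hypothetical helper that slides a window of words over the text, scores each window against the term with `difflib.SequenceMatcher`, and returns the best window. It assumes whitespace tokenization and a case-insensitive comparison.

```python
from difflib import SequenceMatcher


def best_matching_phrase(term, text):
    """Return (phrase, score) for the word window in text most similar to term.

    Hypothetical helper for illustration; assumes whitespace-separated words.
    """
    words = text.split()
    n = len(term.split())
    best_phrase, best_score = "", 0.0
    # Slide a window of n words across the text and score each candidate.
    for i in range(max(len(words) - n + 1, 1)):
        candidate = " ".join(words[i:i + n])
        score = SequenceMatcher(None, term.lower(), candidate.lower()).ratio()
        if score > best_score:
            best_phrase, best_score = candidate, score
    return best_phrase, best_score


phrase, score = best_matching_phrase("New York Jets", "The New york Jets are a team.")
print(phrase, score)
```

Scoring every window is quadratic in the worst case, which is part of why having the library return the matched phrase itself would be preferable.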
Thanks for the reply and explaining your workflow.
For my purposes, I just want a match so I can replace the found phrase with the search phrase. I found that fuzzysearch does exactly what I wanted: it returns the start and end indices of each match, which I can use to extract the matched phrase and pass it to the replace function.
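from fuzzysearch import find_near_matches

max_distance = 1  # example value; the original post does not say which one was used
term = "New York Jets"
text = "New york Jets are a sportball team."
# Each match carries the start and end indices of the matched span in text
matches = find_near_matches(term, text, max_l_dist=max_distance)
phrases = [text[m.start:m.end] for m in matches]
print(phrases)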
['New york Jets']
...
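The replace step itself is not shown above, so here is a minimal sketch of how it could work. `replace_matches` is a hypothetical helper, and `Span` is a stand-in for the match objects from `find_near_matches`, since only their `.start` and `.end` attributes are used. The sketch assumes the matched spans do not overlap.

```python
from collections import namedtuple

# Hypothetical stand-in for a fuzzysearch match; only .start and .end are used.
Span = namedtuple("Span", ["start", "end"])


def replace_matches(text, matches, replacement):
    """Replace every matched span in text with replacement (assumes no overlaps)."""
    # Work from right to left so earlier indices stay valid after each replacement.
    for m in sorted(matches, key=lambda m: m.start, reverse=True):
        text = text[:m.start] + replacement + text[m.end:]
    return text


text = "New york Jets are a sportball team."
fixed = replace_matches(text, [Span(0, 13)], "New York Jets")
print(fixed)
```

Going right to left matters when the replacement differs in length from the matched phrase, because replacing an earlier span first would shift the indices of every later one.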
It would be really handy if the extract functions could return the matched phrase.
For example: