Closed georgh closed 4 years ago
Hi @georgh,
Comparing a matched sub-string with the original search string is just comparing two similar strings in terms of Levenshtein distance. That is already solved well by other libraries. Here are two examples:
Considering the above, I don't think it's worth the effort to add this to fuzzysearch. However, I think a good example showing how to do this would be a great addition to the docs - example code or a PR would be very welcome!
I'm closing this since I currently don't see a real need to add such a feature, and there's been no further response from the poster (@georgh). Feel free to continue the discussion if needed, and I'll re-open the issue if necessary.
I think it would be handy to have a field describing the fuzzy part of the match. You already report the distance, but in some cases it is a bit cumbersome to find the exact positions where the fuzzy part happened.
So for example:
find_near_matches('I love you', 'I luve yuu XXXXXXX', max_l_dist=5)
returns a Match(start=0, end=10, dist=2, matched='I luve yuu')
Now it would be nice to know the fuzzy positions: [3, 8] in this case.
It becomes a bit tricky if parts are missing or inserted, so maybe those could be reported separately?
find_near_matches('I love you', 'I luve yu XXXXXXX', max_l_dist=5)
-> fuzzy_match = [3], fuzzy_missing = [('o', 8)]
What do you think about it?
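For anyone who needs this today, here is a minimal sketch of how the positions could be recovered outside of fuzzysearch, by aligning the search string with the `matched` text after the fact. It is plain Python with no dependencies. The function name `edit_ops` and the tuple layout are assumptions made for illustration and are not part of fuzzysearch's API. It runs a standard Levenshtein dynamic-programming table with a backtrace, which yields one minimal alignment; when several minimal alignments exist, the positions it reports are only one valid choice.

```python
def edit_ops(pattern, matched):
    """Hypothetical helper: one minimal list of edits turning `pattern` into `matched`.

    Each entry is (op, pattern_index, matched_index, pattern_char, matched_char),
    where op is 'substitute', 'delete' (char missing from `matched`)
    or 'insert' (extra char in `matched`).
    """
    n, m = len(pattern), len(matched)
    # dist[i][j] = Levenshtein distance between pattern[:i] and matched[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = int(pattern[i - 1] != matched[j - 1])
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # keep or substitute
                dist[i - 1][j] + 1,         # delete from pattern
                dist[i][j - 1] + 1,         # insert into pattern
            )

    # Walk back from the bottom-right corner, preferring the diagonal on ties.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        differs = i > 0 and j > 0 and pattern[i - 1] != matched[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(differs):
            if differs:
                ops.append(('substitute', i - 1, j - 1, pattern[i - 1], matched[j - 1]))
            i -= 1
            j -= 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(('delete', i - 1, j, pattern[i - 1], ''))
            i -= 1
        else:
            ops.append(('insert', i, j - 1, '', matched[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# The two examples from this issue, using the `matched` text of each result:
print(edit_ops('I love you', 'I luve yuu'))
# [('substitute', 3, 3, 'o', 'u'), ('substitute', 8, 8, 'o', 'u')]
print(edit_ops('I love you', 'I luve yu'))
# [('substitute', 3, 3, 'o', 'u'), ('delete', 8, 8, 'o', '')]
```

The indices are relative to the search string and to the `matched` text; to get positions in the full searched text, add the match's `start` to the second index.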