seatgeek / fuzzywuzzy

Fuzzy String Matching in Python
http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
GNU General Public License v2.0

Disguised letter combinations #260

Open tjhutton87 opened 4 years ago

tjhutton87 commented 4 years ago

Would it be possible to include an optional parameter to check each letter of a searched string against a disguised letter dictionary including combinations such as "rn" ("r" and "n") being substituted for "m"?

For example, I want to search for the word "meal", but I want to include results that match "rneal".

Possible disguised letter combinations include but aren't limited to:

'lo' = b
'ol' = d
'rn' = m
'ri' = n
'nn' = m
'vv' = w

maxbachmann commented 4 years ago

Hm, my personal approach to this problem would be to preprocess the strings with something like the following:

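import re
from fuzzywuzzy import process, utils

def disguised_letter_fix(s):
    # Map each disguised letter combination to the letter it imitates.
    replacements = {
        'vv': 'w',
        'nn': 'm'
    }
    # Escape the keys so they are always matched literally.
    pattern = re.compile('|'.join(map(re.escape, replacements)))
    result = pattern.sub(lambda x: replacements[x.group()], s)
    # Run the default preprocessing on the fixed string, not on the original.
    return utils.default_process(result)

# a is the query string, b is the collection of choices to search.
process.extractOne(a, b, processor=disguised_letter_fix)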
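The same idea can be extended to the full list of combinations from the original request. The following is only a minimal sketch using the standard library: the names `DISGUISES` and `undisguise` are hypothetical and not part of fuzzywuzzy, and lowercasing the input is an assumption made here to keep the matching case-insensitive.

```python
import re

# Mapping copied from the list in the issue. DISGUISES and undisguise are
# hypothetical names used for illustration, not fuzzywuzzy API.
DISGUISES = {
    "lo": "b",
    "ol": "d",
    "rn": "m",
    "ri": "n",
    "nn": "m",
    "vv": "w",
}

# Try longer keys first so a longer combination is never shadowed by a
# shorter one if the mapping grows later.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(DISGUISES, key=len, reverse=True))
)

def undisguise(s):
    """Replace each disguised letter combination with the letter it imitates."""
    return _PATTERN.sub(lambda m: DISGUISES[m.group()], s.lower())

print(undisguise("rneal"))   # meal
print(undisguise("vvorld"))  # world
print(undisguise("cold"))    # cdd
```

The last call shows the main drawback: the substitution is blind, so a legitimate "ol" in "cold" is rewritten as well. This matters less when the function is used as a `processor`, since, as far as I know, the processor is applied to both the query and each choice, so both sides are distorted in the same way before scoring.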