maxharlow / csvmatch

🔎 Finds fuzzy matches between CSV files

Ignore_letters option #34

Closed hjohns12 closed 7 months ago

hjohns12 commented 4 years ago

When matching strings that combine useful numbers with unhelpful words (such as precinct names that mix codes with messy, inconsistent names), an "ignore_letters" option would be nice. It would strip everything except the digits, so that the two columns of interest are matched on their numbers alone.

I implemented something like this in my code with:

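import re

NON_DIGITS = re.compile(r'\D+')  # \D already covers letters, punctuation, and underscores
def ignore_alpha(row):
    # Strip every non-digit character from each value in the row
    return [NON_DIGITS.sub('', value) for value in row]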
maxharlow commented 7 months ago

This can be achieved with the Regex ignore function.
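
For anyone landing here, a minimal sketch of the digits-only idea discussed above, written with plain Python `re` rather than any csvmatch option. The helper name `digits_only` and the precinct strings are made up for illustration, and the exact syntax of csvmatch's own regex ignore option is not shown here.

```python
import re

# Hypothetical helper (not part of csvmatch): drop every non-digit character.
NON_DIGITS = re.compile(r'\D+')

def digits_only(value):
    return NON_DIGITS.sub('', value)

# Made-up precinct names, purely for illustration.
left = ['Precinct 012 - North Hall', 'PCT 207 (Library)']
right = ['012 north hall pct', 'Library, precinct 207']

# Pair values from the two lists whose remaining digits are identical.
matches = [
    (a, b)
    for a in left
    for b in right
    if digits_only(a) == digits_only(b)
]

for a, b in matches:
    print(f'{a!r} <-> {b!r}')
```

Note that all digits in a value are concatenated, so a string containing two separate numbers (say a ward and a precinct) would collapse into one longer digit string; a narrower pattern may be needed in that case.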