maxharlow / csvmatch

🔎 Finds fuzzy matches between CSV files

feature request: add wrong whitespaces option #29

Closed aborruso closed 5 years ago

aborruso commented 5 years ago

Hi, using these input files:

Name,Age
Andy,32
Mary Jane,43

Name,City
Andy,Rome
Mary  Jane,New York

"Mary Jane" does not match "Mary  Jane" because the value in the second file contains two spaces. I can probably use the -l option, but I do not know how. If it's not possible with -l, it would be great to have an option that ignores stray whitespace: strip leading and trailing whitespace, and replace runs of multiple whitespace characters with a single space.

Thank you
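
For illustration, the normalization requested here (trim the ends, collapse internal runs) can be sketched in a few lines of Python. This is only a sketch of the idea, not how csvmatch works internally; the function name `normalize_whitespace` is hypothetical.

```python
def normalize_whitespace(value: str) -> str:
    # str.split() with no arguments splits on any run of whitespace and
    # drops leading/trailing whitespace, so re-joining with one space
    # both trims the value and collapses internal runs.
    return " ".join(value.split())


# The two "Name" values from the example files above.
first = normalize_whitespace("Mary Jane")
second = normalize_whitespace("Mary  Jane")
print(first == second)  # True
```

After this normalization the two names compare equal, which is the behaviour the requested option would give.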

aborruso commented 5 years ago

Hi, it's possible with -l. Great!!

I have built this rule file

 +
^ 
 $

and running

csvmatch -i -a -n -l rule.txt input_01.csv input_02.csv --fields1 "Name" --fields2 "Name"

it works!!

Great
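
To see why this rule file makes the two names match, here is a rough Python sketch. It assumes that each line of the rule file is treated as a regular expression and that every match is removed from the value before comparison; that is an assumption about the behaviour of -l, not something confirmed in this thread. The names `RULES` and `apply_rules` are hypothetical.

```python
import re

# The three patterns from the rule file above; the spaces are significant.
RULES = [" +", "^ ", " $"]


def apply_rules(value: str, rules=RULES) -> str:
    # Assumption: every match of every rule is deleted from the value.
    for pattern in rules:
        value = re.sub(pattern, "", value)
    return value


print(apply_rules("Mary Jane"))   # MaryJane
print(apply_rules("Mary  Jane"))  # MaryJane
```

Under that assumption, the first rule alone already removes every space, so both values reduce to the same string and the other two rules never find anything left to match.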

aborruso commented 5 years ago

@maxharlow I'm reopening this, because I think it could be a good feature to add as a standard option.

maxharlow commented 5 years ago

Glad you got it worked out! I agree this could be a good standard option in the future.

maxharlow commented 5 years ago

This is fixed in v1.19. There is no option to ignore whitespace specifically, but ignoring non-alphanumeric characters now also ignores whitespace.

aborruso commented 5 years ago

Wow, I'm very proud of it :)