Open jceresearch opened 1 year ago
Decide what kind of fuzzy matching we want. We probably don't want to go too far, since we are typically joining on keys such as emails, usernames, or a short title (e.g. a system name), not addresses, but we can add some options.
Add some metrics on the success of each matching method, and maybe an indicator column stating the origin of each match.
Come up with a "magic merge" feature that would:
A) apply hardcoded mappings first
B) attempt an exact ("hard") merge on col1, then col2, etc.
C) attempt a fuzzy match on each column with a strict tolerance
D) define which match wins when more than one step finds a candidate
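The A to D steps could be sketched roughly as below. This is only an illustration, not a proposed final API: `magic_merge`, `overrides`, `cutoff` and the `origin` labels are hypothetical names, rows are plain dicts rather than DataFrames, and the fuzzy step uses the standard library's `difflib` as a stand-in for whichever matcher we end up choosing. For step D it assumes the simplest rule, that the earliest step to find a match wins.

```python
from collections import Counter
from difflib import get_close_matches


def magic_merge(left, right, keys, overrides=None, cutoff=0.9):
    """Match each left row to at most one right row (hypothetical helper).

    left, right: lists of dicts with string key values.
    keys: column names tried in order (col1, col2, ...).
    overrides: hardcoded {left value -> right value} mapping on keys[0].
    Returns a list of (left_row, right_row_or_None, origin) tuples.
    """
    overrides = overrides or {}
    # Index the right side once per key; on duplicate values the first row wins.
    index = {k: {} for k in keys}
    for row in right:
        for k in keys:
            value = row.get(k)
            if value is not None:
                index[k].setdefault(value, row)

    results = []
    for row in left:
        match, origin = None, "unmatched"
        # A) hardcoded mapping on the first key
        first = row.get(keys[0])
        if first in overrides:
            match = index[keys[0]].get(overrides[first])
            if match is not None:
                origin = "hardcoded"
        # B) exact match on each key in order
        if match is None:
            for k in keys:
                value = row.get(k)
                if value is not None and value in index[k]:
                    match, origin = index[k][value], f"exact:{k}"
                    break
        # C) fuzzy match on each key in order, with a strict cutoff
        if match is None:
            for k in keys:
                value = row.get(k)
                if value is None:
                    continue
                close = get_close_matches(value, list(index[k]), n=1, cutoff=cutoff)
                if close:
                    match, origin = index[k][close[0]], f"fuzzy:{k}"
                    break
        # D) earliest step wins: later steps only run while match is None
        results.append((row, match, origin))
    return results


# Made-up sample data, for illustration only
left = [
    {"email": "ann@corp.com", "user": "ann"},
    {"email": "bob@old.com", "user": "bob"},
    {"email": "carol@corp.con", "user": "carol_s"},
    {"email": "dave@legacy.com", "user": "dave_old"},
    {"email": "eve@nowhere.org", "user": "eve"},
]
right = [
    {"email": "ann@corp.com", "user": "ann", "dept": "IT"},
    {"email": "bob@corp.com", "user": "bob", "dept": "HR"},
    {"email": "carol@corp.com", "user": "carol", "dept": "Ops"},
    {"email": "dave@corp.com", "user": "dave", "dept": "Fin"},
]
overrides = {"dave@legacy.com": "dave@corp.com"}

results = magic_merge(left, right, keys=["email", "user"], overrides=overrides)
origins = [origin for _, _, origin in results]
metrics = Counter(origins)  # success count per matching method
print(metrics)
```

The `origin` value plays the role of the indicator column, and the `Counter` gives the per-method success metrics. A different "who wins" rule (e.g. preferring the highest fuzzy score across all columns) would need the steps to collect candidates instead of stopping at the first hit.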
Option to apply lower() to the keys before matching
Option to clean up the keys beforehand by removing anything that is not a-z or 0-9
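A minimal sketch of those two pre-processing options, assuming they are exposed as boolean flags. `normalize_key`, `lower` and `strip_non_alnum` are hypothetical names used only for this example.

```python
import re


def normalize_key(value, lower=True, strip_non_alnum=True):
    """Return a cleaned copy of a join key (hypothetical helper)."""
    text = str(value).strip()
    if lower:
        text = text.lower()
    if strip_non_alnum:
        # Drop every character that is not an ASCII letter or digit
        text = re.sub(r"[^A-Za-z0-9]", "", text)
    return text


cleaned_email = normalize_key(" John.Doe@Example.com ")
cleaned_system = normalize_key("SYS-01", lower=False)
lowered_only = normalize_key("Sys 01", strip_non_alnum=False)
```

Note that stripping punctuation can collapse keys that were distinct (for emails it removes the `@` and the dots), so it probably makes sense to keep it opt-in and to match on the cleaned copy while leaving the original column untouched.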