pckhoi / datamatch

Utilities for data matching
MIT License

Add deduplicate functionality #2

Closed pckhoi closed 3 years ago

pckhoi commented 3 years ago

Right now the matcher can only find matches between two frames. If it is given only one frame, it should deduplicate that frame instead. The usage syntax will change slightly.

When matching 2 frames:

matcher = ThresholdMatcher(index, fields, dfa, dfb)

When deduplicating:

matcher = ThresholdMatcher(index, fields, df)
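To illustrate the difference between the two modes, here is a rough sketch of how the candidate pairs might differ. It is not datamatch's implementation: `candidate_pairs` and its parameters are hypothetical names made up for this example, and it works on plain lists of record ids rather than frames. With two frames every record in the first is compared against every record in the second. With one frame each record is compared only against the records after it, so no record is paired with itself and no pair is produced twice.

```python
from itertools import combinations, product


def candidate_pairs(ids_a, ids_b=None):
    """Return the pairs of record ids that would need to be compared.

    Hypothetical helper for illustration only, not part of datamatch.
    """
    if ids_b is None:
        # Deduplication: unordered pairs drawn from a single frame.
        return list(combinations(ids_a, 2))
    # Matching: every record of the first frame against every record of the second.
    return list(product(ids_a, ids_b))


# Deduplicating one frame of three records yields three pairs.
dedup_pairs = candidate_pairs([1, 2, 3])

# Matching two frames of two records each yields four pairs.
match_pairs = candidate_pairs([1, 2], [10, 20])
```

Making the second argument optional mirrors the proposed syntax above, where leaving out the second frame switches to deduplication.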