dssg / pgdedupe

A simple command line interface to the datamade/dedupe library.
https://pgdedupe.readthedocs.io

Use user-set clustering thresholds #65

Open ecsalomon opened 7 years ago

ecsalomon commented 7 years ago

When assigning final IDs, use a user-provided clustering threshold. Better yet, allow the user to pass multiple thresholds, and create either multiple unique_map tables (one per threshold) or a single long-form unique_map that also has a threshold column.
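
A rough sketch of what the long-form variant could look like, in plain Python. This is not pgdedupe's or dedupe's actual clustering code: it assumes scored record pairs are already available as (id_a, id_b, score) tuples, and it swaps in simple connected-components grouping as a stand-in for the real clustering step. The function names `cluster_at_threshold` and `long_form_unique_map` and the row layout (threshold, record_id, cluster_id) are hypothetical, chosen only for illustration.

```python
def cluster_at_threshold(record_ids, scored_pairs, threshold):
    """Group records connected by pairs scoring >= threshold.

    Stand-in for the real clustering step: plain connected components
    via union-find, not dedupe's own clustering.
    """
    parent = {rid: rid for rid in record_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        if score >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # The smaller root wins, so each cluster id is the
                # smallest record id in that cluster.
                parent[max(ra, rb)] = min(ra, rb)

    return {rid: find(rid) for rid in record_ids}


def long_form_unique_map(record_ids, scored_pairs, thresholds):
    """Build (threshold, record_id, cluster_id) rows, one set per threshold."""
    rows = []
    for t in sorted(thresholds):
        mapping = cluster_at_threshold(record_ids, scored_pairs, t)
        for rid in sorted(mapping):
            rows.append((t, rid, mapping[rid]))
    return rows


if __name__ == "__main__":
    # Hypothetical example data: four records and three scored pairs.
    ids = [1, 2, 3, 4]
    pairs = [(1, 2, 0.9), (2, 3, 0.6), (3, 4, 0.3)]
    for row in long_form_unique_map(ids, pairs, [0.5, 0.8]):
        print(row)
```

Rows shaped like this could be loaded into a single table keyed on (threshold, record_id), which would avoid creating one unique_map table per threshold and let the threshold be picked at query time.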