moj-analytical-services / splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
https://moj-analytical-services.github.io/splink/
MIT License
1.4k stars, 151 forks

Allow `linker.train_m_from_deterministic_rule()` #402

Open RobinL opened 2 years ago

RobinL commented 2 years ago

Another way of training m is to provide Splink with a deterministic rule that is considered valid (i.e. every pair of records it selects is assumed to be a true match), and to use those pairs to train the m values for the remaining columns.
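To make the idea concrete, here is a minimal sketch in plain Python. It is not Splink's API: `estimate_m_from_rule`, the `nino` column and the sample records are all hypothetical, and it assumes each column uses a simple exact-match comparison. The pairs selected by the rule stand in for true matches, and m for a column is the share of those pairs whose values agree.

```python
from itertools import combinations


def estimate_m_from_rule(records, rule, columns):
    """Estimate m for an exact-match comparison on each column.

    m = P(column values agree | the pair is a true match). The pairs
    selected by `rule` are assumed to all be true matches.
    Hypothetical helper for illustration, not part of Splink.
    """
    # Keep only the record pairs that satisfy the deterministic rule.
    matched_pairs = [(a, b) for a, b in combinations(records, 2) if rule(a, b)]
    if not matched_pairs:
        raise ValueError("the rule selected no pairs")

    m = {}
    for col in columns:
        # Count the assumed matches whose values agree on this column.
        agree = sum(1 for a, b in matched_pairs if a[col] == b[col])
        m[col] = agree / len(matched_pairs)
    return m


# Made-up records; "nino" plays the role of a trusted unique identifier.
records = [
    {"nino": "A1", "first_name": "john", "city": "leeds"},
    {"nino": "A1", "first_name": "john", "city": "york"},
    {"nino": "B2", "first_name": "mary", "city": "bath"},
    {"nino": "B2", "first_name": "marie", "city": "bath"},
    {"nino": "C3", "first_name": "ali", "city": "hull"},
]


def same_nino(a, b):
    # The deterministic rule: equal identifiers imply a true match.
    return a["nino"] == b["nino"]


m_values = estimate_m_from_rule(records, same_nino, ["first_name", "city"])
print(m_values)
```

The columns used in the rule itself are left out of `columns`, since they agree by construction and the estimate would tell us nothing about them.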

RossKen commented 1 year ago

Extend the method from https://moj-analytical-services.github.io/splink/linker.html#splink.linker.Linker.estimate_m_from_label_column to get the desired functionality.