Weird edge case: when calling linker.predict with a threshold_match_weight of 0 to remove negatively weighted records, the threshold isn't applied and all records are returned. There is a logical check on threshold_match_weight, but it should instead check that threshold_match_weight is not None, so that a value of zero is handled correctly.
Thanks for the report! If you are happy to, feel free to put in a PR with the fix for this.
Otherwise, someone from the team will look into it when there's some space to do so.
What happens?
Weird edge case, but when calling linker.predict with a threshold_match_weight of 0 to remove negative weighted records, the threshold isn't applied and it returns all records. There is a logical check on threshold_match_weight but it should check that threshold_match_weight is not None, to handle the case it may be set to zero.
https://github.com/moj-analytical-services/splink/blob/868fefba022483f4cb0d05822aa55d62ee9b92fa/splink/predict.py#L58
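The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not Splink's actual code: the two helper functions and the sample weights are hypothetical and exist only to show why a truthiness check skips a threshold of 0 while an explicit `is not None` check applies it.

```python
from typing import List, Optional


def filter_with_truthy_check(
    weights: List[float], threshold_match_weight: Optional[float] = None
) -> List[float]:
    # Hypothetical helper showing the problematic pattern:
    # 0 is falsy in Python, so a threshold of 0 is treated like "no threshold".
    if threshold_match_weight:
        return [w for w in weights if w >= threshold_match_weight]
    return list(weights)


def filter_with_none_check(
    weights: List[float], threshold_match_weight: Optional[float] = None
) -> List[float]:
    # Hypothetical helper showing the suggested pattern:
    # only skip filtering when no threshold was supplied at all.
    if threshold_match_weight is not None:
        return [w for w in weights if w >= threshold_match_weight]
    return list(weights)


# Made-up match weights, for illustration only.
weights = [-3.2, -0.5, 0.0, 1.7, 4.1]

unfiltered = filter_with_truthy_check(weights, threshold_match_weight=0)
filtered = filter_with_none_check(weights, threshold_match_weight=0)

print(unfiltered)  # negative weights are still present
print(filtered)    # negative weights have been removed
```

With the truthiness check, passing 0 behaves exactly like passing no threshold, which matches the symptom reported here. Both helpers behave identically for any non-zero threshold and for None.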
To Reproduce
From the rough-and-ready example, the last two lines are the ones of interest.
OS: Windows 10
Splink version: 3.9.8
Have you tried this on the latest master branch?
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?