moj-analytical-services / splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
https://moj-analytical-services.github.io/splink/
MIT License

`compare_two_records` needs to check whether tf tables exist #802

Open RobinL opened 2 years ago

RobinL commented 2 years ago

See https://github.com/moj-analytical-services/splink/discussions/769

Hello! I've been trying to use the compare-two-records feature, but I keep getting the following error. Does anyone have any ideas about what I'm doing wrong or what to fix?

[Two screenshots of the error, taken 2022-09-13; images not preserved]

Originally posted by @mmagoffin-sd in https://github.com/moj-analytical-services/splink/discussions/769

RossKen commented 1 year ago

@ThomasHepworth has code from a similar issue in a PR (will be linked below)

ThomasHepworth commented 1 year ago

I believe this is partially resolved in here.

This should be relatively simple to resolve if we can reuse the code implemented above.

RobinL commented 1 year ago

It's harder than it looks: if the user just wants to compare two records, there's no guarantee they've provided Splink with a full input dataset. They might have loaded only the two records. Without a full input dataset, we can't compute the tf tables, so the user would have to provide them instead.
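To illustrate why two records are not enough, here is a minimal sketch in plain Python of what a term frequency table amounts to: the relative frequency of each value in a column, computed over a full dataset. The helper `compute_tf_table`, the sample data, and the `tf_<column>` output naming are all assumptions made for illustration, not Splink's actual implementation or API.

```python
from collections import Counter


def compute_tf_table(records, column):
    """Hypothetical helper: relative frequency of each non-null value in `column`.

    Returns a list of rows shaped like {column: value, "tf_<column>": frequency}.
    The "tf_<column>" naming is an assumption made for this sketch.
    """
    values = [r[column] for r in records if r.get(column) is not None]
    if not values:
        raise ValueError(f"no non-null values found in column {column!r}")
    total = len(values)
    counts = Counter(values)
    # Sort by value so the output order is deterministic.
    return [
        {column: value, f"tf_{column}": count / total}
        for value, count in sorted(counts.items())
    ]


# A (tiny) stand-in for a full input dataset.
full_dataset = [
    {"first_name": "John"},
    {"first_name": "John"},
    {"first_name": "Mary"},
    {"first_name": None},  # nulls are ignored
]
tf_full = compute_tf_table(full_dataset, "first_name")

# The same computation on just the two records being compared.
two_records = [{"first_name": "John"}, {"first_name": "Mary"}]
tf_pair = compute_tf_table(two_records, "first_name")

for row in tf_full:
    print("full dataset:", row)
for row in tf_pair:
    print("two records: ", row)
```

With only the two records loaded, every distinct value gets the same frequency, which says nothing about how common "John" is in the population. That is why the frequencies have to come from a full dataset or be supplied by the user.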

James-Osmond commented 1 year ago

Getting the same error when trying to compare a record pair without tf tables available.

RobinL commented 7 months ago

Fixed in Splink4, see https://github.com/moj-analytical-services/splink/pull/2111