Open shamahutoto opened 3 years ago
Yes, block on it.
@shamahutoto Since there are various types of blocking, I should have been more precise:
Exact blocking on a variable (column), for example gender, makes sure that the variable is an exact match.
It is useful to think of record linkage as a process. You do blocking before the actual record linkage. Typically you use the blockData() function for the blocking. Please provide an example if you still need help. The main GitHub page for fastLink https://github.com/kosukeimai/fastLink gives an example.
Disclaimer: I am a regular user, not a developer.
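For readers following along, the two-step process described above (block first, then link within a block) might look roughly like the sketch below. It follows the pattern shown in the fastLink README. The data frames `dfA` and `dfB` and the column names `gender`, `firstname`, `lastname`, and `birthyear` are assumptions for illustration; substitute your own.

```r
library(fastLink)

# Assumption: dfA and dfB are your two data frames, and both contain the
# (hypothetical) columns gender, firstname, lastname and birthyear.

# Step 1: exact blocking on gender. Records are only ever compared with
# records in the same gender block.
blocks <- blockData(dfA, dfB, varnames = "gender")

# Each block stores the row indices of its members in dfA and dfB.
dfA_block1 <- dfA[blocks$block.1$dfA.inds, ]
dfB_block1 <- dfB[blocks$block.1$dfB.inds, ]

# Step 2: run the record linkage inside the first block.
fl_block1 <- fastLink(
  dfA_block1, dfB_block1,
  varnames = c("firstname", "lastname", "birthyear"),
  stringdist.match = c("firstname", "lastname")
)
```

The same subsetting and `fastLink()` call would be repeated for `block.2` and any further blocks.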
Hi @shamahutoto,
As @aalexandersson mentioned, you can block on a certain variable. Note that for all the variables that you pass to fastLink that are not listed in stringdist.match or in numeric.match, exact matching is used to compare values.
Hope this helps! If anything, let us know.
All my best,
Ted
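As an illustration of the note above about stringdist.match and numeric.match, here is a minimal sketch using the `samplematch` data that ships with fastLink. The choice of which variables go into which argument is only an example, not a recommendation.

```r
library(fastLink)

# Loads the example data frames dfA and dfB bundled with the package.
data(samplematch)

fl_out <- fastLink(
  dfA = dfA, dfB = dfB,
  varnames = c("firstname", "lastname", "housenum", "birthyear"),
  # Compared with a string-distance measure:
  stringdist.match = c("firstname", "lastname"),
  # Compared as a numeric value:
  numeric.match = "housenum"
  # birthyear appears in neither list, so its values are compared
  # by exact matching.
)
```

Here `birthyear` is the variable left to exact comparison, simply because it is listed in `varnames` but in neither of the other two arguments.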
Two years later, but just to be sure @tedenamorado, this means that if I don't add individuals' birth dates in either stringdist.match or numeric.match, the algorithm will only try matching individuals (from the two dataframes) that have the same date of birth?
In that sense, it is the same thing as doing an exact block on the date of birth and then running the algorithm on the result? Or did I miss something?
Hi, is there a way to make sure that one column is an exact match?