The driver table for the hospital master comes from Medicare, "Hospital General Information" (DF1). The crosswalk table comes from Dartmouth, "hosp_hsa_hrr_2013" (DF2). The task is to link DF1 and DF2 by fuzzy matching on name, address, etc.
Question: both tables have provider_ID, a 5-digit unique ID. A quick exact-match analysis on that ID shows:
DF1 has 4,662 IDs, 160 of which have no match in DF2 (96.6% match rate)
DF2 has 4,805 IDs, 303 of which have no match in DF1 (93.7% match rate)
Given that, should the fuzzy-match algorithm be applied only to the unmatched records?
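To make the question concrete, here is a rough sketch of the two-pass approach I have in mind: join on provider_ID first, then fuzzy-match only the leftovers on name and address. Everything in it is an assumption for illustration: the dict layout, the `normalize` and `link_tables` helpers, the 0.8 threshold, and the use of Python's standard `difflib` as the similarity measure. The toy records are made up and are not taken from either file.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase and collapse whitespace so trivial differences do not lower the score.
    return " ".join(text.lower().split())


def link_tables(df1, df2, threshold=0.8):
    """Link two tables given as dicts: provider_ID -> (name, address).

    The dict layout and the threshold are hypothetical choices for this sketch.
    Returns (exact, fuzzy): exact is a sorted list of shared IDs, fuzzy is a
    list of (df1_id, df2_id, score) tuples for the leftover records.
    """
    # Pass 1: exact match on provider_ID.
    exact = sorted(set(df1) & set(df2))
    left_only = {k: v for k, v in df1.items() if k not in df2}
    right_only = {k: v for k, v in df2.items() if k not in df1}

    # Pass 2: fuzzy match only the unmatched records on name + address.
    fuzzy = []
    for id1, rec1 in left_only.items():
        key1 = normalize(" ".join(rec1))
        best_id, best_score = None, 0.0
        for id2, rec2 in right_only.items():
            key2 = normalize(" ".join(rec2))
            score = SequenceMatcher(None, key1, key2).ratio()
            if score > best_score:
                best_id, best_score = id2, score
        # Keep the best candidate only if it clears the threshold.
        if best_id is not None and best_score >= threshold:
            fuzzy.append((id1, best_id, round(best_score, 3)))
    return exact, fuzzy


# Made-up toy records, for illustration only.
toy_df1 = {
    "00001": ("Alpha General Hospital", "1 Main St"),
    "00002": ("Beta Medical Center", "22 Oak Ave"),
}
toy_df2 = {
    "00001": ("Alpha General Hosp", "1 Main Street"),
    "90002": ("Beta Medical Ctr", "22 Oak Avenue"),
    "90003": ("Gamma Clinic", "9 Pine Rd"),
}
exact_ids, fuzzy_pairs = link_tables(toy_df1, toy_df2)
print(exact_ids, fuzzy_pairs)
```

With only 160 and 303 leftover records, the nested loop is about 48,000 comparisons, so a brute-force pass like this would be cheap. The sketch does not enforce one-to-one matching, so two DF1 records could land on the same DF2 record.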
Thanks,