CDCgov / RecordLinker

The RecordLinker is a service that links records from two datasets based on a set of common attributes. The service is designed to be used in a variety of public health contexts, such as linking patient records from different sources or linking records from different public health surveillance systems.
https://cdcgov.github.io/RecordLinker/
Apache License 2.0

Implement MPI client functions for new schema #10

Closed: ericbuckley closed 1 month ago

ericbuckley commented 2 months ago

Summary

Add the get_block_data and insert_matched_patient functions to a new module, data_access.py, for the new schema.

Acceptance Criteria

Details / Tasks

The reimplementation does not need to keep the existing function signatures, which have readability issues caused by passing nested lists. Going forward, use these signatures:

def get_block_data(
    session: orm.Session, 
    data: typing.Dict,
    algo_config: typing.List[typing.Dict],
) -> typing.List[Patient]:
    pass

def insert_matched_patient(
    session: orm.Session,
    data: typing.Dict,
    algo_config: typing.List[typing.Dict],
    person_id: typing.Optional[int],
    external_person_id: typing.Optional[str],
    commit: bool = True,
) -> Patient:
    pass

Notes / Comments

When inserting a patient, factor out the functions needed to create the patient row and to create new blocking keys. It's likely that with this design we'll need a way to generate new blocking keys when an algorithm changes, so keeping that functionality separate leaves room for future expansion.
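To illustrate the separation described above, here is a minimal sketch, not the actual implementation. It uses plain dataclasses and a list in place of the ORM models and the orm.Session. The names KEY_EXTRACTORS, generate_blocking_keys, create_patient, insert_matched_patient_sketch and regenerate_blocking_keys, the blocking key names, and the shape of the key configuration are all hypothetical and are not taken from the new schema.

```python
import dataclasses
import typing

BlockingKey = typing.Tuple[str, str]  # (key name, key value); hypothetical shape


@dataclasses.dataclass
class Patient:
    # Hypothetical stand-in for the ORM model of the same name.
    data: typing.Dict
    person_id: typing.Optional[int] = None
    external_person_id: typing.Optional[str] = None
    blocking_keys: typing.List[BlockingKey] = dataclasses.field(default_factory=list)


# Hypothetical extractors: each maps a key name to a function of the record.
KEY_EXTRACTORS: typing.Dict[
    str, typing.Callable[[typing.Dict], typing.Optional[str]]
] = {
    "birthdate": lambda d: d.get("birthdate"),
    "last_name_4": lambda d: (d.get("last_name") or "")[:4].lower() or None,
    "zip": lambda d: d.get("zip"),
}


def generate_blocking_keys(
    data: typing.Dict, key_names: typing.List[str]
) -> typing.List[BlockingKey]:
    """Build blocking keys for one record; missing values produce no key."""
    keys = []
    for name in key_names:
        value = KEY_EXTRACTORS[name](data)
        if value:
            keys.append((name, value))
    return keys


def create_patient(
    data: typing.Dict,
    person_id: typing.Optional[int] = None,
    external_person_id: typing.Optional[str] = None,
) -> Patient:
    """Create the patient row only; blocking keys are attached separately."""
    return Patient(
        data=data, person_id=person_id, external_person_id=external_person_id
    )


def insert_matched_patient_sketch(
    store: typing.List[Patient],  # stands in for the database session
    data: typing.Dict,
    key_names: typing.List[str],
    person_id: typing.Optional[int] = None,
    external_person_id: typing.Optional[str] = None,
) -> Patient:
    """Compose the two steps: create the row, then attach its blocking keys."""
    patient = create_patient(data, person_id, external_person_id)
    patient.blocking_keys = generate_blocking_keys(data, key_names)
    store.append(patient)
    return patient


def regenerate_blocking_keys(
    patients: typing.List[Patient], key_names: typing.List[str]
) -> None:
    """Rebuild keys for existing patients, e.g. after an algorithm change."""
    for patient in patients:
        patient.blocking_keys = generate_blocking_keys(patient.data, key_names)
```

Because key generation lives in its own function, a later change to the algorithm can reuse it through something like regenerate_blocking_keys without touching the code that creates patient rows.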