CDCgov / RecordLinker

The RecordLinker is a service that links records from two datasets based on a set of common attributes. The service is designed to be used in a variety of public health contexts, such as linking patient records from different sources or linking records from different public health surveillance systems.
https://cdcgov.github.io/RecordLinker/
Apache License 2.0

Models for new schema #8

Closed ericbuckley closed 1 month ago

ericbuckley commented 2 months ago

Summary

Create a new src/record_linker/models.py module that defines the following schema using the SQLAlchemy ORM.

Acceptance Criteria

Details / Tasks

Schema

erDiagram
    Person {
        int id
        uuid internal_id
    }

    ExternalPerson {
        int id
        int person_id 
        string source
        string external_id
    }

    Patient {
        int id
        int person_id
        string data
    }

    BlockingKey {
        int id
        int patient_id
        string key
        string value
    }

    Person ||--o{ ExternalPerson : "has"
    Person ||--o{ Patient : "has"
    Patient ||--o{ BlockingKey : "has"
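
As a rough illustration of how the diagram above might translate into the SQLAlchemy ORM, here is a minimal sketch. The table names, string lengths, nullability, relationship names, and the choice to store `internal_id` as a 36-character string are all assumptions made for the example; the issue does not specify them.

```python
import uuid

from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Person(Base):
    __tablename__ = "person"  # table names are assumptions, not from the issue

    id = Column(Integer, primary_key=True)
    # Assumption: UUID kept as a string so the column is portable across databases.
    internal_id = Column(
        String(36), nullable=False, unique=True, default=lambda: str(uuid.uuid4())
    )

    external_persons = relationship("ExternalPerson", back_populates="person")
    patients = relationship("Patient", back_populates="person")


class ExternalPerson(Base):
    __tablename__ = "external_person"

    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey("person.id"), nullable=False)
    source = Column(String(255), nullable=False)
    external_id = Column(String(255), nullable=False)

    person = relationship("Person", back_populates="external_persons")


class Patient(Base):
    __tablename__ = "patient"

    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey("person.id"), nullable=False)
    data = Column(Text, nullable=False)  # "string data" in the diagram

    person = relationship("Person", back_populates="patients")
    blocking_keys = relationship("BlockingKey", back_populates="patient")


class BlockingKey(Base):
    __tablename__ = "blocking_key"

    id = Column(Integer, primary_key=True)
    patient_id = Column(Integer, ForeignKey("patient.id"), nullable=False)
    key = Column(String(255), nullable=False)
    value = Column(String(255), nullable=False)

    patient = relationship("Patient", back_populates="blocking_keys")


# Usage: create the tables in an in-memory SQLite database and store one
# Person with a Patient and a BlockingKey attached.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    patient = Patient(
        data="{}", blocking_keys=[BlockingKey(key="last_name", value="smith")]
    )
    person = Person(patients=[patient])
    session.add(person)
    session.commit()
    stored_internal_id = person.internal_id
    stored_key_count = len(person.patients[0].blocking_keys)
```

The three `||--o{` relationships in the diagram map to the one-to-many `relationship()` pairs above, with the foreign key living on the "many" side in each case.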

We'll also need a new way to manage migrations for the new schema, as pyway doesn't support Microsoft SQL Server. For this, we should use Alembic, since it integrates closely with SQLAlchemy and works with every database SQLAlchemy supports.

Create a new Alembic migration environment for the project and generate a migration for the new ORM models. We should be able to keep using the /migrations directory, since pyway and Alembic read its contents differently. The migrations must use the existing MPI_* environment variables defined in .env to determine which database to run against.
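
One way the migration environment could pick up the database settings is a small helper that assembles a SQLAlchemy URL from the environment. The sketch below is only an illustration: the function name `database_url_from_env` and the specific variable names (`MPI_DB_TYPE`, `MPI_USER`, `MPI_PASSWORD`, `MPI_HOST`, `MPI_PORT`, `MPI_DBNAME`) are hypothetical, since the issue only says the variables share the `MPI_` prefix.

```python
import os
from urllib.parse import quote_plus


def database_url_from_env(env=None) -> str:
    """Build a SQLAlchemy-style database URL from MPI_* settings.

    The variable names read here are assumptions for illustration; the
    project's .env file may use different ones.
    """
    env = os.environ if env is None else env
    dialect = env.get("MPI_DB_TYPE", "postgresql")
    # Percent-encode credentials so characters like "@" or ":" stay unambiguous.
    user = quote_plus(env["MPI_USER"])
    password = quote_plus(env["MPI_PASSWORD"])
    host = env["MPI_HOST"]
    port = env["MPI_PORT"]
    dbname = env["MPI_DBNAME"]
    return f"{dialect}://{user}:{password}@{host}:{port}/{dbname}"


# Example with an explicit mapping instead of the real environment.
example_url = database_url_from_env(
    {
        "MPI_USER": "user",
        "MPI_PASSWORD": "p@ss",
        "MPI_HOST": "localhost",
        "MPI_PORT": "5432",
        "MPI_DBNAME": "mpi",
    }
)
```

In Alembic's `env.py`, a URL like this could be passed to `config.set_main_option("sqlalchemy.url", ...)`, with `target_metadata` pointed at the ORM's `Base.metadata` so that autogenerate can compare the models to the database. Because that option goes through config-file interpolation, any `%` in the URL would likely need to be doubled to `%%` first.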

Dependencies