The RecordLinker is a service that links records from two datasets based on a set of common attributes. The service is designed to be used in a variety of public health contexts, such as linking patient records from different sources or linking records from different public health surveillance systems.
Create a new src/record_linker/models.py module that defines the following schema using the SQLAlchemy ORM.
Acceptance Criteria
[x] README has clear instructions on how to initialize a new database
[x] Python classes representing the tables outlined below
[x] Alembic migrations defined in migrations/ for initializing the database
Details / Tasks
Schema
erDiagram
Person {
int id
uuid internal_id
}
ExternalPerson {
int id
int person_id
string source
string external_id
}
Patient {
int id
int person_id
string data
}
BlockingKey {
int id
int patient_id
string key
string value
}
Person ||--o{ ExternalPerson: "has"
Person ||--o{ Patient : "has"
Patient ||--o{ BlockingKey : "has"
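As a starting point, the diagram above could translate into something like the following for src/record_linker/models.py. This is a minimal sketch, not a final design: the table names, column lengths, nullability, and relationship attribute names are all assumptions, and internal_id is stored as a 36-character string for portability (on SQLAlchemy 2.0 the built-in Uuid type could be used instead). The SQLite check at the bottom is only there to show the models load and relate correctly.

```python
import uuid

from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Person(Base):
    __tablename__ = "person"  # table names are assumptions

    id = Column(Integer, primary_key=True)
    # Assumption: UUID kept as text so the same model works on every backend.
    internal_id = Column(
        String(36), nullable=False, unique=True, default=lambda: str(uuid.uuid4())
    )

    external_persons = relationship("ExternalPerson", back_populates="person")
    patients = relationship("Patient", back_populates="person")


class ExternalPerson(Base):
    __tablename__ = "external_person"

    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey("person.id"), nullable=False)
    source = Column(String(255))
    external_id = Column(String(255))

    person = relationship("Person", back_populates="external_persons")


class Patient(Base):
    __tablename__ = "patient"

    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey("person.id"), nullable=False)
    data = Column(Text)  # the diagram only says "string"; Text is an assumption

    person = relationship("Person", back_populates="patients")
    blocking_keys = relationship("BlockingKey", back_populates="patient")


class BlockingKey(Base):
    __tablename__ = "blocking_key"

    id = Column(Integer, primary_key=True)
    patient_id = Column(Integer, ForeignKey("patient.id"), nullable=False)
    key = Column(String(50), nullable=False)
    value = Column(String(255), nullable=False)

    patient = relationship("Patient", back_populates="blocking_keys")


# Quick check against an in-memory SQLite database (illustration only).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    person = Person()
    patient = Patient(person=person, data="{}")
    patient.blocking_keys.append(BlockingKey(key="birthdate", value="1980-01-01"))
    session.add(person)  # patient and blocking key are saved via cascade
    session.commit()
    table_names = sorted(Base.metadata.tables)
    stored_keys = [(bk.key, bk.value) for bk in person.patients[0].blocking_keys]
```

Keeping every model on one Base means Base.metadata describes the full schema, which is what a migration tool needs in order to compare the models against a live database.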
We'll also need a new way to manage migrations for the new schema, as pyway doesn't support Microsoft SQL Server. For this, we should use Alembic, since it integrates closely with SQLAlchemy and works with the databases SQLAlchemy supports.
Create a new Alembic migration environment for the project and generate a migration for the new ORM models. We should be able to keep using the /migrations directory, as pyway and Alembic will read its contents differently. The migrations need to use the existing MPI_* environment variables defined in .env to determine which database to run against.
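One way the MPI_* lookup could work is a small helper that builds the SQLAlchemy URL from the environment. The sketch below is hypothetical: the helper name database_url and the individual variable names (MPI_DB_TYPE, MPI_USER, MPI_PASSWORD, MPI_HOST, MPI_PORT, MPI_DBNAME) are assumptions and should be replaced with whatever .env actually defines.

```python
import os
from typing import Mapping

from sqlalchemy.engine import URL


def database_url(env: Mapping[str, str] = os.environ) -> URL:
    """Build a SQLAlchemy URL from MPI_* settings.

    The variable names used here are assumptions for illustration;
    substitute the names actually defined in .env.
    """
    port = env.get("MPI_PORT")
    return URL.create(
        drivername=env.get("MPI_DB_TYPE", "postgresql"),
        username=env.get("MPI_USER"),
        password=env.get("MPI_PASSWORD"),
        host=env.get("MPI_HOST"),
        port=int(port) if port else None,
        database=env.get("MPI_DBNAME"),
    )


# Example with made-up values; passing a dict avoids touching the real environment.
example = database_url(
    {
        "MPI_DB_TYPE": "mssql+pyodbc",
        "MPI_USER": "linker",
        "MPI_PASSWORD": "secret",
        "MPI_HOST": "localhost",
        "MPI_PORT": "1433",
        "MPI_DBNAME": "mpi",
    }
)
```

In the env.py that Alembic generates, the rendered URL could then be passed to config.set_main_option("sqlalchemy.url", ...), with target_metadata set to the models' Base.metadata so that alembic revision --autogenerate can diff the models against the database.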