This repository demonstrates how to use SQLMesh to build an ETL pipeline that transforms data into the OMOP Common Data Model (CDM). The pipeline extracts data from different sources, transforms it, and loads it into the OMOP CDM structure, which is widely used in the healthcare industry.
The project provides an end-to-end ETL pipeline that uses SQLMesh to manage SQL transformations and to version SQL models. SQLMesh tracks changes to SQL models and deploys them to development and production environments.
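The project uses a .env file for credential security.
Before running this project, ensure the tools used in the steps below are installed: Git, Poetry, and Docker with Docker Compose.
Clone the repository and change into the project directory: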
git clone https://github.com/Chinapat0843/demo-etl-sqlmesh-omop.git
cd demo-etl-sqlmesh-omop
Install the required Python dependencies using Poetry:
poetry install
Create a .env file to protect your credentials. Add the following variables:
POSTGRES_USER=sqlmesh_user
POSTGRES_PASSWORD=sqlmesh_password
POSTGRES_DB=sqlmesh_db
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
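As an illustration of how these variables might be consumed, the following minimal sketch parses .env-style text with the standard library and assembles a PostgreSQL connection URL. The helper names `parse_env` and `postgres_url` are hypothetical and are not part of this repository or of SQLMesh; the project itself may read the file differently (for example through Docker Compose).

```python
from urllib.parse import quote_plus


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def postgres_url(env: dict[str, str]) -> str:
    """Build a PostgreSQL URL from the POSTGRES_* variables (hypothetical helper)."""
    user = quote_plus(env["POSTGRES_USER"])
    password = quote_plus(env["POSTGRES_PASSWORD"])  # escape special characters
    host = env["POSTGRES_HOST"]
    port = int(env["POSTGRES_PORT"])  # fails early if the port is not numeric
    return f"postgresql://{user}:{password}@{host}:{port}/{env['POSTGRES_DB']}"


# The same variables as listed above, inlined so the sketch is self-contained.
EXAMPLE_ENV = """\
POSTGRES_USER=sqlmesh_user
POSTGRES_PASSWORD=sqlmesh_password
POSTGRES_DB=sqlmesh_db
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
"""
```

Calling `postgres_url(parse_env(EXAMPLE_ENV))` yields `postgresql://sqlmesh_user:sqlmesh_password@postgres:5432/sqlmesh_db`. Note that the host name `postgres` only resolves from inside the Docker network.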
Build and start the Docker containers:
docker-compose up --build
This will set up the PostgreSQL database and launch the SQLMesh application inside a Docker container.
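Because the database container can take a few seconds to start, a readiness check can be useful before running anything against it. The sketch below is one possible approach, not part of this repository: `wait_for_port` is a hypothetical helper that polls a TCP port until it accepts connections. It assumes the PostgreSQL port is published to the host, which depends on the docker-compose configuration, and an open port only shows that something is listening, not that the database is ready for queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while the port is still closed.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 5432)` could be called from the host before connecting, assuming port 5432 is mapped there.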
The project uses a config.yaml file to configure SQLMesh for different environments (development and production).
config.yaml:
gateways:
  local:
    connection:
      type: postgres
      host: postgres
      port: 5432
      database: dev_db
      user: dev_user
      password: dev_password
default_gateway: local
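The connection values in config.yaml are hard-coded and differ from the ones in the .env example (dev_user versus sqlmesh_user, for instance). One way to keep the two in sync is to derive the gateway settings from the environment. The sketch below is plain Python that mirrors the structure of the YAML above; `gateway_config_from_env` is a hypothetical helper and does not call the SQLMesh API.

```python
import os
from typing import Mapping


def gateway_config_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Mirror the config.yaml structure using POSTGRES_* variables."""
    return {
        "gateways": {
            "local": {
                "connection": {
                    "type": "postgres",
                    # Defaults match the values shown in config.yaml.
                    "host": env.get("POSTGRES_HOST", "postgres"),
                    "port": int(env.get("POSTGRES_PORT", "5432")),
                    # Credentials have no default: a missing variable raises KeyError.
                    "database": env["POSTGRES_DB"],
                    "user": env["POSTGRES_USER"],
                    "password": env["POSTGRES_PASSWORD"],
                }
            }
        },
        "default_gateway": "local",
    }
```

Requiring the credential variables rather than defaulting them means a misconfigured environment fails immediately instead of connecting with the wrong user.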
Once the containers are running, the SQLMesh application should be reachable at http://localhost:8000