sidataplus / demo-etl-sqlmesh-omop

The OMOP implementation on SQLMesh for the demo at the OHDSI Symposium 2024.
1 stars 3 forks source link

Demo ETL with SQLMesh and OMOP

This repository demonstrates the use of SQLMesh for creating an ETL pipeline, transforming data into the OMOP Common Data Model. The ETL process involves extracting, transforming, and loading (ETL) data from different sources into the OMOP CDM structure, which is widely used in the healthcare industry.

Table of Contents

Overview

The project provides an end-to-end ETL pipeline using SQLMesh to manage SQL transformations and handle versioning for SQL models. SQLMesh enables tracking and deploying SQL changes in development and production environments.

Features

Prerequisites

Before running this project, ensure you have the following tools installed:

Installation

Step 1: Clone the Repository

git clone https://github.com/Chinapat0843/demo-etl-sqlmesh-omop.git
cd demo-etl-sqlmesh-omop

Step 2: Install Dependencies

Install the required Python dependencies using Poetry:

poetry install

Step 3: Set Up Environment Variables

Create a .env file to protect your credentials. Add the following variables:

POSTGRES_USER=sqlmesh_user
POSTGRES_PASSWORD=sqlmesh_password
POSTGRES_DB=sqlmesh_db
POSTGRES_HOST=postgres
POSTGRES_PORT=5432

Step 4: Docker Setup

Build and start the Docker containers:

docker-compose up --build

This will set up the PostgreSQL database and launch the SQLMesh application inside a Docker container.

Configuration

The project uses a config.yaml file to configure SQLMesh for different environments (development and production).

Sample config.yaml

gateways:
  local:
    connection:
      type: postgres
      host: postgres
      port: 5432
      database: dev_db
      user: dev_user
      password: dev_password
default_gateway: local

SQLmesh UI

http://localhost:8000