Medical-Event-Data-Standard / meds_etl

A collection of ETLs from common data formats to Medical Event Data Standard
Apache License 2.0

Draft: Add a duckdb backend to MEDS-ETL #15

Closed EthanSteinberg closed 3 months ago

EthanSteinberg commented 3 months ago

DuckDB (https://duckdb.org) is a fast in-process SQL database for analytical data processing.

It turns out that it works extremely well for our MEDS Flat -> MEDS ETL.

This PR implements a DuckDB backend for that ETL, with associated unit tests. It is much faster than the old ETL.