Medical-Event-Data-Standard / meds_etl

A collection of ETLs from common data formats to Medical Event Data Standard
Apache License 2.0

Feature/duckdb with sharding #19

Closed: scottfleming closed this 2 months ago

scottfleming commented 2 months ago

Adds a DuckDB backend for convert_flat_to_meds. It runs faster than polars.

Unifies modifications from:

EthanSteinberg commented 2 months ago

@scottfleming Oh, you should fix the tests first. I think you need to install duckdb in the testing environment. See https://github.com/Medical-Event-Data-Standard/meds_etl/blob/main/.github/workflows/python-test.yml#L27

We probably also want a better error message if duckdb isn't installed.
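One way the friendlier error could look is a small guarded import. This is a minimal sketch, not the code in this PR: the helper name require_optional and the message wording are hypothetical, and it assumes the backend can import duckdb lazily at the point where it is first needed.

```python
import importlib


def require_optional(module_name: str, feature: str):
    """Import an optional dependency, or raise a readable ImportError.

    `require_optional` is a hypothetical helper for illustration; it is not
    part of meds_etl.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message, keeping the original error chained.
        raise ImportError(
            f"{feature} requires the '{module_name}' package, which is not installed. "
            f"Install it with `pip install {module_name}` and try again."
        ) from exc


# Intended use inside the backend (assumed call site, left commented out so
# the sketch also runs where duckdb is absent):
# duckdb = require_optional("duckdb", "The DuckDB backend")
```

Deferring the import to the call site would let users who only need the polars path keep working without duckdb installed, while anyone selecting the DuckDB backend sees the install hint instead of a bare ModuleNotFoundError.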