mmcdermott / MEDS_transforms

A simple set of MEDS polars-based ETL and transformation functions
MIT License

Switch codes from categorical to string column types #79

Closed mmcdermott closed 3 months ago

mmcdermott commented 3 months ago

Experiments from both @prenc and @EthanSteinberg suggest that storing codes as strings rather than categoricals has a minimal-to-positive impact on on-disk file size and on file read/write time, and that it interoperates better with other tools. As much as I don't like it aesthetically, we should switch to strings for now and switch back only if we observe performance issues.

mmcdermott commented 3 months ago

This issue is solved and merged into https://github.com/mmcdermott/MEDS_transforms/tree/32_MEDS_v03 ; it will be merged into dev and then main as other MEDS compatibility changes come in.