mmcdermott / MEDS_transforms

A simple set of MEDS polars-based ETL and transformation functions
MIT License

Various transforms that rely on the `metadata/codes.parquet` file may require all codes present in the data to be in that dataframe. #116

Open mmcdermott opened 3 months ago

mmcdermott commented 3 months ago

Nothing currently guarantees this requirement, and operations that add new codes do not necessarily add those codes to the `metadata/codes.parquet` file.

More challengingly, operations that add new codes will largely be data transforms, not metadata transforms, and the system is currently designed so that each stage is only one or the other. Addressing this may require supporting stages and filepath structures that are simultaneously data and metadata stages.