mmcdermott / MEDS_transforms

A simple set of MEDS polars-based ETL and transformation functions
MIT License

`values/sum_sqd` and possibly `values/sum` may overflow. We should consider adapting the aggregation space to work in the `values/mean` and `values/variance` space instead. #111

Open mmcdermott opened 3 months ago

mmcdermott commented 3 months ago

This would require reworking `aggregate_code_metadata.py` to support and recognize dependencies between aggregations. For example, `values/mean` depends on `values/n_occurrences`, and `values/variance` depends on both `values/mean` and `values/n_occurrences`: when these are computed in a sharded manner, each shard's intermediate statistics must be retained so that the true aggregate values can be computed during the reduce.
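To make the dependency concrete, here is a minimal sketch of how per-shard means and variances might be combined during the reduce, using the standard pairwise merge of counts, means, and sums of squared deviations. It is plain Python rather than polars, and `ShardStats`, `summarize`, `merge`, and `reduce_shards` are hypothetical names for illustration, not part of `aggregate_code_metadata.py`. It also assumes population variance (dividing by `n`); a sample-variance version would divide by `n - 1` at the end instead.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class ShardStats:
    """Hypothetical per-shard summary (not a MEDS_transforms type)."""

    n: int  # plays the role of values/n_occurrences
    mean: float  # plays the role of values/mean
    variance: float  # plays the role of values/variance (population, assumed)


EMPTY = ShardStats(n=0, mean=0.0, variance=0.0)


def summarize(values: Iterable[float]) -> ShardStats:
    """Map step: compute the three statistics for a single shard."""
    vals = list(values)
    n = len(vals)
    if n == 0:
        return EMPTY
    mean = sum(vals) / n
    variance = sum((v - mean) ** 2 for v in vals) / n
    return ShardStats(n=n, mean=mean, variance=variance)


def merge(a: ShardStats, b: ShardStats) -> ShardStats:
    """Reduce step: combine two summaries without any sum of squares.

    The merged mean needs both counts, and the merged variance needs
    both counts and both means, which is the dependency described above.
    """
    n = a.n + b.n
    if n == 0:
        return EMPTY
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    # Sum of squared deviations about the merged mean.
    m2 = a.variance * a.n + b.variance * b.n + delta**2 * a.n * b.n / n
    return ShardStats(n=n, mean=mean, variance=m2 / n)


def reduce_shards(shards: Iterable[ShardStats]) -> ShardStats:
    """Fold any number of shard summaries into one aggregate."""
    return reduce(merge, shards, EMPTY)
```

Because every intermediate quantity stays on the scale of the data (a mean, a variance, and a difference of means), nothing grows like the running sum of squares that `values/sum_sqd` accumulates. The trade-off is that the aggregations can no longer be reduced independently of one another.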