Closed bsweger closed 7 months ago
For lack of a better place to put it, there is now a hubverse-transforms directory on a branch of the infrastructure repo that has a Python function for doing the column/format model-output transformations: https://github.com/Infectious-Disease-Modeling-Hubs/hubverse-infrastructure/tree/bsweger/add-model-output-transforms/hubverse-transforms
It's written in Python, which is an out-of-the-box supported AWS Lambda runtime environment (i.e., we'll have an easier path to testing lambda functions triggered by S3 updates, if that's something we want to explore).
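To make the Lambda idea a bit more concrete, here is a rough sketch of what an S3-triggered handler could look like. It is not the code on the branch. It only reads the standard S3 event notification payload and works out where a converted parquet file would be written. The `raw/` and `model-output/` prefixes, the set of convertible suffixes, and the `parquet_key_for` helper are all assumptions made up for illustration, and the actual conversion step is left out.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import unquote_plus

# Hypothetical layout: raw submissions land under RAW_PREFIX and
# converted files are written under OUTPUT_PREFIX. Neither name comes
# from the hubverse repos.
RAW_PREFIX = "raw/"
OUTPUT_PREFIX = "model-output/"
CONVERTIBLE_SUFFIXES = {".csv", ".parquet"}


def parquet_key_for(source_key: str) -> Optional[str]:
    """Return the destination key for a raw file, or None if it should be skipped."""
    if not source_key.startswith(RAW_PREFIX):
        return None
    relative_path = PurePosixPath(source_key[len(RAW_PREFIX):])
    if relative_path.suffix.lower() not in CONVERTIBLE_SUFFIXES:
        return None
    return OUTPUT_PREFIX + str(relative_path.with_suffix(".parquet"))


def lambda_handler(event, context):
    """Plan one conversion per uploaded object named in the S3 event."""
    planned = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        destination = parquet_key_for(key)
        if destination is not None:
            # A real handler would call the transform function here.
            planned.append(
                {"bucket": bucket, "source": key, "destination": destination}
            )
    return planned
```

Keeping the key mapping in its own small function means it can be tested locally without any AWS resources, which fits the "easier path to testing" point above.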
Marking this one closed, since the goal was getting some prototype code up and running as a proof-of-concept for converting incoming model-output data to parquet format.
Those on Confluence can follow along with some WIP experiments using this function: https://reichlab.atlassian.net/wiki/spaces/RLD/pages/13631576/Automated+model-output+transforms
Following the "demos not memos" principle, before making a decision about what will trigger model-output data conversions for cloud usage (#20), it might be helpful to see some preliminary code for these data conversions.
Create a prototype class/function in hubverse-cloud that: