Closed bsweger closed 7 months ago
Let's scope this work to the "new object created" event. If it seems like a good way to proceed, the next step would be to code the corresponding action for when a model-output file is deleted.
Keeping some notes on this experiment here: https://reichlab.atlassian.net/wiki/spaces/RLD/pages/13631576/Automated+model-output+transforms
This is done--I gave @annakrystalli a demo on how it works and we agreed that we should proceed with the use of AWS event notifications + a lambda function to handle conversion of the model-output files.
I tried (and failed) to record the demo, but can do it at the next dev meeting for anyone interested.
There was some conversation here that resulted in the conclusion that we should not rely on GitHub CI actions to trigger the conversion of incoming model-output files to parquet format.
As a next step, I'd like to explore using S3 event notifications as a way to invoke actions when a model-output file is written to a hub's S3 bucket.
Specifically, these notifications:
At a high level, the idea is to invoke our prototype "transform model-output file to parquet" function automatically, whenever a model-output file is uploaded to S3 (this happens via GitHub action).
model submission PR merged -> model-output data syncs to S3 -> S3 "new object created" event triggers an AWS Lambda version of the "convert data to parquet" function
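To make the last step of that flow concrete, here is a rough sketch of what the Lambda entry point might look like. It only parses the S3 event and works out where the parquet file would go; the actual read/convert/write step is left out. The prefixes are assumed from the definition of done, and `destination_key` and the returned dictionaries are hypothetical names for illustration, not the prototype's real code.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote_plus

# Assumed layout: raw files arrive under RAW_PREFIX, parquet goes to OUT_PREFIX.
RAW_PREFIX = "raw/model-output/"
OUT_PREFIX = "model-output/"


def destination_key(raw_key: str) -> str:
    """Map a raw model-output key to the key of its parquet counterpart."""
    if not raw_key.startswith(RAW_PREFIX):
        raise ValueError(f"unexpected key outside {RAW_PREFIX}: {raw_key}")
    relative = PurePosixPath(raw_key[len(RAW_PREFIX):])
    return OUT_PREFIX + str(relative.with_suffix(".parquet"))


def lambda_handler(event, context):
    """Collect the source/destination pairs for each object-created record."""
    planned = []
    for record in event.get("Records", []):
        # Skip anything that is not an object-created event (e.g., deletes).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # The real function would read the file, convert it, and write parquet here.
        planned.append(
            {"bucket": bucket, "source": key, "destination": destination_key(key)}
        )
    return planned
```

Overwriting an existing object with a new upload also produces an object-created event, so a handler shaped like this would treat new files and overwrites the same way. Deletes arrive as object-removed events and are skipped here, which matches keeping the delete case out of scope for now.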
Definition of done:
s3://hubverse-cloud/raw/model-output
s3://hubverse-cloud/model-output
hubData
s3://hubverse-cloud/raw/model-output
triggers the same data conversion process as uploading a new file
The AWS resources for this will be created manually (i.e., no need to incorporate into our infrastructure as code process unless we decide this solution will work for us).