ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/

Add script and update `build-and-test-dbt` workflow to push dbt Python dependencies to S3 #435

jeancochrane closed this 1 month ago

jeancochrane commented 1 month ago

This PR adds a new script, `deploy_dbt_model_dependencies.sh`, that downloads dependencies for dbt Python models, bundles them into zip archives, and pushes them to S3. It also updates the `build-and-test-dbt` workflow to use the new script, including the following features:

Note that refactoring the existing `reporting.ratio_stats` model to use this new dependency management system is out of scope for this PR. That step will be handled in a follow-up issue (https://github.com/ccao-data/data-architecture/issues/439).

Closes https://github.com/ccao-data/data-architecture/issues/417.
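
For readers who want a concrete picture of the bundling step, here is a minimal Python sketch of zipping a directory of already-downloaded packages. It is not the contents of `deploy_dbt_model_dependencies.sh`, which is a shell script; the function names, the `packages` prefix, and the `<prefix>/<model_name>.zip` key layout are all assumptions made for illustration, and the actual upload to S3 is left out.

```python
import zipfile
from pathlib import Path


def bundle_dependencies(package_dir: Path, archive_path: Path) -> list:
    """Zip every file under package_dir and return the archived names.

    archive_path should live outside package_dir so the archive does not
    include itself.
    """
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sort paths so the archive contents are listed in a stable order.
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(package_dir).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names


def s3_key_for(model_name: str, prefix: str = "packages") -> str:
    """Hypothetical key layout (an assumption): <prefix>/<model_name>.zip."""
    return f"{prefix}/{model_name}.zip"
```

A caller would first install a model's dependencies into `package_dir` (for example with `pip install --target`), then call `bundle_dependencies` and upload the resulting archive under `s3_key_for(model_name)`.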

Testing

Evidence of `build-and-test-dbt` pushing packages for changed models to S3 on a `synchronize` event: Link

Evidence of `build-and-test-dbt` skipping dependencies that have not changed on a `synchronize` event: Link

Evidence of `build-and-test-dbt` skipping models that have no Python dependencies on a `workflow_dispatch` event: Link

Evidence of `cleanup-dbt-resources` removing packages from S3 on a `closed` event: Link
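
The skip behavior exercised in the tests above could be approximated along these lines. This is only a sketch under stated assumptions: the PR does not say how the workflow detects changed dependencies, so the digest comparison, the function names, and the idea of a stored `deployed_digest` are all hypothetical.

```python
import hashlib
from typing import List, Optional


def dependency_digest(requirements: List[str]) -> str:
    """Hash a model's requirement strings, ignoring order and blank entries."""
    normalized = "\n".join(sorted(r.strip() for r in requirements if r.strip()))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def should_push(requirements: List[str], deployed_digest: Optional[str]) -> bool:
    """Decide whether a model's dependency bundle needs to be pushed.

    Assumption: deployed_digest is the digest recorded at the last push,
    or None if nothing has been pushed yet.
    """
    # Models with no Python dependencies are skipped entirely.
    if not any(r.strip() for r in requirements):
        return False
    # Push only when the requirements differ from what was last deployed.
    return dependency_digest(requirements) != deployed_digest
```

Under this scheme, a model with an empty requirement list is never pushed, and reordering an unchanged requirement list does not trigger a new push.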

jeancochrane commented 1 month ago

Closing this to test the cleanup step.

jeancochrane commented 1 month ago

~@dfsnow Want to give this one more look before I merge?~ Never mind, needs one more change!