This PR adds a new script, `deploy_dbt_model_dependencies.sh`, that downloads dependencies for dbt Python models, bundles them into zip archives, and pushes them to S3. It also updates the `build-and-test-dbt` workflow to use the new script, including the following features:
- If a dbt cache exists, only deploy packages for models that are new or have changed
- If a caller has dispatched the workflow for a specific set of models, only deploy packages for those models
- In all cases, only deploy package bundles that are new or have changed
- Deploy a separate package bundle for each model
- Deploy package bundles to separate S3 locations for dev, CI, and prod, and isolate dev/CI environments by user or branch
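As a rough sketch of how the bundling and change-detection steps above might fit together (the function names, bucket layout, and file paths here are illustrative assumptions, not the actual contents of `deploy_dbt_model_dependencies.sh`):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Return success (0) when the freshly built bundle differs from the one
# already deployed, i.e. when an upload is actually needed. A missing
# remote hash (first deploy) arrives as an empty string and always
# compares as "changed".
bundle_changed() {
  local new_hash="$1" old_hash="$2"
  [[ "$new_hash" != "$old_hash" ]]
}

# Hypothetical per-model deploy step: install the model's Python
# dependencies into an isolated directory, zip them into one archive per
# model, and push the archive to S3 only when its checksum differs from
# the one recorded alongside the previously deployed bundle.
deploy_model() {
  local model="$1" s3_prefix="$2"   # e.g. an env-specific prefix like s3://bucket/dev/<user>
  local deps_dir zipfile new_hash old_hash
  deps_dir="$(mktemp -d)"
  zipfile="${model}.zip"

  pip install --quiet --target "$deps_dir" -r "models/${model}/requirements.txt"
  (cd "$deps_dir" && zip -qr "${OLDPWD}/${zipfile}" .)

  new_hash="$(sha256sum "$zipfile" | awk '{print $1}')"
  old_hash="$(aws s3 cp "${s3_prefix}/${model}.sha256" - 2>/dev/null || true)"

  if bundle_changed "$new_hash" "$old_hash"; then
    aws s3 cp "$zipfile" "${s3_prefix}/${zipfile}"
    printf '%s' "$new_hash" | aws s3 cp - "${s3_prefix}/${model}.sha256"
  fi
}
```

Comparing checksums before uploading is what allows the workflow to skip bundles that have not changed, keeping repeat runs cheap.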
Note that refactoring the existing `reporting.ratio_stats` model to use this new dependency management system is out of scope for this PR; that step will be handled in a follow-up issue (https://github.com/ccao-data/data-architecture/issues/439).
Closes https://github.com/ccao-data/data-architecture/issues/417.
## Testing
- Evidence of `build-and-test-dbt` pushing packages for changed models to S3 on a `synchronize` event: Link
- Evidence of `build-and-test-dbt` skipping dependencies that have not changed on a `synchronize` event: Link
- Evidence of `build-and-test-dbt` skipping models that have no Python dependencies on a `workflow_dispatch` event: Link
- Evidence of `cleanup-dbt-resources` removing packages from S3 on a `closed` event: Link