Closed justinmills closed 3 months ago
We have not yet tested this change against mainline tuva, but we have tested it against a fork of the v0.8.6 branch (with some other pg-specific fixes backported/applied, none of which I believe are required on main).
Describe your changes
Adds an AWS RDS-centric implementation of the postgres load_seed macro.
This relies on two optional variables set in your dbt project: one overrides the S3 bucket name and the other provides a key prefix.
This assumes you've created a pg-friendly format of the seed data (this repo can be used to generate one). The macro also requires that you have set up your RDS cluster/instance with an IAM Role that has the proper privileges to access the S3 bucket where the seed data is stored.
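As a rough illustration of the mechanism described above, the sketch below shows how a bucket name, an optional prefix, and a seed file name could be combined into a call to the RDS aws_s3 extension's aws_s3.table_import_from_s3 function. It is not the macro from this PR: the function name build_s3_import_sql, the example table, bucket, prefix, and region values, and the assumption that the seed files are plain CSV are all hypothetical.

```python
def build_s3_import_sql(table, bucket, prefix, seed_file, region="us-east-1"):
    """Render a SELECT that asks the RDS aws_s3 extension to load one seed file.

    All argument values are illustrative; the real macro reads the bucket and
    prefix from dbt project variables whose names are not shown here.
    """
    # Join the optional prefix and the file name into a single S3 object key,
    # tolerating a missing prefix and stray leading/trailing slashes.
    key = "/".join(part.strip("/") for part in (prefix, seed_file) if part)

    def quote(value):
        # Minimal SQL string-literal quoting: double embedded single quotes.
        return "'" + value.replace("'", "''") + "'"

    return (
        "SELECT aws_s3.table_import_from_s3(\n"
        f"    {quote(table)},\n"
        "    '',\n"  # empty column list: load every column in file order
        "    '(format csv)',\n"  # assumption: seeds are stored as CSV
        f"    aws_commons.create_s3_uri({quote(bucket)}, {quote(key)}, {quote(region)})\n"
        ");"
    )


if __name__ == "__main__":
    # Hypothetical table, bucket, and prefix, for demonstration only.
    print(build_s3_import_sql("terminology.icd_10_cm", "my-seed-bucket",
                              "seeds/v1/", "icd_10_cm.csv"))
```

The statement only succeeds on an RDS or Aurora PostgreSQL instance where the aws_s3 extension is installed and the attached IAM Role can read the bucket, which is why the role setup above is a prerequisite.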
Also fixes the style guide link in the PR template.
How has this been tested?
We've run this a few times using dbt cloud in both individual and production environments. We have not, however, done exhaustive testing around data quality.
Reviewer focus
This implementation is AWS+RDS specific, so it is unlikely to work for a postgres instance that is not hosted via AWS' RDS offering. There may be other pg extensions to read data from S3, but those have not been explored or tested.
Checklist before requesting a review
tuva_last_run to the final output
Package release checklist
dbt_project.yml