Closed aaronsteers closed 1 year ago
@gwenwindflower - I've scaled this way back so the PR now just focuses on EL.
I still have some work to do to make sure env vars and database paths are all aligned. Once CI passes, this should be ready for review though.
CI Pipelines are now succeeding, starting with: https://github.com/dbt-labs/jaffle-shop-template/actions/runs/4558958279/jobs/8042391887
@gwenwindflower - Do you squash PRs as a rule? We definitely would not want to use a merge commit for this PR, since it has so many commits. Happy to squash on my side if helpful.
Also, my latest commit here https://github.com/dbt-labs/jaffle-shop-template/pull/9/commits/90de06597aea6a2b97723bec230f547ebf1f6494 removes the raw data files and the dynamic behavior toggle for external tables, instead assuming the raw data has already landed. Once I removed external table support, I was able to change the default raw schema for the tap to match the default you had elsewhere: it is now 'jaffle_raw' instead of the 'tap_jaffle_shop' I had previously.
I'm happy to revert and bring those back, but I thought I'd just include those deletions here so it is certain that the pipeline data is coming from the EL process and not the seed CSV files.
Pipeline is green again, ready for review. Note that loading 1 year of data takes approximately 3 minutes, and total CI duration is still under 5 minutes.
@gwenwindflower - I found a small performance optimization that will significantly boost tap performance.
I'll bump the tap version once that update is available and I will update the postCreateScript as you proposed.
Will post back here with updated perf numbers when that's done. 👍
Changes included in this PR
- `meltano` feature.
- `tap-jaffle-shop` Singer tap and the `target-duckdb` Singer target.

Sample usage (more examples in the updated `README.md`):

Or using the equivalent `el` job name:
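
For readers unfamiliar with Meltano jobs: a named job such as `el` is normally declared in `meltano.yml` and then invoked with `meltano run el`. The fragment below is only a sketch of what such a declaration could look like, assuming the job simply chains the tap and target named above; the actual definition in this PR may differ.

```yaml
# Hypothetical meltano.yml excerpt (assumed, not copied from this PR).
# Declares a job named "el" that runs the tap into the target.
jobs:
  - name: el
    tasks:
      - tap-jaffle-shop target-duckdb
```

With a job defined this way, `meltano run el` is equivalent to `meltano run tap-jaffle-shop target-duckdb`.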