dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0
305 stars 122 forks source link

"mart-erialized" external tables? #74

Closed dataders closed 1 year ago

dataders commented 3 years ago

Describe the feature

I feel weird even saying this out loud, but after hearing this this package is @jtcohen6's "house in the country", I realized my schnapsidee just might belong after all. So here goes.

What if in the external config dict, there was a materialized key that defaults to false, but if true would stage the external table temporarily only to call the following, then drop the external table?

SELECT * INTO source_table from temp_external_table

In this way, ext_full_refresh would drop the materialized table and reload it, something we'd do on a daily basis before calling dbt run in our production pipeline.

The benefit of extending (read: perverting) the elegance of stage_external_sources is that it ensures that downstream tables can be run and queried quickly while still keeping a link to the external source that can be called again when needed.

However, brain is split because:

Describe alternatives you've considered

We're also considering either:

Additional context

Both Azure Synapse and Azure SQL support the ideal of external tables in the more familiar sense of flat files in blob or s3 (though unfortunately, with different syntax... ). But, Azure SQL also supports cross database queries (a.k.a. "vertically partitioned elastic queries" a.k.a. external data sources of type RDBMS) which are basically the Azure equivalent of the classic SQL Server idea of "linked servers". Enter the idea of using dbt to build data marts.

We're thinking a lot about the idea of marts that are downstream of our core data warehouse for the following reasons:

Who will this benefit?

This would obvs benefit our team, and also Azure stack folks. But it looks like Redshift also has a concept of cross-database queries?

And perhaps anyone else with a handful of the bullet points above?

Are you interested in contributing this feature?

@chaerinlee1 and I have toying around this the idea with #73 but it's still a work in progress. cc: @nandanhegde15

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.