dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0

Modifying an existing external table YAML definition does not alter the external table DDL #307

Open zvijayakumar opened 1 month ago

zvijayakumar commented 1 month ago

Describe the feature

Currently, modifying an existing external table's YAML definition in dbt does not alter the external table to match the new definition. Changes to the YAML definition should be reflected in the actual external table without requiring a full refresh of all tables.

Describe alternatives you've considered

  1. Create an external table: define the external table in the external table YAML file.
  2. Run the dbt operation: use dbt run-operation stage_external_sources to create the external table.
  3. Modify the external table definition: edit the external table YAML file to add a new column. Running dbt run-operation stage_external_sources again only refreshes the external table; it does not alter it to include the new column.
  4. Selective full refresh: when an external table definition changes, running dbt run-operation stage_external_sources --vars "ext_full_refresh: true" should recreate only the modified external table.
  5. Performance considerations: today, dbt run-operation stage_external_sources --vars "ext_full_refresh: true" recreates or replaces all tables, which can degrade performance in Snowflake.
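
The selective refresh described in steps 4 and 5 could be approximated outside the package. The following is only a minimal sketch, not how dbt-external-tables works internally: it compares the column list declared in YAML against the columns currently deployed and builds a full-refresh command limited to the tables that differ. The dictionaries `desired` and `deployed` and the helper names are hypothetical; in practice they would be populated from the parsed YAML and from the warehouse metadata. The sketch also assumes the macro accepts a `select` argument through `--args`, which is how I recall the package README describing it; verify this against the installed version.

```python
import shlex


def changed_tables(desired, deployed):
    """Return the 'source.table' names whose declared columns differ
    from the deployed columns (or that are not deployed at all)."""
    return sorted(
        name for name, cols in desired.items() if deployed.get(name) != cols
    )


def build_command(tables):
    """Build a full-refresh command restricted to the given tables.

    Assumption: stage_external_sources accepts a space-separated
    `select` argument via --args. Returns None when nothing changed.
    """
    if not tables:
        return None
    return [
        "dbt", "run-operation", "stage_external_sources",
        "--vars", "ext_full_refresh: true",
        "--args", "select: " + " ".join(tables),
    ]


# Hypothetical inputs: columns declared in YAML vs. columns in the warehouse.
desired = {
    "raw.orders": [("id", "number"), ("amount", "number"), ("note", "varchar")],
    "raw.customers": [("id", "number"), ("name", "varchar")],
}
deployed = {
    "raw.orders": [("id", "number"), ("amount", "number")],
    "raw.customers": [("id", "number"), ("name", "varchar")],
}

to_refresh = changed_tables(desired, deployed)
command = build_command(to_refresh)
if command is not None:
    # Print a shell-quoted command line instead of executing it.
    print(shlex.join(command))
```

Here only raw.orders gained a column, so only that table ends up in the selection and raw.customers is left untouched.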

Additional context

This feature request is specific to Snowflake or other databases that support external tables. The aim is to improve performance and efficiency for those building data ingestion pipelines using dbt.

Who will this benefit?

Data Engineers: Those who build data ingestion pipelines will benefit from a more efficient process that updates only the external tables whose YAML definition changed, without affecting other tables.