dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0
298 stars 120 forks source link

Use metadata for _dbt_copied_at in Snowpipe #282

Closed jtmcn closed 5 months ago

jtmcn commented 5 months ago

Description & motivation

resovles: #281

This change uses metadata$start_scan_time instead of current_timestamp for the _dbt_copied_at field on Snowflake Snowpipe creation.

This is the method recommended in the Snowflake docs

Checklist

jtmcn commented 5 months ago
  1. can you confirm that start_scan_time is available in non-Snowpipe COPY INTO statements? Querying Metadata for Staged Files | Snowflake Documentation makes me think so, but want to double check

Yes, the metadata fields are available when the query target is a Snowflake Stage

  1. can imagine that this might constitute a breaking change for anyone? My gut tells me "no", but maybe someone who has a downstream model with logic that depends on this _dbt_copied_at column might see wildly different times?

No, I don't think this constitutes a breaking change. The new value won't be used until the Snowpipe is recreated. There won't be wild differences within the same table. Also, the documentation says the

CURRENT_TIMESTAMP is evaluated when the load operation is compiled in cloud services rather than when the record is inserted into the table

The existing value for the _dbt_copied_at field is likely to be incorrect for it's intended downstream purpose.