Datavault-UK / automate-dv

A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
https://www.automate-dv.com
Apache License 2.0
478 stars, 114 forks

[FEATURE] add hashdiff and de-duping to effectivity sat macro #223

Open philmaddocks opened 5 months ago

philmaddocks commented 5 months ago

Is your feature request related to a problem? Please describe.

The current effectivity sat macro does no de-duping on the initial load to check whether a relationship has actually changed over time. It relies on being given relationship changes from a dedicated model, so it cannot be built directly over the typical stage view of a source table that happens to contain that relationship. It simply loads every row from our stage model, including rows that carry no relationship change.

Describe the solution you'd like

I'd like the macro, like the other sat macros, to de-dupe the data set so that it identifies only actual changes to a relationship, based on the provided keys.
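To illustrate the behaviour being requested, here is a minimal Python sketch of the de-duping logic. It is not AutomateDV's implementation, and the column names `customer_pk` (driving key) and `order_pk` (secondary key) are made up for the example. Rows are ordered per driving key by `effective_from`, and a row is kept only when its secondary key differs from the previous row for that driving key.

```python
from itertools import groupby
from operator import itemgetter

# Sentinel meaning "no previous row seen yet for this driving key".
_NO_PREVIOUS = object()


def relationship_changes(rows, driving_key="customer_pk",
                         secondary_key="order_pk",
                         order_by="effective_from"):
    """Return only the rows where the relationship actually changed.

    All column names are hypothetical defaults chosen for this sketch.
    """
    ordered = sorted(rows, key=itemgetter(driving_key, order_by))
    changes = []
    for _, group in groupby(ordered, key=itemgetter(driving_key)):
        previous = _NO_PREVIOUS
        for row in group:
            # Keep the first row per driving key and every later row whose
            # secondary key differs from the one immediately before it.
            if previous is _NO_PREVIOUS or row[secondary_key] != previous:
                changes.append(row)
            previous = row[secondary_key]
    return changes


stage_rows = [
    {"customer_pk": "C1", "order_pk": "O1", "effective_from": "2023-01-01"},
    {"customer_pk": "C1", "order_pk": "O1", "effective_from": "2023-01-02"},  # no change
    {"customer_pk": "C1", "order_pk": "O2", "effective_from": "2023-01-03"},  # changed
    {"customer_pk": "C2", "order_pk": "O9", "effective_from": "2023-01-01"},
]

deduped = relationship_changes(stage_rows)
```

With this sample data, the second `C1` row is dropped because it repeats the `O1` relationship, while a relationship that later flips back to an earlier value would still be kept as a change.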

Describe alternatives you've considered

Using the standard sat macro with effective_from as our only "effectivity" marker, i.e. the record with max(effective_from) is the latest relationship as far as the business is concerned. No end_date rows are inserted.
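For clarity, the workaround amounts to something like the following Python sketch. Again the column name `customer_pk` is hypothetical, and ISO-formatted date strings are assumed so that plain string comparison orders them correctly. The current relationship for each driving key is simply the row with the greatest `effective_from`.

```python
def latest_relationships(rows, driving_key="customer_pk",
                         effective_from="effective_from"):
    """Map each driving key to its row with the greatest effective_from.

    Column names are hypothetical; dates are assumed to be ISO strings.
    """
    latest = {}
    for row in rows:
        key = row[driving_key]
        if key not in latest or row[effective_from] > latest[key][effective_from]:
            latest[key] = row
    return latest


sat_rows = [
    {"customer_pk": "C1", "order_pk": "O1", "effective_from": "2023-01-01"},
    {"customer_pk": "C1", "order_pk": "O2", "effective_from": "2023-01-03"},
    {"customer_pk": "C2", "order_pk": "O9", "effective_from": "2023-01-01"},
]

current = latest_relationships(sat_rows)
```

This gives the latest relationship per driving key but never closes off the earlier ones, which is the gap an effectivity sat with end dates would otherwise cover.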

AB#5351