ScalefreeCOM / datavault4dbt

Scalefree's dbt package for a Data Vault 2.0 implementation congruent to the original Data Vault 2.0 definition by Dan Linstedt including the Staging Area, DV2.0 main entities, PITs and Snapshot Tables.
https://www.scalefree.com/
Apache License 2.0

[FEATURE] Extract multiple columns from one prejoined object #287

Open tkirschke opened 1 week ago

tkirschke commented 1 week ago

What additional value does this feature bring to the project?

To extract multiple columns from one prejoined object, e.g. for composite business keys, I currently need to define a separate prejoined column for each of them.

Is your feature request related to a problem? Please describe.

I'm always frustrated when I need multiple columns from a prejoined object.

Describe the solution you'd like

Right now, extracting multiple columns from the same object would look like this:

  prejoined_columns:
      businessid:
          ref_model: 'business_raw'
          bk: 'ID'
          this_column_name: 'ContractId'
          ref_column_name: 'ContractId'
      businessnumber:
          ref_model: 'business_raw'
          bk: 'number'
          this_column_name: 'ContractId'
          ref_column_name: 'ContractId'

I would like to have an additional, alternative way of defining prejoined columns:

  prejoined_columns:
      - extract_columns: 
           - id
           - number
        aliases:
           - businessid
           - businessnumber
        ref_model: 'business_raw'
        this_column_name: 'ContractId'
        ref_column_name: 'ContractId'

aliases should be optional, but if given, must be the same length as extract_columns. this_column_name & ref_column_name should still allow multiple columns. ref_model should still be changeable to src_name & src_table.
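The rules above can be sketched in plain Python as an expansion of the proposed list form into the existing per-column form. This is only an illustration of the intended semantics, not the package's implementation (which would live in dbt macros). The helper name expand_prejoin is hypothetical, and falling back to the extracted column names when aliases is omitted is an assumption.

```python
def expand_prejoin(entry):
    """Expand one list-style prejoin entry into per-column definitions.

    Hypothetical helper: it mirrors the rules in this request, not the
    actual datavault4dbt macro code.
    """
    columns = entry["extract_columns"]
    # Assumption: without aliases, the extracted column names are reused.
    aliases = entry.get("aliases") or columns
    if len(aliases) != len(columns):
        raise ValueError(
            f"aliases has {len(aliases)} entries, "
            f"extract_columns has {len(columns)}"
        )
    # All other keys (ref_model or src_name/src_table, this_column_name,
    # ref_column_name) are shared by every extracted column.
    shared = {
        key: value
        for key, value in entry.items()
        if key not in ("extract_columns", "aliases")
    }
    return {alias: {"bk": col, **shared} for alias, col in zip(aliases, columns)}


entry = {
    "extract_columns": ["id", "number"],
    "aliases": ["businessid", "businessnumber"],
    "ref_model": "business_raw",
    "this_column_name": "ContractId",
    "ref_column_name": "ContractId",
}
expanded = expand_prejoin(entry)
```

Because the shared keys are copied through unchanged, the same expansion works when ref_model is replaced by src_name & src_table, or when this_column_name & ref_column_name hold lists of columns.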

tkiehn commented 1 day ago

The feature has been implemented in the linked branch.

An optional addition could be to warn if the same alias is used for different columns. The current implementation silently ignores one of the columns.
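A rough sketch of such a check, written in plain Python for illustration only. The real check would have to be done in the package's macros; the function names find_duplicate_aliases and warn_on_duplicate_aliases are hypothetical, and the second model name contract_raw is made up for the example.

```python
import warnings
from collections import defaultdict


def find_duplicate_aliases(entries):
    """Return {alias: [source columns]} for every alias assigned more than once."""
    sources = defaultdict(list)
    for entry in entries:
        columns = entry["extract_columns"]
        # Assumption: without aliases, the extracted column names are reused.
        aliases = entry.get("aliases") or columns
        for alias, col in zip(aliases, columns):
            sources[alias].append(f"{entry.get('ref_model', '?')}.{col}")
    return {alias: cols for alias, cols in sources.items() if len(cols) > 1}


def warn_on_duplicate_aliases(entries):
    """Emit one warning per clashing alias instead of silently dropping a column."""
    duplicates = find_duplicate_aliases(entries)
    for alias, cols in duplicates.items():
        warnings.warn(f"alias '{alias}' is assigned to several columns: {', '.join(cols)}")
    return duplicates


entries = [
    {"extract_columns": ["id", "number"], "ref_model": "business_raw"},
    {"extract_columns": ["id"], "ref_model": "contract_raw"},  # made-up model
]
duplicates = find_duplicate_aliases(entries)
```

Here both entries would produce a column named id, so the check reports that alias together with the two source columns that compete for it.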