A concept and syntax that provide a universal data format for storing all essential information needed to implement or generate a data loading process for a data vault model.
Dvpd describes the mapping by taking the source structure and distributing it to the target model instead of describing the target structure and defining the source of every target column.
Why?
pipeline = fetch (increment) + parse + stage + load, or calculate (increment) + stage + load
The final load process must take a consistent state of the source and distribute it to the target.
The target model often has the same column names and types as the source. Source metadata can be read out and needs only a few annotations to derive the target.
Designing the raw vault is strictly bound to the source data.
The design of the business vault can be broken down into two parts: the design and implementation of the rule, which produces a tabular result, and the representation of that rule result in the vault.
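The source-oriented approach described above can be illustrated with a minimal sketch. It does not use actual Dvpd syntax: the dictionary keys (`name`, `type`, `targets`, `table`, `column`), the table names, and the helper `derive_target_columns` are all hypothetical and chosen only for illustration. Each source field is annotated with the targets it is distributed to, and the target column lists are derived from that, inheriting the name and type from the source unless overridden.

```python
from collections import defaultdict

# Hypothetical source-oriented mapping (NOT actual Dvpd syntax):
# every source field lists the target tables it is distributed to.
SOURCE_FIELDS = [
    {"name": "customer_id", "type": "VARCHAR(20)",
     "targets": [{"table": "customer_hub"}]},
    {"name": "customer_name", "type": "VARCHAR(200)",
     "targets": [{"table": "customer_sat"}]},
    {"name": "order_id", "type": "VARCHAR(20)",
     "targets": [{"table": "order_hub"}]},
    {"name": "order_date", "type": "DATE",
     "targets": [{"table": "order_sat", "column": "ordered_on"}]},
]


def derive_target_columns(source_fields):
    """Invert the source-oriented mapping into a column list per target table."""
    tables = defaultdict(list)
    for field in source_fields:
        for target in field["targets"]:
            # The column name defaults to the source field name;
            # the type is inherited from the source field.
            column = target.get("column", field["name"])
            tables[target["table"]].append((column, field["type"], field["name"]))
    return dict(tables)


if __name__ == "__main__":
    for table, columns in derive_target_columns(SOURCE_FIELDS).items():
        print(table)
        for column, col_type, source in columns:
            print(f"  {column} {col_type} <- {source}")
```

In this sketch only `order_date` needs an explicit annotation for its target column name. All other target columns follow directly from the source metadata, which is the situation the reasons above describe.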
Why is the Dvpd core syntax not designed to represent the full lineage?
The assembly of a value, measure, or KPI can be arbitrarily complex and cannot be represented efficiently in a way that is translatable into executable code. Therefore, a Dvpd definition begins after the calculation. Nevertheless, annotation syntax can be added to provide and preserve lineage information.
[x] added to concept
[x] added pipeline definition to presentation (physical to physical)
[ ] added maturity development of platforms to presentation
[ ] added separation of design in concept model to dvpd to presentation