cimt-ag / data_vault_pipelinedescription

A concept and syntax that provide a universal data format for storing all the essential information needed to implement or generate a data loading process for a data vault model.
https://www.cimt-ag.de/leistungen/data-vault-pipeline-description/
Apache License 2.0

Explain pipeline asset/source driven design paradigm #268

Closed mattywausb closed 1 month ago

mattywausb commented 4 months ago

Dvpd describes the mapping by taking the source structure and distributing its fields to the target model, instead of describing the target structure and defining the source of every target column.
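To make the difference between the two directions concrete, here is a minimal Python sketch. It is not actual Dvpd syntax: the dictionary layout, the keys `field_name`, `targets`, `table_name` and `column_name`, and all table and field names are assumptions chosen for illustration. The source-driven form lists each source field once, together with the places it is distributed to. The target-driven view (which source feeds every target column) can then be derived from it mechanically.

```python
from collections import defaultdict

# Hypothetical source-driven description: every source field names its targets.
# Keys and table names are illustrative assumptions, not real Dvpd syntax.
source_fields = [
    {"field_name": "CUSTOMER_ID",
     "targets": [{"table_name": "customer_hub"}]},
    {"field_name": "NAME",
     "targets": [{"table_name": "customer_sat"}]},
    {"field_name": "COUNTRY",
     "targets": [{"table_name": "customer_sat", "column_name": "COUNTRY_CODE"}]},
]


def to_target_view(fields):
    """Invert the source-driven mapping into {table: {column: source field}}."""
    view = defaultdict(dict)
    for field in fields:
        for target in field["targets"]:
            # Assumption: the column name defaults to the source field name.
            column = target.get("column_name", field["field_name"])
            view[target["table_name"]][column] = field["field_name"]
    return dict(view)


target_view = to_target_view(source_fields)
for table, columns in target_view.items():
    print(table, columns)
```

In this sketch the target-driven view is a derived artifact: the source structure is written down once, and the per-table column lists fall out of it.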

Why?

Why is the Dvpd core syntax not designed to represent the full lineage?

The assembly of a value/measure/KPI can be arbitrarily complex and can't be represented efficiently in a form that is translatable into executable code. Therefore, the Dvpd definition begins after the calculation. Nevertheless, annotational syntax can be added to provide and preserve lineage information.