datacontract / datacontract-cli

CLI to manage your datacontract.yaml files
https://cli.datacontract.com

Request: separate physical table name for a model #270

Open henri-if opened 1 week ago

henri-if commented 1 week ago

Hi,

Would it be possible to implement a way to specify a physical table name for a model separately from the "logical" model name?

When making breaking changes to data products, we often want to provide multiple major versions of a data product in parallel, so that downstream consumers can do a controlled migration on their own schedule. Inspired by the dbt approach to versioning models, we'd prefer to do this by providing tables with the major version as a suffix, for example sales_v1 and sales_v2.

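Currently, specifying data contracts for such tables means the model names have to include the major version (e.g. sales_v1 and sales_v2 as separate models), which breaks the diff functionality because the two versions no longer share a model name to compare.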

I discussed this with Simon, and the suggested solution is to add a new config field for models:

models:
  my-model:
    config:
      databricksTable: my-table-v2

The physical table name could then be used when running tests, and should likely be used in at least some of the exports.
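To make the intended lookup concrete, here is a minimal sketch of how a test or export could resolve the physical name: use config.databricksTable when it is set, otherwise fall back to the logical model name. The helper physical_table_name and the plain-dict contract are hypothetical and only for illustration; this is not the datacontract-cli implementation or its API.

```python
from typing import Any, Mapping


def physical_table_name(
    model_key: str,
    model: Mapping[str, Any],
    config_key: str = "databricksTable",
) -> str:
    """Return the physical table name for a model (hypothetical helper).

    Falls back to the logical model name when no override is configured.
    """
    # "config" may be missing or explicitly null in the contract.
    config = model.get("config") or {}
    return config.get(config_key) or model_key


# A parsed datacontract.yaml, represented here as a plain dict (assumption).
contract = {
    "models": {
        "my-model": {"config": {"databricksTable": "my-table-v2"}},
        "other-model": {"fields": {}},  # no override configured
    }
}

resolved = {
    key: physical_table_name(key, model)
    for key, model in contract["models"].items()
}
print(resolved)
```

With this approach the model key stays stable across major versions (so a diff still compares my-model with my-model), and only the config value changes between v1 and v2.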

jochenchrist commented 1 week ago

Agreed, that is a good idea, and it fits well with the "databricksType" config option at the field level.