datamesh-manager / datamesh-manager-ce

Data Mesh Manager (Community Edition)
https://www.datamesh-manager.com

Chain Data Contracts within single Data Product #5

Open maikelpenz opened 1 day ago

maikelpenz commented 1 day ago

In my current setup, there are some steps between ingesting data and producing the final data product table.

For example, I don't treat intermediate tables as data products. Instead, I created data contracts for them for reference, and my "data products" refer to one or more tables that directly deliver value to the customer. My structure aligns with the medallion architecture, as in the following example:

Data Product: Sales Silver
Input Port: Upstream Database
Output Ports: Ingestion, Bronze, Silver

All the output ports listed above are represented as data contracts, following this logical sequence: Upstream Database → Ingestion → Bronze → Silver
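To make the setup above concrete, here is a minimal Python sketch of that structure. It is only an illustration: the class names (`DataContract`, `DataProduct`), their fields, and the `chain` helper are hypothetical and are not part of the Data Mesh Manager API or data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataContract:
    """Hypothetical stand-in for a data contract attached to one table."""
    name: str
    upstream: Optional[str] = None  # name of the stage this one reads from


@dataclass
class DataProduct:
    """Hypothetical model: one input port, several contract-backed output ports."""
    name: str
    input_port: str
    output_ports: List[DataContract] = field(default_factory=list)

    def chain(self) -> List[str]:
        """Return the stages in order, starting from the input port."""
        by_upstream = {c.upstream: c.name for c in self.output_ports}
        order = [self.input_port]
        # Follow upstream -> downstream links; stop at the end or on a cycle.
        while order[-1] in by_upstream and by_upstream[order[-1]] not in order:
            order.append(by_upstream[order[-1]])
        return order


sales_silver = DataProduct(
    name="Sales Silver",
    input_port="Upstream Database",
    output_ports=[
        DataContract("Ingestion", upstream="Upstream Database"),
        DataContract("Bronze", upstream="Ingestion"),
        DataContract("Silver", upstream="Bronze"),
    ],
)

print(" → ".join(sales_silver.chain()))
```

In this sketch every stage is an output port of the same data product, which is exactly the part that feels wrong for Ingestion and Bronze.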

It doesn’t make sense to create data products from the Ingestion and Bronze stages because those tables are not customer-facing.

What I wish was possible:

jochenchrist commented 1 day ago

Hi @maikelpenz

Thanks for the feedback and this feature request.

Currently, I'd recommend these options:

  1. Consider raw and bronze tables as internal details of the data product. Do not define them as output ports. You can use assets (https://api.datamesh-manager.com/swagger/index.html#/Assets) (<- new feature) to assign these tables/views to a data product.
  2. If you want to have a data contract for your source data, define a proxy data product (Sales Raw / Sales Bronze) that is internal to the team. We have a feature in our backlog to define the visibility of data products.
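As a rough illustration of the first option, the sketch below splits the stages from the original post into customer-facing output ports and internal assets. It is plain Python under stated assumptions: the `DataProduct` class, its fields, and `split_by_visibility` are hypothetical names for illustration and do not reflect the Data Mesh Manager schema or the Assets API payload.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class DataProduct:
    """Hypothetical model, not the Data Mesh Manager schema."""
    name: str
    output_ports: List[str] = field(default_factory=list)     # customer-facing tables
    internal_assets: List[str] = field(default_factory=list)  # implementation details


def split_by_visibility(
    name: str, stages: Iterable[str], customer_facing: Set[str]
) -> DataProduct:
    """Keep customer-facing stages as output ports; treat the rest as internal assets."""
    stages = list(stages)
    return DataProduct(
        name=name,
        output_ports=[s for s in stages if s in customer_facing],
        internal_assets=[s for s in stages if s not in customer_facing],
    )


option_1 = split_by_visibility(
    "Sales Silver",
    stages=["Ingestion", "Bronze", "Silver"],
    customer_facing={"Silver"},
)

print(option_1.output_ports)     # ['Silver']
print(option_1.internal_assets)  # ['Ingestion', 'Bronze']
```

With this split, only Silver carries a customer-facing data contract, while Ingestion and Bronze stay attached to the data product as internal tables.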

I hope this helps.