User Story
As a data platform owner, I want to provide a single location for discovering and documenting data and for understanding its lineage.
As a data platform consumer, I want to find data relevant to my needs and understand its context.
Description/Context
To provide a cohesive view of our data platform and the data sets available on it, we want to implement a cross-cutting metadata platform. It will provide visibility into the full lineage graph of any given data asset (e.g. a database table, a dashboard report, or a report export delivered via Dagster). Numerous open source and commercial options are available, so the purpose of this issue is to establish a set of evaluation criteria and select the solution we want to implement.
Acceptance Criteria
[ ] Integrations are available for consuming metadata from our various platform components:
  [ ] Dagster
  [ ] Trino
  [ ] dbt
  [ ] Superset
[ ] Supports column-level lineage
[ ] Has search/discovery functionality for data sets
[ ] Provides a means of documenting data sets
[ ] Supports tagging and tag propagation based on lineage (e.g. tag a column in raw with pii and propagate the tag to downstream mart tables and dashboards)
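
To make the tag propagation criterion concrete for the proof of concept, here is a minimal sketch of the behavior we would expect from a candidate platform. It is not tied to any particular product: the `propagate_tags` function, the lineage dictionary, and all asset names are hypothetical, and the lineage graph is assumed to be a simple mapping from each upstream asset to its direct downstream assets.

```python
from collections import deque


def propagate_tags(lineage, tags):
    """Return a new tag mapping with every tag pushed to all downstream assets.

    lineage: dict mapping an upstream asset to a list of its direct downstream assets.
    tags: dict mapping an asset to the set of tags applied to it directly.
    """
    # Copy the input so the caller's mapping is not mutated.
    result = {asset: set(asset_tags) for asset, asset_tags in tags.items()}
    queue = deque(tags)
    while queue:
        upstream = queue.popleft()
        for downstream in lineage.get(upstream, ()):
            current = result.setdefault(downstream, set())
            missing = result[upstream] - current
            # Only revisit a node when it gained a tag, so cycles terminate.
            if missing:
                current |= missing
                queue.append(downstream)
    return result


# Hypothetical column-level lineage (assumption, for illustration only).
lineage = {
    "raw.users.email": ["staging.users.email"],
    "staging.users.email": ["mart.customers.email"],
    "mart.customers.email": ["dashboard.customer_overview"],
    "raw.users.signup_date": ["mart.customers.signup_date"],
}
tags = {"raw.users.email": {"pii"}}

propagated = propagate_tags(lineage, tags)
```

In this sketch the `pii` tag applied to the raw email column reaches the mart column and the dashboard that reads it, while the unrelated `signup_date` columns stay untagged. A candidate platform should produce an equivalent result from the lineage it ingests.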
Plan/Design
Review the relevant documentation and pricing information for each candidate platform, then build a simple proof of concept implementation for the top contenders.