open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0
5.47k stars, 1.04k forks

Manually model the lineage #5939

Closed: Arturo-Penas-Rial closed this issue 1 year ago

Arturo-Penas-Rial commented 2 years ago

Feature request: the ability to manually document links between data flows and dashboards, and to record the link or ID of the related object in the description. This is especially useful for representing lineage when the tools in use have no OpenMetadata integration.

When I manually model the lineage between two tables, I can link them, but I cannot document the link. For example, I tried to create a "pipeline" manually and name it after the ETL process, but I cannot document where it is executed, nor can I connect the source and destination tables through that "pipeline". The same happens when I try to link "dashboards". In the end I can only link two tables, and if I change the label, the edit is lost: it is not persisted and no warning is shown. Supporting all of the above is essential for modeling lineage manually where no integration exists, for example with PDI (Pentaho).

Our technology stack often includes tools that do not integrate directly with OpenMetadata (for example Pentaho or NiFi), so lineage information cannot be maintained automatically and must be defined manually. Examples include modeling a set of datasets that produces one output dataset (measurements/indicators), or modeling a manual ETL process, which needs a name, a description, and ideally links to the related data flows and dashboards. These relationships may have n..n cardinality. Currently, these links cannot be described.
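
To make the request concrete, here is a minimal sketch of the kind of record being asked for: a link between entities that carries its own description, an optional reference to the process that produces it, and n..n cardinality. All names here (`EntityRef`, `LineageEdge`, `LineageGraph`, `link`, `downstream`, and the example table and pipeline names) are hypothetical illustrations; none of this is OpenMetadata's actual API or data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class EntityRef:
    """Hypothetical reference to a catalog entity (table, dashboard, pipeline)."""
    entity_type: str
    name: str


@dataclass
class LineageEdge:
    """Hypothetical documented link between a source and a target entity."""
    source: EntityRef
    target: EntityRef
    description: str = ""
    via: Optional[EntityRef] = None   # manually defined process, e.g. an ETL job
    location: Optional[str] = None    # free text: where the process is executed


@dataclass
class LineageGraph:
    """Hypothetical container that stores manually entered lineage edges."""
    edges: List[LineageEdge] = field(default_factory=list)

    def link(self, sources, targets, via=None, description="", location=None):
        """Create one edge per (source, target) pair, giving n..n cardinality."""
        new_edges = [
            LineageEdge(s, t, description, via, location)
            for s in sources
            for t in targets
        ]
        self.edges.extend(new_edges)
        return new_edges

    def downstream(self, entity):
        """Return the targets directly fed by the given entity."""
        return [e.target for e in self.edges if e.source == entity]


# Example: two source tables feed one table and one dashboard through a PDI job.
orders = EntityRef("table", "staging.orders")
customers = EntityRef("table", "staging.customers")
kpis = EntityRef("table", "mart.kpis")
report = EntityRef("dashboard", "sales_overview")
etl = EntityRef("pipeline", "pdi_load_kpis")

graph = LineageGraph()
graph.link(
    [orders, customers],
    [kpis, report],
    via=etl,
    description="Manually documented PDI job that aggregates indicators",
)
```

The point of the sketch is that the description and the process reference live on the link itself, so they persist with it and are not tied to a tool that has a connector.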

pmbrull commented 2 years ago

For now, we will keep working on improving lineage support for the current entities. NiFi will be added in 0.12.