Closed Sai7656 closed 1 year ago
We capture column-level lineage on the relationship edge from one table to another. Since one edge can carry lineage for multiple columns from table 1 to table 2, this is an array.
Closing this as not a bug. @Sai7656, it would be great if you could confirm in the OpenMetadata support channel that an issue is indeed a bug before opening a bug report. Thank you.
Hi @sureshms I spoke with @ulixius9 and raised the issue here. https://openmetadata.slack.com/archives/C02B6955S4S/p1683719317889769
Also, I have a lineage in the OM UI, as shown in the picture. Here two columns (startdate and enddate) are used to populate one column (Totalworkingdays). But when I get the lineage through the API, fromColumns doesn't hold an array of both values. Instead, the response contains two separate blocks, as below.
{
"fromColumns": [
"AzureDatabricks_DataFabric.hive_metastore.humanresources.employee_department_history_silver.EndDate"
],
"toColumn": "AzureDatabricks_DataFabric.hive_metastore.openmetadata_poc.employee_shift_vw.TotalWorkingDays"
},
{
"fromColumns": [
"AzureDatabricks_DataFabric.hive_metastore.humanresources.employee_department_history_silver.StartDate"
],
"toColumn": "AzureDatabricks_DataFabric.hive_metastore.openmetadata_poc.employee_shift_vw.TotalWorkingDays"
},
@Sai7656 this is indeed a bug. Thank you for adding details. @ulixius9, in this case we should include both StartDate and EndDate in fromColumns and TotalWorkingDays in toColumn. We need to make sure the UI shows it correctly, with two lines starting from the upstream table and merging together before connecting to the destination table.
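To make the expected shape concrete, here is a minimal Python sketch that post-processes the API output above so that entries sharing a toColumn are merged into a single entry. The helper name merge_column_lineage is hypothetical and not part of OpenMetadata; it only illustrates the result described in this comment, not how the backend fix would be implemented.

```python
def merge_column_lineage(entries):
    """Merge entries that share a toColumn into one entry (hypothetical helper)."""
    merged = {}  # toColumn -> list of source columns, in first-seen order
    for entry in entries:
        sources = merged.setdefault(entry["toColumn"], [])
        for column in entry["fromColumns"]:
            if column not in sources:  # avoid duplicate source columns
                sources.append(column)
    return [{"fromColumns": cols, "toColumn": to} for to, cols in merged.items()]


PREFIX = "AzureDatabricks_DataFabric.hive_metastore"
SOURCE = PREFIX + ".humanresources.employee_department_history_silver"
TARGET = PREFIX + ".openmetadata_poc.employee_shift_vw.TotalWorkingDays"

# The two separate blocks currently returned by the API.
api_entries = [
    {"fromColumns": [SOURCE + ".EndDate"], "toColumn": TARGET},
    {"fromColumns": [SOURCE + ".StartDate"], "toColumn": TARGET},
]

merged_entries = merge_column_lineage(api_entries)
print(merged_entries)
```

With the two blocks from the report as input, the result is a single entry whose fromColumns lists both EndDate and StartDate.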
@Sai7656 and @ulixius9, we also have another case. Let's say table1 column1 is used in addition to table2 column2 to create table3 column3. In this case:
The edge between table1 and table3 has fromColumns table1.column1 and toColumn table3.column3.
The edge between table2 and table3 has fromColumns table2.column2 and toColumn table3.column3.
The UI should show the lines from table1 and table2 merging together before connecting to table3.
Ping me on Slack if this is not clear.
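The multi-table case can be sketched the same way. The Python below groups flat (source column, target column) pairs by table-to-table edge, so each edge carries its own column lineage array. The function names and the "table.column" naming are assumptions made for illustration, and the destination column is assumed to be table3.column3; this is not OpenMetadata code.

```python
def table_of(column_name):
    """Return the table part of a 'table.column' style name."""
    return column_name.rsplit(".", 1)[0]


def column_lineage_per_edge(mappings):
    """Group (source column, target column) pairs by (from table, to table) edge."""
    edges = {}  # (from_table, to_table) -> {toColumn: [fromColumns]}
    for source, target in mappings:
        edge = (table_of(source), table_of(target))
        by_target = edges.setdefault(edge, {})
        by_target.setdefault(target, []).append(source)
    return {
        edge: [{"fromColumns": cols, "toColumn": to} for to, cols in by_target.items()]
        for edge, by_target in edges.items()
    }


mappings = [
    ("table1.column1", "table3.column3"),
    ("table2.column2", "table3.column3"),
]

per_edge = column_lineage_per_edge(mappings)
for edge, lineage in per_edge.items():
    print(edge, lineage)
```

Each of the two edges ends up with one entry pointing at the same toColumn, which is what lets a UI draw the two lines merging before they reach table3.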
Affected module Does it impact the UI, backend or Ingestion Framework? - Backend
Describe the bug A clear and concise description of what the bug is. - The "fromColumns" section in the column lineage JSON is a list even when it holds a single value, and for many-to-one column mappings each source column still appears as a separate entry.
"fromColumns": [ "AzureDatabricks_DataFabric.hive_metastore.humanresources.employee_department_history_silver.EndDate" ]
To Reproduce - NA
Screenshots or steps to reproduce
Expected behavior A clear and concise description of what you expected to happen. - Since all column-level lineage entries are one-to-one, it would be better to make this field a string, just like "toColumn", and rename it from "fromColumns" to "fromColumn".
Version:
openmetadata-ingestion[docker]==XYZ: 1.0.0
Additional context Add any other context about the problem here.