open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

Lineage to ingest SQL Server SSIS packages #16296

Open Jigawatter opened 6 months ago

Jigawatter commented 6 months ago

Problem: We use SQL Server SSIS packages heavily to extract/transform/load data between multiple servers and sources. It's very difficult and frustrating to keep track of these packages and manage dependencies. Users often ask "What is the source of this data?" and it can be difficult for us to answer confidently without significant effort, due to these packages not being mapped. Additionally, some of these packages contain business logic that is difficult to access without opening each package individually.

OMD currently shows the lineage of database objects, but mapping across databases and servers via these packages is not possible.

Solution: I would like to see ingestion workflows for SQL Server SSIS that extend the existing lineage for SQL Server (and other sources/destinations), so that in the Lineage view for a data asset I can see that it feeds to or from an SSIS package and then on to another data asset elsewhere in our environment.

Alternatives: We've tried to manually document all our SSIS packages in a spreadsheet, but it was challenging to maintain and quickly drifted out of date. We are currently investigating what data can be extracted automatically via scripts from the SSIS XML files or the SSIS catalog database. Maybe we can then import that into OMD and manually update the lineage.
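
As a rough illustration of the script-based extraction mentioned above, here is a minimal Python sketch that lists the connection managers declared in a package's XML. It assumes the attribute-based `.dtsx` layout used by newer SSIS versions (older packages store these values differently), and the sample package, the `list_connections` helper, and the output field names are all made up for illustration. It only covers connection managers, not the data flow inside the package, so it is a starting point rather than a full lineage extractor.

```python
import xml.etree.ElementTree as ET

# XML namespace used by SSIS package (.dtsx) files.
DTS_NS = "www.microsoft.com/SqlServer/Dts"
NS = {"DTS": DTS_NS}

# Simplified, made-up package for illustration; real packages hold far more.
SAMPLE_DTSX = """
<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts"
    DTS:ObjectName="LoadSales" DTS:ExecutableType="Microsoft.Package">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:ObjectName="SourceDb" DTS:CreationName="OLEDB">
      <DTS:ObjectData>
        <DTS:ConnectionManager
            DTS:ConnectionString="Data Source=srv-a;Initial Catalog=Sales;" />
      </DTS:ObjectData>
    </DTS:ConnectionManager>
    <DTS:ConnectionManager DTS:ObjectName="TargetDw" DTS:CreationName="OLEDB">
      <DTS:ObjectData>
        <DTS:ConnectionManager
            DTS:ConnectionString="Data Source=srv-b;Initial Catalog=Warehouse;" />
      </DTS:ObjectData>
    </DTS:ConnectionManager>
  </DTS:ConnectionManagers>
</DTS:Executable>
""".strip()


def dts_attr(element, name):
    """Read a DTS-namespaced attribute, or None if it is absent."""
    return element.get(f"{{{DTS_NS}}}{name}")


def parse_connection_string(text):
    """Split 'Key=Value;Key=Value;' into a dict, skipping empty parts."""
    pairs = (part.split("=", 1) for part in text.split(";") if "=" in part)
    return {key.strip(): value.strip() for key, value in pairs}


def list_connections(dtsx_text):
    """Return one dict per connection manager declared in the package."""
    root = ET.fromstring(dtsx_text)
    package = dts_attr(root, "ObjectName")
    rows = []
    for cm in root.findall("DTS:ConnectionManagers/DTS:ConnectionManager", NS):
        inner = cm.find("DTS:ObjectData/DTS:ConnectionManager", NS)
        raw = dts_attr(inner, "ConnectionString") if inner is not None else None
        props = parse_connection_string(raw or "")
        rows.append({
            "package": package,
            "connection": dts_attr(cm, "ObjectName"),
            "type": dts_attr(cm, "CreationName"),
            "server": props.get("Data Source"),
            "database": props.get("Initial Catalog"),
        })
    return rows


if __name__ == "__main__":
    for row in list_connections(SAMPLE_DTSX):
        print(row)
```

Rows like these could be written to a CSV per package and then matched by hand against the tables already ingested into OMD.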

Mahdi-Seeker commented 4 months ago

Any progress on this issue?