open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

Enhanced Glue ingestion with external table features #18511

Closed: trina242 closed this 1 week ago

trina242 commented 1 week ago

Describe your changes:

Added file format, location path and external table lineage to GlueSource.

The AWS Glue connector exposes far less metadata than you can find elsewhere, e.g. in the AWS console. Some of the more useful features, such as lineage, are available in the Athena connector; however, Glue tables can also be queried by other engines, such as Trino. Athena is not a popular choice for companies holding huge amounts of data because of its cost. Fetching storage metadata through Trino is difficult, so adding it to the Glue connector instead is a quick win.
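
As a rough illustration of the storage metadata described above, the sketch below pulls the file format and location path out of a Glue table definition and splits the location into a bucket and prefix that an external-table lineage edge could point at. It is not the code from this PR. It assumes the input is the `Table` dictionary returned by the Glue `GetTable` API (fields `TableType`, `StorageDescriptor.Location`, `StorageDescriptor.InputFormat`); the function name `describe_external_table`, the `INPUT_FORMAT_LABELS` mapping, and the sample bucket and table names are hypothetical.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

# Hypothetical mapping from Hive input-format class names to short labels.
# Unknown formats fall back to the raw class name.
INPUT_FORMAT_LABELS = {
    "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat": "parquet",
    "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat": "orc",
    "org.apache.hadoop.mapred.TextInputFormat": "text",
}


def describe_external_table(table: Dict[str, Any]) -> Optional[Dict[str, Optional[str]]]:
    """Return storage metadata for an external Glue table, or None otherwise.

    `table` is assumed to be shaped like the `Table` entry of a Glue
    GetTable response.
    """
    if table.get("TableType") != "EXTERNAL_TABLE":
        return None

    storage = table.get("StorageDescriptor") or {}
    location = storage.get("Location")
    input_format = storage.get("InputFormat")

    # Split e.g. s3://bucket/path/ into bucket and key prefix.
    parsed = urlparse(location) if location else None

    return {
        "table": table.get("Name"),
        "file_format": INPUT_FORMAT_LABELS.get(input_format, input_format),
        "location": location,
        "bucket": parsed.netloc if parsed else None,
        "prefix": parsed.path.lstrip("/") if parsed else None,
    }


# Sample input with made-up names, shaped like a Glue table definition.
sample_table = {
    "Name": "orders",
    "TableType": "EXTERNAL_TABLE",
    "StorageDescriptor": {
        "Location": "s3://example-bucket/warehouse/orders/",
        "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    },
}

info = describe_external_table(sample_table)
print(info)
```

In a real ingestion run the dictionary would come from a Glue client rather than a literal, and the bucket and prefix would need to be matched against a storage-service entity already known to OpenMetadata before a lineage edge could be created; that matching step is left out here.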

Changes summary:


Type of change:


Checklist:

github-actions[bot] commented 1 week ago

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

sonarcloud[bot] commented 1 week ago

Quality Gate passed for 'open-metadata-ingestion'

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
89.7% Coverage on New Code
0.0% Duplication on New Code
