crate / crate-clients-tools

Clients, tools, and integrations for CrateDB.
https://crate.io/docs/clients/
Apache License 2.0

Marquez: The complete OpenLineage solution #116

Open amotl opened 1 month ago

amotl commented 1 month ago

About

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata. It maintains the provenance of how datasets are consumed and produced, provides global visibility into job runtime and frequency of dataset access, centralization of dataset lifecycle management, and much more. Marquez was released and open sourced by WeWork.

Highlights

One Source of Truth

Marquez enables consuming, storing, and visualizing OpenLineage metadata from across an organization, serving use cases including data governance, data quality monitoring, and performance analytics.

One Service for Lineage

Resources

amotl commented 1 month ago

Is it true that you may have worked with that application a while ago, @hlcianfagna? If you can recall the outcome, do you remember if it is suitable to be used together with CrateDB?

/cc @wierdvanderhaar

hlcianfagna commented 1 month ago

It is true, yes, and it was working fine with CrateDB, but we never got around to publishing an article on it. Let me share what I have with you by private message.

amotl commented 1 month ago

Wonderful. Thanks for your reply. Maybe @matkuliak could slot it in, to turn your findings into a corresponding tutorial/documentation, in the same spirit as, or after finishing, https://github.com/crate/cratedb-guide/pull/75?

amotl commented 1 month ago

Hi. I converted a document shared by @hlcianfagna (thanks!) into Markdown format, and added it as an article on the community forum, still in "unlisted" mode. It will need verification, possible updates, and a few formatting improvements. If someone wants to pick up from there...?

-- OpenLineage with Airflow, Marquez, and CrateDB