Snowflake-Labs / OpenLineage-AccessHistory-Setup

Guideline to extract table lineage info in OpenLineage format from access history view
Apache License 2.0

Airflow example that runs Snowflake queries and publishes OpenLineage events. #2

Closed fm100 closed 2 years ago

fm100 commented 2 years ago

Description

This Airflow example uses SnowflakeOperator to run queries, then publishes the OpenLineage events, which are auto-generated in the _OPEN_LINEAGE_ACCESSHISTORY view, to the configured OpenLineage backend.
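For reviewers, here is a rough sketch of the publishing step only. It is not the code in this PR. It assumes each row read from the view is already a complete OpenLineage event (a JSON string or a dict) and that the backend accepts events over HTTP at /api/v1/lineage, the path used by Marquez-style backends. The names publish_events and post_event are hypothetical.

```python
import json
import urllib.request
from typing import Callable, Iterable, Union

Event = Union[str, dict]


def post_event(url: str, event: dict) -> None:
    """POST one OpenLineage event as JSON to the backend."""
    req = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        resp.read()


def publish_events(
    rows: Iterable[Event],
    backend_url: str,
    send: Callable[[str, dict], None] = post_event,
) -> int:
    """Send every event row to the backend and return how many were sent.

    Assumption: each row is one event taken from the access history view,
    either as a JSON string or as an already-parsed dict.
    """
    # Assumed endpoint path; adjust for the backend actually configured.
    endpoint = backend_url.rstrip("/") + "/api/v1/lineage"
    count = 0
    for row in rows:
        event = json.loads(row) if isinstance(row, str) else row
        send(endpoint, event)
        count += 1
    return count
```

In a DAG, something like this could run in a Python task scheduled after the SnowflakeOperator tasks, with the rows fetched from the view by a separate query. Passing send as a parameter keeps the function testable without a live backend.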

rossturk commented 2 years ago

Looks good, @fm100. I left some comments; please see if they make sense. One other thing: instead of a single directory "dags", should we create two separate directories: a) dag_queries, containing all the files that generate queries for the DAG, and b) dag_extract, containing etl_openlineage.py? WDYT?

I agree - I chose to name them etl and lineage. My rationale is that they are already in a directory called dags, and the word extract applies to both metadata and data. Let me know if you like these names.