Context

At the moment, users don't have visibility into which DAGs were created using Python directly and which were created using DAG Factory. The goal of this ticket is to identify a way to collect that information.
Some options we brainstormed:
- Generate events during the DAG Factory conversion using https://github.com/apache/airflow/pull/39510 (this would only work for newer versions of Airflow, or we'd need DAG Factory to use this tooling to emit them).
- An Astro-specific solution: if Astro exposed a parameter to access the next layer, we could push data to the Snowflake metadata DB. Then, probably from the DAG Factory code, we push some data (e.g. deployment data, DAG source such as DAG Factory).
- Associate metadata / DAG Factory provenance with either DAG runs or task runs (check whether we could leverage https://github.com/apache/airflow/pull/38650 in any way); see the sketch after this list. In this case, we'd need to confirm with the Data team whether this data is already stored in Snowflake, or whether it could be stored there.
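As a starting point for the PoC, below is a minimal sketch of the general idea of attaching provenance to the generated DAGs themselves, so it ends up in the Airflow metadata database (the dag_tag table) and could later be surfaced to Astro / Snowflake. The `tags` attribute is a real Airflow DAG parameter and the DAG Factory calls follow its documented usage pattern, but the config path, the `mark_dag_factory_dags` helper, and the `dag-factory` tag name are illustrative assumptions, not existing features of DAG Factory or Astro.

```python
# Illustrative sketch only: `tags` is a real Airflow DAG attribute, but the
# post-processing helper and the "dag-factory" tag below are hypothetical,
# not part of the current DAG Factory API.
import dagfactory
from airflow.models.dag import DAG

CONFIG_FILE = "/usr/local/airflow/dags/example_dag_factory.yml"  # example path

dag_factory = dagfactory.DagFactory(CONFIG_FILE)
dag_factory.clean_dags(globals())
dag_factory.generate_dags(globals())


def mark_dag_factory_dags(namespace: dict) -> None:
    """Tag every DAG Factory-generated DAG so its provenance is queryable
    from the Airflow metadata DB (dag_tag table) and, downstream, from
    whatever Astro replicates into Snowflake."""
    for obj in namespace.values():
        if isinstance(obj, DAG):
            tags = set(obj.tags or [])
            tags.add("dag-factory")
            obj.tags = list(tags)


mark_dag_factory_dags(globals())
```

If something along these lines works, it would need close to zero end-user configuration beyond the DAG file they already maintain, which is why it is listed here as a candidate for the PoC rather than a decided approach.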
This starts as a PoC: the idea is to identify strategies and find an approach that requires the least change and configuration from end users, so that we can collect (at least on Astro) information about which DAGs were created with DAG Factory.
Acceptance criteria
[ ] Summary of approaches attempted
[ ] Working PoC sharing this data in a way we can consume from Astro SF