Open tatiana opened 1 month ago
A few questions:
`dag_id` and `task_group_id` into the metric names, when appropriate?
`task_group_id` with `self` or `dag` or something like that, so the metric naming is a little more consistent.
`cosmos.load_method_custom.duration`, `cosmos.load_method_dbt_ls.duration`, etc. but would it make sense to do something more like `cosmos.load_graph.duration` or `cosmos.graph.{dag_id}.{task_group_id}.duration`? My thinking is:
`DbtGraph` with a custom load method.

Hey, @dwreeves, these are very valid points.
I'm improving the logs on a per-DAG/TaskGroup basis as part of #1014 (e.g., https://github.com/astronomer/astronomer-cosmos/pull/1014/files#diff-61b585fb903927b6868b9626c95e0ec47e3818eb477d795ebd13b0276d4fd76cR293). This will probably be switched to DEBUG and improved further, but it would help address the granularity you suggested. I'll probably create a PR only for this :)
The goal of the metrics proposed in this PR is to have a "group" that gives an overview of the health of these numbers across multiple DAGs and helps spot whether any of these metrics look more troublesome than others. WDYT?
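As a rough illustration of the two naming options discussed above, the sketch below builds either an aggregate metric name or one scoped by `dag_id` and `task_group_id`. The helper `build_metric_name` is hypothetical and not part of Cosmos; only the name templates come from the comments above.

```python
from typing import Optional


def build_metric_name(
    base: str,
    dag_id: Optional[str] = None,
    task_group_id: Optional[str] = None,
) -> str:
    """Return a dotted metric name, optionally scoped by DAG and TaskGroup.

    Hypothetical helper for illustration only; not part of Cosmos.
    """
    parts = ["cosmos", base]
    # Only add the scope segments that are actually known.
    if dag_id:
        parts.append(dag_id)
    if task_group_id:
        parts.append(task_group_id)
    parts.append("duration")
    return ".".join(parts)


# Aggregate name, shared by every DAG.
aggregate_name = build_metric_name("load_graph")
# Scoped name, one series per DAG/TaskGroup pair.
scoped_name = build_metric_name("graph", dag_id="my_dag", task_group_id="my_group")
```

Emitting both names for the same measurement would keep the cross-DAG overview while still allowing a drill-down per DAG, at the cost of more metric series.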
Context
We want more visibility into how much time Cosmos spends parsing the dbt project and building the Airflow DAG.
We'd like to leverage Airflow's metrics collection system by using:
To collect the following metrics:
- `cosmos.load_method_custom.duration`: time taken to run `DbtGraph.load_via_custom_parser`
- `cosmos.load_method_dbt_ls.duration`: time taken to run `DbtGraph.load_via_dbt_ls`
- `cosmos.load_method_dbt_ls_file.duration`: time taken to run `DbtGraph.load_via_dbt_ls_file`
- `cosmos.load_method_manifest.duration`: time taken to run `DbtGraph.load_from_dbt_manifest`
- `cosmos.convert_to_airflow.duration`: time taken to run `build_airflow_graph`
- `cosmos.dag_init.duration`: time taken to initialise the Airflow DAG
- `cosmos.dag_new.duration`: time taken to create the Airflow DAG
- `cosmos.task_group_init.duration`: time taken to initialise the Airflow TaskGroup (`__init__`)
- `cosmos.task_group_new.duration`: time taken to create the Airflow TaskGroup (`__new__`)

Relevant parts of the code:
https://github.com/astronomer/astronomer-cosmos/blob/cda2a5058bb3c95f1c2e1b9a5352f8ceb7b22f6a/cosmos/dbt/graph.py#L168-L171
https://github.com/astronomer/astronomer-cosmos/blob/cda2a5058bb3c95f1c2e1b9a5352f8ceb7b22f6a/cosmos/airflow/graph.py#L215
https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/airflow/dag.py
https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/airflow/task_group.py
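A minimal sketch of how a duration metric such as `cosmos.load_method_dbt_ls.duration` could be recorded around one of the calls listed above. `RecordingStats`, `timed` and the placeholder `load_via_dbt_ls` function are hypothetical stand-ins so the example runs without Airflow; in Cosmos the measurement would presumably be handed to Airflow's stats client rather than kept in memory.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


class RecordingStats:
    """Hypothetical in-memory stand-in for a metrics client."""

    def __init__(self) -> None:
        self.timings: Dict[str, List[float]] = {}

    def timing(self, name: str, millis: float) -> None:
        # Keep every observation so repeated calls can be inspected.
        self.timings.setdefault(name, []).append(millis)


@contextmanager
def timed(stats: RecordingStats, name: str) -> Iterator[None]:
    """Record the wall-clock duration of the wrapped block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Emit the metric even if the wrapped block raises.
        stats.timing(name, (time.perf_counter() - start) * 1000.0)


def load_via_dbt_ls() -> None:
    """Placeholder for the real graph-loading work."""
    time.sleep(0.01)


stats = RecordingStats()
with timed(stats, "cosmos.load_method_dbt_ls.duration"):
    load_via_dbt_ls()
```

Wrapping each load method in the same context manager keeps the timing logic in one place, so adding a new metric only requires choosing a name.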
Acceptance criteria