Open NBardelot opened 1 month ago
If someone picks this up, just keep in mind we can't just add the mapping naively (see this warning).
Maybe we can give users a way to opt in to a set of mappings that are kept up to date? But the way this was originally built severely limits what we can do without introducing breaking changes :(
Apache Airflow version
2.9.1
If "Other Airflow 2 version" selected, which one?
No response
What happened?
Some metrics using tags (`file_path`, `dag_id`, `task_id` essentially) are not correctly mapped in the Helm chart (see `chart/files/statsd-mappings.yml`). This is probably linked to a feature in Airflow v2.6 that avoids creating a new metric for each new DAG/task/file and uses tags under common metrics instead.
Yet I've stumbled upon `airflow_dag_processing_last_duration` having no label in my Prometheus, and found it was not mapped. I've added a mapping for it as a workaround for the moment.
What you think should happen instead?
Every metric being logged using tags should be mapped in `chart/files/statsd-mappings.yml` in order for labels to be applied by the statsd-exporter.
As of Airflow 2.9.1 this is a list of calls to the Stats class that I think are using tags but missing a mapping:
- `dag_processing.processes` with labels `dag_file: "$1"`
- `dag_processing.last_duration` with labels `dag_file: "$1"`
- `dag_processing.processor_timeouts` with labels `dag_file: "$1"`
- `sla_missed` with labels `dag_id: "$1"`, `task_id: "$2"`
- `sla_email_notification_failure` with labels `dag_id: "$1"`, `task_id: "$2"`
- `dag_file_refresh_error` with labels `dag_file: "$1"`
- `pool.queued_slots` with labels `pool: "$1"`
- `pool.running_slots` with labels `pool: "$1"`
- `pool.deferred_slots` with labels `pool: "$1"`
- `zombies_killed` with labels `dag_id: "$1"`, `task_id: "$2"`
- `dag.callback_exceptions` with labels `dag_id: "$1"`
- `task_restored_to_dag` with labels `dag_id: "$1"`, `task_id: "$2"`
- `task_removed_from_dag` with labels `dag_id: "$1"`, `task_id: "$2"`
- `task_instance_created` with labels `dag_id: "$1"`, `task_id: "$2"`
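To make the shape of such a mapping concrete, here is a minimal sketch of what one entry from the list above might look like in statsd-exporter mapping syntax. It is not the chart's actual content: the `airflow.` prefix, the glob pattern, and the assumption that the per-file metric arrives as `airflow.dag_processing.last_duration.<dag_file>` are inferred from the `airflow_dag_processing_last_duration` and `dag_processing_last_duration_{DAG_id}` metrics mentioned in this issue, and would need to be checked against what Airflow really emits.

```yaml
# Hypothetical addition to chart/files/statsd-mappings.yml (unverified sketch).
mappings:
  # Assumes the metric is emitted as airflow.dag_processing.last_duration.<dag_file>.
  # In glob matching, each "*" captures one dot-separated component of the name.
  - match: airflow.dag_processing.last_duration.*
    # All per-file series are folded into this single Prometheus metric name.
    name: "airflow_dag_processing_last_duration"
    labels:
      # "$1" refers to the first "*" capture in the match pattern.
      dag_file: "$1"
```

Because a glob `*` only matches a single dot-separated component, a pattern like this would not match names whose file part itself contains dots. Entries with two labels would follow the same idea with two captures (`"$1"` and `"$2"`), provided the emitted metric name really carries both values.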
Note: as this is the result of a quick `grep`, this list might be incomplete and I might have misunderstood the behaviour of some of the metrics. Whoever provides a fix should not take it as absolute truth.
How to reproduce
1. `curl` the statsd exporter endpoint's `/metrics` from a nearby pod
2. `dag_processing_last_duration` and `dag_processing_last_duration_{DAG_id}` metrics both exist
3. `dag_processing_last_duration` lacks the `dag_file` label
Operating System
Kubernetes
Versions of Apache Airflow Providers
The 'statsd' requirements are installed using the official Apache constraints for Python 3.10 and Airflow 2.9.1.
Deployment
Official Apache Airflow Helm Chart
Deployment details
No `.Values.statsd.overrideMappings` (see `chart/templates/configmaps/statsd-configmap.yaml`), we use the standard out-of-the-box mappings.
Anything else?
No response
Are you willing to submit PR?
Code of Conduct