prometheus / statsd_exporter

StatsD to Prometheus metrics exporter
Apache License 2.0

af_agg_ti_start and af_agg_ti_finish are missing #486

Closed simoncao2022 closed 1 year ago

simoncao2022 commented 1 year ago

We can see 'af_agg_ti_failures' in the Prometheus query results, but 'af_agg_ti_start' and 'af_agg_ti_finish' are missing. We're using the standard Airflow StatsD metric mapping settings. Our setup is the prom/statsd-exporter:v0.22.5 image, started with:

/bin/statsd_exporter --log.level info --statsd.mapping-config /home/statsd-mapping-configs/statsd.yml --statsd.listen-udp=:9125 --web.listen-address=:9082

statsd.yml looks like this:

mappings:
  - match: "*.ti_failures"
    match_metric_type: counter
    name: "af_agg_ti_failures"
    labels:
      airflow_id: "$1"
  - match: "*.ti_successes"
    match_metric_type: counter
    name: "af_agg_ti_successes"
    labels:
      airflow_id: "$1"
  - match: "*.ti.start.*.*"
    match_metric_type: counter
    name: "af_agg_ti_start"
    labels:
      airflow_id: "$1"
      dag_id: "$2"
      task_id: "$3"
  - match: "*.ti.finish.*.*.*"
    match_metric_type: counter
    name: "af_agg_ti_finish"
    labels:
      airflow_id: "$1"
      dag_id: "$2"
      task_id: "$3"
      state: "$4"
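
For readers following along, here is a rough Python sketch of how glob rules like the ones above turn a StatsD metric name into labels. It is not the exporter's actual matcher: it assumes each `*` captures exactly one non-empty dot-separated component, and the helper names (`glob_to_regex`, `match_labels`) and the sample metric names are made up for illustration.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a dot-separated glob into an anchored regex.

    Assumption: each '*' captures one non-empty component with no dots.
    """
    parts = pattern.split(".")
    regex_parts = ["([^.]+)" if p == "*" else re.escape(p) for p in parts]
    return re.compile("^" + r"\.".join(regex_parts) + "$")


def match_labels(pattern: str, label_templates: dict, metric_name: str):
    """Return the rendered labels if metric_name matches pattern, else None."""
    m = glob_to_regex(pattern).match(metric_name)
    if m is None:
        return None
    labels = {}
    for key, template in label_templates.items():
        # Replace $1, $2, ... with the corresponding captured component.
        labels[key] = re.sub(
            r"\$(\d+)", lambda ref: m.group(int(ref.group(1))), template
        )
    return labels


if __name__ == "__main__":
    start_labels = {"airflow_id": "$1", "dag_id": "$2", "task_id": "$3"}
    # Hypothetical metric names; real ones depend on the Airflow setup.
    print(match_labels("*.ti.start.*.*", start_labels,
                       "airflow.ti.start.my_dag.my_task"))
    print(match_labels("*.ti.start.*.*", start_labels,
                       "airflow.ti.start.my_dag.group.my_task"))
```

Under this assumption, a name with more dot-separated components than the rule expects (the second call) does not match at all, which is one reason to compare the rules against the raw metric names actually being sent.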

matthiasr commented 1 year ago

What is the statsd event type for the start and finish metrics? Please run the exporter with the debug log level, so that it reports the raw payloads it received, and post examples of the raw events here. Without these, it is a guessing game to help you debug this.
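
One way to produce a known raw event while watching the debug log is to send a hand-built counter over UDP. The sketch below is only an illustration: the metric name and the `airflow` prefix are assumptions, the helper names are hypothetical, and the host must be adjusted to wherever the exporter is reachable. Port 9125 is taken from the `--statsd.listen-udp` flag in the report above.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a plain StatsD counter line: <name>:<value>|c"""
    return f"{name}:{value}|c".encode("ascii")


def send_statsd(payload: bytes, host: str = "127.0.0.1", port: int = 9125) -> int:
    """Send one datagram to the exporter; returns the number of bytes sent.

    UDP is fire-and-forget, so a successful send does not prove the
    exporter received the event. Check the exporter's log or /metrics.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical metric name shaped to fit the "*.ti.start.*.*" rule.
    send_statsd(format_counter("airflow.ti.start.example_dag.example_task"))
```

If the test event shows up on the exporter's metrics endpoint but the real ones do not, the problem is more likely on the sending side (metric names, event types, or connectivity from the workers) than in the mapping file.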

andscoop commented 1 year ago

Without knowing more, my first guess is that this is a configuration issue on the workers of your Airflow cluster. You'll want to ensure that they have the same StatsD configuration as the scheduler pods. Another helpful debugging step would be to test connectivity to the UDP port from a worker with nc in UDP mode (nc -u); telnet only speaks TCP, so it cannot probe a UDP listener.

matthiasr commented 1 year ago

Closing this due to lack of information – please feel free to reopen it with the debug logs!