apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Remove StatsD and replace it with Open Telemetry as first-class citizen #40800

Open kaxil opened 1 month ago

kaxil commented 1 month ago

Currently, Apache Airflow uses StatsD for metrics collection and monitoring. To modernize our observability stack and align with industry standards, we should remove StatsD and adopt OpenTelemetry as the primary metrics collection and monitoring tool for Airflow 3. This is made easier with AIP-49 Open Telemetry Support.

Backward Compatibility / Migration

Most enterprise Airflow deployments rely on StatsD for monitoring and alerting, so we should make sure there is a smooth migration path. Ideally, all the StatsD metrics listed in https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/metrics.html#metric-descriptions should have a like-for-like replacement.
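One way to track the like-for-like goal is a simple parity check between the metric names each backend emits. The snippet below is only a sketch: `missing_replacements` is a hypothetical helper, and the two name lists are hand-written sample data for illustration (the names are taken from the metrics page linked above), not output from a real deployment.

```python
def missing_replacements(statsd_names, otel_names):
    """Return StatsD metric names that have no same-named OpenTelemetry metric."""
    return sorted(set(statsd_names) - set(otel_names))


# Illustrative sample data only; a real check would collect these from each backend.
statsd_metrics = ["scheduler_heartbeat", "ti_failures", "ti_successes", "dagbag_size"]
otel_metrics = ["scheduler_heartbeat", "ti_failures", "dagbag_size"]

gaps = missing_replacements(statsd_metrics, otel_metrics)
print(gaps)
```

An empty result would suggest every sampled StatsD metric has a counterpart. A real comparison would probably also need to account for naming differences between the two backends, which this sketch ignores.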

High-level Objectives:

  1. Remove StatsD Integration
  2. Use OpenTelemetry as the default metrics collection tool. This includes setting up configurations, dependencies, and code integration.
  3. Documentation Update: Update the official documentation to reflect the change from StatsD to OpenTelemetry, including setup, configuration, and usage instructions. Include a migration guide for users transitioning from StatsD to OpenTelemetry if there are breaking changes.
  4. Backward Compatibility: Ensure that existing workflows dependent on StatsD either transition seamlessly to OpenTelemetry or are covered by clear migration guidelines.
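
As a rough illustration of what the configuration side of a migration could look like today, the sketch below builds the environment variables that turn StatsD off and OpenTelemetry metrics on. It assumes the `[metrics]` options from the existing OpenTelemetry support (`statsd_on`, `otel_on`, `otel_host`, `otel_port`, `otel_prefix`) and the usual `AIRFLOW__SECTION__KEY` environment variable convention. The `otel_metrics_env` helper and its default values are hypothetical placeholders, and the final Airflow 3 settings may differ.

```python
def otel_metrics_env(host="localhost", port=8889, prefix="airflow"):
    """Build env vars that switch Airflow metrics from StatsD to OpenTelemetry.

    Assumption: option names follow the current [metrics] config section;
    host, port, and prefix here are placeholder values for a local collector.
    """
    return {
        "AIRFLOW__METRICS__STATSD_ON": "False",
        "AIRFLOW__METRICS__OTEL_ON": "True",
        "AIRFLOW__METRICS__OTEL_HOST": host,
        "AIRFLOW__METRICS__OTEL_PORT": str(port),
        "AIRFLOW__METRICS__OTEL_PREFIX": prefix,
    }


env = otel_metrics_env(host="otel-collector")
for key, value in sorted(env.items()):
    print(f"{key}={value}")
```

The printed lines could be pasted into a deployment's environment file. The collector host name used here is made up for the example.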

josix commented 1 month ago

Hi @kaxil, I'm interested in this issue. May I give it a try? Thanks!

kaxil commented 1 month ago

Awesome, assigned it to you

dirrao commented 1 month ago

Hi @josix Let me know if you need any help.

raphaelauv commented 1 month ago

Also, providing a simple Grafana dashboard in the example stack would help a lot (because the main open-source Airflow Grafana dashboard is based on StatsD).

kaxil commented 1 month ago

@howardyoo is also interested in working on this. He led AIP-49, so @josix, if you need help, please let Howard know.

@howardyoo is equally interested in leading this effort.

josix commented 1 month ago

Yeah, I just checked out the AIP and the material shared at the Airflow Summit recently, and I am currently studying the codebase around StatsD and OTel. I believe it would be a better choice for @howardyoo to lead this topic 🙂 Please feel free to unassign me and assign sub-items to me if possible. Thanks for the coordination.

dirrao commented 1 month ago

I am interested in this too. Please assign me any tasks that need to be done. I am working along similar lines in this PR: https://github.com/apache/airflow/pull/39908

kaxil commented 1 month ago

@howardyoo Could you add a comment, please? GitHub doesn't allow me to assign you an issue until you have commented on it since you aren't part of the Apache org

howardyoo commented 1 month ago

> @howardyoo Could you add a comment, please? GitHub doesn't allow me to assign you an issue until you have commented on it since you aren't part of the Apache org

Ok, done!

kaxil commented 1 month ago

Awesome, assigned it to you.

howardyoo commented 2 days ago

Oh, I guess the above-mentioned draft PR got released, which is nice, but it looks like it may be on hold for now.