apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
36.16k stars 14.04k forks source link

Simplify metrics handling #38655

Open ferruzzi opened 5 months ago

ferruzzi commented 5 months ago

What happened?

When OTel support was added, it enabled adding tags to metrics. Metrics such as ti.start.<dag_id>.<task_id> became ti.start with the tags for dag_id and task_id. Statsd does not support tagging, so all metrics had to be emitted twice; once with tagging and once assembled as a single long name. Over time, we've forgotten this and some newer metrics have been added which only emit one way or the other.

What you think should happen instead?

I propose that airflow.metrics.statd_logger should be modified to assemble the name and tags into a long name, then any and all double-emitted metrics can be consolidated and future metrics can be easier and more consistently implemented.

How to reproduce

Example of some "double-emitted" metrics: image

in this case, statsd.incr() accepts the tags but does nothing with them. If it did a ("_").join(name, tags) before emitting, then we could drop the redundant metric and simplify future additions.

Are you willing to submit PR?

Code of Conduct

potiuk commented 5 months ago

Yes. Good idea. Let's do it.

ferruzzi commented 5 months ago

Not sure it really qualifies as a bug, but it didn't really feel like a "feature request" either. Feel free to adjust my tagging.

I've added "good first issue" because this ins one of those "easy to implement but time consuming" ones IMHO.

If anyone wants to take it on and needs some guidance, hit me up on Slack

Jack-R-lantern commented 5 months ago

please assign me

ferruzzi commented 5 months ago

I think the easiest answer would be to convert the existing metrics tags dicts to ordereddicts, then in the statsd_logger module you can join them and know they'll always be the same order.

Ibrahim-Mukherjee commented 5 months ago

Hi I would be willing to work on this.

Jack-R-lantern commented 4 months ago

@ferruzzi Hi. I came across this article while troubleshooting an issue. It looks like it's okay to combine metric names using dict instead of using ordereddict, what do you think?

ferruzzi commented 4 months ago

hmmmmmmm. yeah, I think maybe you are right. As of 3.8, dicts are guaranteed to retain the order they were inserted and we require 3.8+ for Airflow so.... yeah, that should work and make things easier. Sorry about that.