prometheus / statsd_exporter

StatsD to Prometheus metrics exporter
Apache License 2.0
921 stars 231 forks source link

[Feature Request] Support DogStatsD in Extended Aggregation mode #557

Closed GrgDev closed 2 months ago

GrgDev commented 5 months ago

Problem

This tool has worked almost flawlessly for me. However, when using it to try to move one of my tenants from statsd to prometheus, we noticed that the timing metrics were all showing a flat line series when they shouldn't be. Long story short, we found out it was because they were using the extended aggregation config option to reduce network load. See the docs on this mode here.

In case that link gets broken later, here's the TL;DR description. Instead of sending:

my_distribution_metric:21|d|#all,my,tags
my_distribution_metric:43.2|d|#all,my,tags
my_distribution_metric:1657|d|#all,my,tags

it sends

my_distribution_metric:21:43.2:1657|d|#all,my,tags

This only gets applied to distributions, histograms, and timers. Counts and gauges still act as expected. It would be great if the exporter could handle this.

Proposed Possible Solution

I'm a novice in Golang so take my suggestion with a grain of salt.

The LineToEvents function seems to be called on three different paths with all three being packet handling functions.

The HandleConn function uses a for loop with Readline instead of a split on newlines, so this approach couldn't be copied there. That being said, that function looks like it is for handling TCP connections which I don't think DogStatsD ever uses. That function might not need to be changed at all.

I originally thought that maybe this code change could be done inside of LineToEvents so the change would only need to be made in one place, but I think the logic there assumes that the one line passed in will stay as one line instead of being expanded.

Anyways, y'all know this code way better than me and I don't know if this change would cause side effects to the existing behavior. Let me know what you think, and thanks again for maintaining this.

glightfoot commented 5 months ago

I think the correct place to handle this would be in LineToEvents. Technically that function does convert a single line into multiple events in the case of sampled metrics. It would need to be refactored to support multiple values, but that's not terrible. I might be able to take a stab at this in a few days if work doesn't get too busy.

GrgDev commented 5 months ago

Thank you fellow Greg

GrgDev commented 5 months ago

I went ahead and took a stab at implementing this in PR #558 . As noted in the PR description, there's still the outstanding question should statsd exporter should be the enforcer of the expectation that the extended aggregate metrics feature will only ever be applied to distributions, histograms, and timers while counts and gauges remain unchanged?

matthiasr commented 2 months ago

This is released now!