prometheus / statsd_exporter

StatsD to Prometheus metrics exporter
Apache License 2.0

stackstorm mapping - match_metric_type not working right? #544

Closed Bubba210 closed 6 months ago

Bubba210 commented 6 months ago

I'm trying to get my StackStorm metrics into Prometheus using this exporter. The exposed metrics include both counters and timers with the same metric name (https://docs.stackstorm.com/reference/metrics.html#exposed-metrics), so I want to append the type to each metric name. I wrote a mapping file with the following:

  - match: "(.+)"
    match_type: regex
    match_metric_type: observer
    name: "${1}_seconds"

  - match: "(.+)"
    match_type: regex
    name: "${1}_counter"
    match_metric_type: counter

  - match: "(.+)"
    match_type: regex
    name: "${1}_gauge"
    match_metric_type: gauge

This works as expected, but ST2 has a "fun" way of exposing more detailed metrics:

st2.action.\<action ref\>.executions | counter 
st2.action.\<action ref\>.executions | timer

where \<action ref\> contains two useful pieces of information that I want to include as labels, so I tried:

  - match: "st2\\.action\\.(\\w+)\\.(\\w+)\\.executions(.*)"
    match_type: regex
    name: "st2_action_executions${3}"
    labels:
      st2_library: "${1}"
      st2_action: "${2}"

This did exactly what I wanted, but I lost the timer metrics entirely and started getting "Failed to update metric ... is already registered" errors.

So I thought: OK, I'll break these up and leverage match_metric_type to specify the difference, placed before the catch-alls at the bottom of the mapping file:

  - match: "st2\\.action\\.(\\w+)\\.(\\w+)\\.executions(.*)"
    match_type: regex
    metric_match_type: observer
    name: "st2_action_executions${3}_seconds"
    labels:
      st2_library: "${1}"
      st2_action: "${2}"

  - match: "st2\\.action\\.(\\w+)\\.(\\w+)\\.executions(.*)"
    match_type: regex
    metric_match_type: counter
    name: "st2_action_executions${3}_counter"
    labels:
      st2_library: "${1}"
      st2_action: "${2}"

But even with this, no luck: more "already registered" errors. So it seems like match_metric_type doesn't always work right?

This is not quite the same as #355, because I can add the type to the metric name. What I can't seem to do is both add the type to the metric name and create labels based on the metric name.

If I submit

st2.action.fubar.pson_to_json.executions.scheduled.something:3.30400000|ms
st2.action.fubar.pson_to_json.executions.scheduled.something:1|c

I hope to see:

st2_action_executions_scheduled_something_seconds{st2_action="pson_to_json",st2_library="fubar"} 0.003304....
st2_action_executions_scheduled_something_counter{st2_action="pson_to_json",st2_library="fubar"} 1

matthiasr commented 6 months ago

I think the problem in your last configuration example is that you used metric_match_type, which is an unknown field and is silently ignored. It should be match_metric_type.
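To see why the misspelled key leads to "already registered" errors, here is a rough Python sketch of first-match rule resolution. It is not the exporter's code: `parse_line`, `expand`, and `resolve` are hypothetical helpers, only the `c`, `g`, and `ms` type suffixes are handled, and the name clean-up is a simplification. With the type filter missing, the timer and the counter both resolve to the same `_seconds` name. With it in place, they get separate names.

```python
import re

# Assumption: only these three StatsD type suffixes are handled in this sketch.
STATSD_TYPES = {"c": "counter", "g": "gauge", "ms": "observer"}


def parse_line(line):
    """Split 'name:value|type' into (name, value, metric type)."""
    name, rest = line.split(":", 1)
    value, suffix = rest.split("|")[:2]
    return name, float(value), STATSD_TYPES[suffix]


def expand(template, match):
    """Replace ${N} in the template with capture group N of the match."""
    return re.sub(r"\$\{(\d+)\}", lambda m: match.group(int(m.group(1))), template)


def resolve(rules, line):
    """Return (name, labels, type) from the first matching rule.

    Returns None if the metric is dropped or (in this sketch) matches no rule.
    """
    name, _value, metric_type = parse_line(line)
    for rule in rules:
        wanted = rule.get("match_metric_type")
        if wanted is not None and wanted != metric_type:
            continue  # type filter present and it does not match
        m = re.search(rule["match"], name)
        if m is None:
            continue
        if rule.get("action") == "drop":
            return None
        # Simplified clean-up: anything outside [a-zA-Z0-9_] becomes "_".
        prom_name = re.sub(r"[^a-zA-Z0-9_]", "_", expand(rule["name"], m))
        labels = {k: expand(v, m) for k, v in rule.get("labels", {}).items()}
        return prom_name, labels, metric_type
    return None


PATTERN = r"st2\.action\.(\w+)\.(\w+)\.executions(.*)"
LABELS = {"st2_library": "${1}", "st2_action": "${2}"}

fixed = [
    {"match": PATTERN, "match_metric_type": "observer",
     "name": "st2_action_executions${3}_seconds", "labels": LABELS},
    {"match": PATTERN, "match_metric_type": "counter",
     "name": "st2_action_executions${3}_counter", "labels": LABELS},
]
# Same rules without the type filter: the effect of an ignored, misspelled key.
broken = [{k: v for k, v in r.items() if k != "match_metric_type"} for r in fixed]

timer = "st2.action.fubar.pson_to_json.executions.scheduled.something:3.30400000|ms"
counter = "st2.action.fubar.pson_to_json.executions.scheduled.something:1|c"

for label, rules in (("broken", broken), ("fixed", fixed)):
    for line in (timer, counter):
        print(label, resolve(rules, line))
```

In the `broken` case both lines come back with the same name but different types, which is the kind of conflict that a metric registry rejects.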

That being said, do you need the counter metrics at all? When the exporter records a timer (as a Prometheus summary or histogram), this includes a _count metric that records how many times something happened, alongside the aggregate information (buckets or quantiles). In this example I dropped the counter metric altogether. Notice the st2_action_executions_scheduled_something_seconds_count:

❯ printf 'st2.action.fubar.pson_to_json.executions.scheduled.something:3.30400000|ms\nst2.action.fubar.pson_to_json.executions.scheduled.something:1|c\n' | nc localhost 9125
❯ curl -s http://localhost:9102/metrics | fgrep st2
# HELP st2_action_executions_scheduled_something_seconds Metric autogenerated by statsd_exporter.
# TYPE st2_action_executions_scheduled_something_seconds summary
st2_action_executions_scheduled_something_seconds{st2_action="pson_to_json",st2_library="fubar",quantile="0.5"} 0.0033039999999999996
st2_action_executions_scheduled_something_seconds{st2_action="pson_to_json",st2_library="fubar",quantile="0.9"} 0.0033039999999999996
st2_action_executions_scheduled_something_seconds{st2_action="pson_to_json",st2_library="fubar",quantile="0.99"} 0.0033039999999999996
st2_action_executions_scheduled_something_seconds_sum{st2_action="pson_to_json",st2_library="fubar"} 0.0033039999999999996
st2_action_executions_scheduled_something_seconds_count{st2_action="pson_to_json",st2_library="fubar"} 1

mapping.yaml

```yaml
mappings:
  - match: "st2\\.action\\.(\\w+)\\.(\\w+)\\.executions(.*)"
    match_type: regex
    match_metric_type: observer
    name: "st2_action_executions${3}_seconds"
    labels:
      st2_library: "${1}"
      st2_action: "${2}"
  - match: "st2\\.action\\.(\\w+)\\.(\\w+)\\.executions(.*)"
    match_type: regex
    match_metric_type: counter
    action: drop
    name: dropped
```

Bubba210 commented 6 months ago

ugh... do you know those moments where you're looking at a problem too long and you can't see the obvious GLARING issue? (oh the shame...)

Still, would be nice if nonsensical nonsense in the config file would post a warning in the debug logs... :)

Thanks for your detailed answer. We do run a lot of jobs in parallel, and I suspect the counter value will reflect that while the exposed timer will be an aggregation, but I'll certainly investigate and let you know.

matthiasr commented 6 months ago

It took me a lot of trying and head scratching to figure it out as well 😂 I think we can do better and fail on configurations with unexpected fields, and filed a follow-up issue for that.
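
A check of this kind can also be run outside the exporter. Below is a minimal Python sketch, not the exporter's own validation: `KNOWN_FIELDS` is an assumption that lists only the mapping fields appearing in this thread (the exporter accepts more), and `suggestion` and `unknown_fields` are hypothetical helpers. The rule is written as the dictionary a YAML parser would produce.

```python
# Assumption: only the mapping fields that appear in this thread. The real
# exporter accepts more, so extend this set before using it on other configs.
KNOWN_FIELDS = {"match", "match_type", "match_metric_type", "name", "labels", "action"}


def suggestion(key):
    """Suggest a known field made of the same words in a different order."""
    words = sorted(key.split("_"))
    for field in sorted(KNOWN_FIELDS):
        if sorted(field.split("_")) == words:
            return field
    return None


def unknown_fields(mapping):
    """Return (bad_key, suggestion) pairs for keys not in KNOWN_FIELDS."""
    return [(key, suggestion(key)) for key in mapping if key not in KNOWN_FIELDS]


# The first rule from the failing configuration, as it looks after YAML parsing.
rule = {
    "match": r"st2\.action\.(\w+)\.(\w+)\.executions(.*)",
    "match_type": "regex",
    "metric_match_type": "observer",  # misspelled, as in the original report
    "name": "st2_action_executions${3}_seconds",
    "labels": {"st2_library": "${1}", "st2_action": "${2}"},
}

for key, hint in unknown_fields(rule):
    print(f"unknown field {key!r}; did you mean {hint!r}?")
```

The word-order comparison is deliberately narrow: it only catches swapped words such as `metric_match_type`, so other typos are reported without a suggestion.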