sinkingpoint / prometheus-gravel-gateway

A Prometheus Aggregation Gateway for FAAS applications
GNU Lesser General Public License v3.0

Metrics that do not always have a recorded value cause 400 errors. #17

Closed jsymons closed 1 year ago

jsymons commented 2 years ago

Hello, while trying out this project I ran into a bug when a job ran without recording a value for one of its metrics. Any metrics listed in the PUT after the empty metric cause the server to return a 400 error. The Prometheus Pushgateway handles empty metrics without errors.

Sample PUT:

# HELP metric_without_values_total This metric does not always have values
# TYPE metric_without_values_total counter
# HELP metric_with_values_total This metric will always have values
# TYPE metric_with_values_total counter
metric_with_values_total{a_label="label_value",another_label="a_value"} 1.0
# HELP metric_with_values_created This metric will always have values
# TYPE metric_with_values_created gauge
metric_with_values_created{a_label="label_value",another_label="a_value"} 1.665577650707084e+09

Gives the following error back:

Invalid metric name in family. Family name is metric_without_values_total, but got a metric called metric_with_values_total

Code to reproduce (Python):

from prometheus_client import CollectorRegistry, Counter, push_to_gateway

registry = CollectorRegistry()

no_values = Counter(
    "metric_without_values",
    "This metric does not always have values",
    ["label"],
    registry=registry
)

has_values = Counter(
    "metric_with_values",
    "This metric will always have values",
    ["a_label", "another_label"],
    registry=registry
)

has_values.labels(a_label="label_value", another_label="a_value").inc()
push_to_gateway("localhost:4278", job="test_job", registry=registry)

sinkingpoint commented 1 year ago

This should be fixed now. I'll get it out in a release in a bit.

For technical context, this is another case of the Prometheus exposition format being basically entirely undocumented. I updated the openmetrics parser with a test/fix for this.
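
To illustrate the behaviour described above, here is a minimal Python sketch of a line-oriented parser that tolerates a metric family with descriptors but no samples. It is not the gateway's actual parser, and the names `parse_families` and `PAYLOAD` are hypothetical. The assumption is that any `# HELP` or `# TYPE` line naming a different metric starts a new family, so samples are only ever checked against the most recently declared family.

```python
import re

# Leading metric name of a sample line, e.g. `foo_total` in `foo_total{a="b"} 1.0`.
SAMPLE_NAME_RE = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*")

# The payload from the report above: the first family has no samples.
PAYLOAD = """\
# HELP metric_without_values_total This metric does not always have values
# TYPE metric_without_values_total counter
# HELP metric_with_values_total This metric will always have values
# TYPE metric_with_values_total counter
metric_with_values_total{a_label="label_value",another_label="a_value"} 1.0
# HELP metric_with_values_created This metric will always have values
# TYPE metric_with_values_created gauge
metric_with_values_created{a_label="label_value",another_label="a_value"} 1.665577650707084e+09
"""


def parse_families(text):
    """Group exposition lines into families; a family may have zero samples."""
    families = {}
    current = None  # name of the family declared most recently
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# HELP ") or line.startswith("# TYPE "):
            _, keyword, name, *rest = line.split(" ", 3)
            # A descriptor for a new name opens a new family, even when the
            # previous family never received a sample.
            family = families.setdefault(
                name, {"help": None, "type": None, "samples": []}
            )
            family[keyword.lower()] = rest[0] if rest else ""
            current = name
        elif line.startswith("#"):
            continue  # other comments are ignored in this sketch
        else:
            match = SAMPLE_NAME_RE.match(line)
            # Prefix matching is a simplification that also admits suffixed
            # samples such as `_bucket` or `_sum`.
            if match is None or current is None or not match.group(0).startswith(current):
                raise ValueError(
                    f"Invalid metric name in family. Family name is {current}, "
                    f"but got line: {line}"
                )
            families[current]["samples"].append(line)
    return families


families = parse_families(PAYLOAD)
```

With this approach the empty `metric_without_values_total` family is recorded with an empty sample list, and the later samples attach to their own families instead of being compared against the first one.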