Closed ltagliamonte closed 2 years ago
Interesting - I can see where this is happening, but I'm curious about the different behaviors we can have to solve this. We could reject the second push (for having a different labelset), or we could somehow munge them together to have both labels perhaps
I would hash the metric names with the labels so that each name+labels is unique.
Doing so would kind of be in violation of the Prometheus spec though (https://prometheus.io/docs/instrumenting/writing_clientlibs/#labels)
Client libraries MUST NOT allow users to have different label names for the same metric for Gauge/Counter/Summary/Histogram or any other Collector offered by the library.
So, yes, it's a but that we're mangling the label name but I think the "correct" fix is actually rejecting the second push as an invalid label set and expecting that clients properly namespace their metrics
@sinkingpoint in this way i can't use a gateway for multiple lambdas. This becomes expensive to support.
What about defining a namespace label? Each lamba uses it's own namespace in which we can have the metric name?
the prom push gateway uses a different path per job like /metrics/job/jobName/
maybe in this way it gets easier to manage in the code
Yeah I get that. FWIW, the Gravel Gateway also supports the push gateways label syntax. Using a job label seems to be the more standard solution to this, but I'll raise it internally and see what we can do
@sinkingpoint thank you, do you have any update on this? I'm preparing a lambda toolchain and i'd like to get the monitoring figure it out as well.
As i said i'd like to host the gravel gateway "as a service" for all my lambdas functions. I'd also understand better how scaling in/out works in the cluster mode, if you want i'd will open another issue.
Apologies - still trying to figure this out a bit. We're leaning to the idea that we will reject metrics pushes with invalid labels (in order to solve the initial problem), and expect consumers to supply a job
label (or similar) to distinguish individual pushes - as mentioned, we do actually support the Push Gateway URL syntax to achieve that
With #10 we've fixed this, such that the second push in the issue will be rejected with invalid labels. As before, you can use the push gateway syntax to set a label to distinguish individual pushes
@sinkingpoint thank you for working on this.
Just to make sure I understand the fix:
If the metric name contains a label job
the labels will not be overridden anymore. Effectively job
is the sharding key, correct?
The fix was to make it not override labels at all - this was the bug, that should never happen. Every push with the same metric name must have the same label names (this is required by Prometheus afaik), but the upstream pattern is to use something like a job
label to seperate individual pushes
I have 2 lambda functions pointing to the same gateway. the first lambda sends:
requests_num_total{LAMBDA_NAME="test_function"}
the second lambda sends:
requests_num_total{job="test"}
when I curl the gateway endpoint i see only:
where the label
job
has been overridden byLAMBDA_NAME
.How to reproduce
run the gateway locally
send the first metric
send the second metric
curl the gateway endpoint
curl 127.0.0.1:4278/metrics
:as you can see the label
job
has been renamed toLAMBDA_NAME
the first label sent always overrides any other following labels.