Closed m1k3 closed 6 years ago
Drive-by comment from someone without context (happen to be subscribed to this repo because I think someone added me): wouldn't you also need to add a similar check for _count
and potentially other fields? Missing either _count
or _sum
is not a valid Prometheus summary metric, so I'm not sure I understand why one would have to be checked but not the other.
You're probably right about other fields. For now I would prefer to validate whether or not my approach is valid within the context of what is happening while parsing. Once we determine that we can extend it to other fields.
I ran into this, because I was trying to expose Dropwizard metrics in the prometheus format using this example and the maintainer was very clear that he does not want to expose the _sum
metric because Dropwizard does not provide data for it and that it is a completely valid thing to do with the Prometheus format to omit the _sum
metric.
@m1k3 Hmmm, as far as I can see that example only exposes a normal counter metric (with _total
suffix), not a summary metric. Am I missing something? Pointers at related discussions welcome...
The example is not exposing a summary, but in principle it is possible to expose Dropwizard histograms as Prometheus summaries but if you look at the code the _sum
is missing. This is exactly what I added to the example and sure enough there was no _sum
. Have a look at my PR for the discussion.
Interesting, I had never heard that a summary without _sum
is considered valid in Prometheus, but ok. You can still encounter /metrics endpoints that have other faults like missing the _count
series for a summary or histogram, and you probably don't want to crash on them either (as you mentioned, you plan on checking more fields later).
Thanks for the PR! Can you address @juliusv 's feedback?
@fabxc FYI
@jkohen I'm not sure which part of the feedback are you referring to. Do you mean adding the check for SampleCount
?
Yes, exactly.
On Thu, Jul 12, 2018, 13:33 Michal Olah notifications@github.com wrote:
@jkohen https://github.com/jkohen I'm not sure which part of the feedback are you referring to. Do you mean adding the check for SampleCount?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Stackdriver/stackdriver-prometheus/pull/8#issuecomment-404590246, or mute the thread https://github.com/notifications/unsubscribe-auth/AG0TaH1_MDqiBIzwaEAqsxjTm33KKLz7ks5uF4iEgaJpZM4VCt4G .
@jkohen I added the check for the missing _count
and updated the test to expose when such a situation would occur.
Awesome, thx. I thought we'd cleanup the test I added to point out the problem more clearly (there some extra variables in there and a more-or-less duplicated helper function) before we merge, but too late now I guess.
which can happen in the
_sum
metric is missing.Some prometheus exporters are not exposing the
_sum
metric of a summary which will cause the parser to crash. I believe it is the root cause of #5.I added a test based on
TestEmptySum
that exposes the issue. It is just a quick copy&paste ofTestPointExtractor
to expose the issue and I'd welcome some feedback to make it less noisy 😉 .I also added a check to make sure that the
metric.Summary.SampleSum
is not nil when we try to decrement theresetValue
from it. I'm not sure it is the best way to solve the issue, but I ran the code in my cluster and it seems to solve the problem.