Open PatrickDuncan opened 4 years ago
cc @ggreenway who I think is our Prometheus expert at this point.
This is actually a completely different code-path than the admin prometheus stats handler. I've never looked at this before, but I think it should be a simple fix. In MetricsServiceSink::flushHistogram()
, we have this:
// Add bucket information for histograms.
io::prometheus::client::MetricFamily* histogram_metrics_family = message_.add_envoy_metrics();
histogram_metrics_family->set_type(io::prometheus::client::MetricType::HISTOGRAM);
histogram_metrics_family->set_name(envoy_histogram.name());
auto* histogram_metric = histogram_metrics_family->add_metric();
histogram_metric->set_timestamp_ms(std::chrono::duration_cast<std::chrono::milliseconds>(
time_source_.systemTime().time_since_epoch())
.count());
auto* histogram = histogram_metric->mutable_histogram();
histogram->set_sample_count(hist_stats.sampleCount());
histogram->set_sample_sum(hist_stats.sampleSum());
for (size_t i = 0; i < hist_stats.supportedBuckets().size(); i++) {
auto* bucket = histogram->add_bucket();
bucket->set_upper_bound(hist_stats.supportedBuckets()[i]);
bucket->set_cumulative_count(hist_stats.computedBuckets()[i]);
}
The +Inf
bucket is implied by the format, and isn't included in the config for buckets. So after the loop, add another bucket with bound +Inf
and count of hist_stats.sampleCount()
.
@ggreenway @mattklein123 Thank you. I should be able to make this change at some point
Title: envoyMetricsService Istio Histograms are Missing +Inf Upper Bound
Description: This is a problem with istio/proxyv2
With a normal sidecar/Kiali/Prometheus setup the Istio histogram metrics (ex.
istio_request_duration_milliseconds_bucket
) include the+Inf
bucket.With the metrics received by the gRPC envoyMetricsService the +Inf bucket is missing. This is a major issue for the histogram_quantile Prometheus function.
For now I'll just convert the highest upper bound to +Inf in my metrics service.
[optional Relevant Links:] https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/#RemoteService
Expected behavior
It should include the
+Inf
bucket. Here's what I receive right now:Steps to reproduce the bug
Run a gRPC envoy metrics service with an Istio sidecar and print the metrics received.
Version (include the output of
istioctl version --remote
andkubectl version
andhelm version
if you used Helm)