kiali / kiali

Kiali project, observability for the Istio service mesh
https://www.kiali.io
Apache License 2.0

Add visualization of request "retry" #5417

Open jshaughn opened 2 years ago

jshaughn commented 2 years ago

Multiple users have asked how they can better "see" whether their request retry configuration is working as expected. Currently Kiali has no explicit visualization for this feature. We can define and visualize the config, but we can't show the actual retries explicitly (implicitly, they may be visible through changes in response time or success rate).

Seeing this reflected on edges, or in a chart, could be nice. As far as I can tell the telemetry is not reflected in istio_* metrics, only in Envoy telemetry:

| Name | Type | Description |
| -- | -- | -- |
| upstream_rq_retry | Counter | Total request retries |
| upstream_rq_retry_backoff_exponential | Counter | Total retries using the exponential backoff strategy |
| upstream_rq_retry_backoff_ratelimited | Counter | Total retries using the ratelimited backoff strategy |
| upstream_rq_retry_limit_exceeded | Counter | Total requests not retried due to exceeding the configured number of maximum retries |
| upstream_rq_retry_success | Counter | Total request retry successes |
| upstream_rq_retry_overflow | Counter | Total requests not retried due to circuit breaking or exceeding the retry budget |

Also, @nrfox mentioned that this telemetry may not be enabled by default.

jshaughn commented 8 months ago

A note from Slack on how to enable appropriate metrics:

nfox: Not sure if this applies to circuit breaker but iirc for retry metrics I also had to add cluster to inclusionPrefixes because the default is to filter out all clusters except grpc-xds

Michaela Lang: yep, we have been adding deploy/pod annotations to enable those required metrics to be available...

$ oc get deploy/mockbin -o yaml | yq -r .spec.template.metadata.annotations
kubectl.kubernetes.io/restartedAt: "2024-01-25T08:24:09+01:00"
proxy.istio.io/config: |-
  proxyStatsMatcher:
    inclusionRegexps:
      - ".*outlier_detection.*"
      - ".*upstream_rq.*"
      - ".*upstream_cx.*"
      - ".*circuit_breaker.*"
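
For anyone who would rather not annotate each workload, the same `proxyStatsMatcher` block can, as far as I know, also be set mesh-wide under `meshConfig.defaultConfig`. The following is only a sketch, assuming an installation managed through an `IstioOperator` resource; the regexp is narrowed to the retry counters from the table above and is an illustrative choice, not something taken from the Slack thread.

```yaml
# Sketch only: assumes Istio is installed via an IstioOperator resource.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      proxyStatsMatcher:
        # Expose only the retry-related cluster stats to limit metric
        # cardinality. inclusionPrefixes (as mentioned by nfox) is an
        # alternative way to match stats by name prefix.
        inclusionRegexps:
          - ".*upstream_rq_retry.*"
```

Proxies only pick up a changed `defaultConfig` when they are restarted, so existing workloads would likely need a rollout restart before the new stats show up.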

jmazzitelli commented 3 months ago

> Seeing this reflected on edges, or in a chart, could be nice. As far as I can tell the telemetry is not reflected in istio_* metrics, only in Envoy telemetry:

We already have some Envoy metrics defined, so if you have sidecars (this won't work in Ambient), I wonder if we could just add metrics charts with the data people want to see. We'd add something here:

image

I don't know if this is what people are asking for, but something like this custom dashboard is supported by the Kiali CR today:

spec:
  custom_dashboards:
  - name: envoyretries
    title: Envoy Retries
    discoverOn: envoy_cluster_upstream_rq_retry
    items:
    - chart:
        dataType: raw
        metrics:
        - metricName: envoy_cluster_upstream_rq_retry
        name: Envoy Retries
        spans: 2
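
A slightly fuller variant of the same idea, as a sketch: it charts several of the retry counters from the Envoy table side by side and uses `dataType: rate` so the counters show up as per-second rates. The additional metric names are assumptions, derived by analogy with `envoy_cluster_upstream_rq_retry`, and they only exist in Prometheus if the corresponding Envoy stats have been enabled on the proxies. The chart names and `displayName` values are made up for illustration.

```yaml
spec:
  custom_dashboards:
  - name: envoyretries
    title: Envoy Retries
    discoverOn: envoy_cluster_upstream_rq_retry
    items:
    # Retries attempted vs. retries that succeeded.
    - chart:
        name: Retries
        spans: 6
        dataType: rate
        metrics:
        - metricName: envoy_cluster_upstream_rq_retry
          displayName: Total retries
        - metricName: envoy_cluster_upstream_rq_retry_success  # assumed name
          displayName: Successful retries
    # Requests that were not retried.
    - chart:
        name: Retries not attempted
        spans: 6
        dataType: rate
        metrics:
        - metricName: envoy_cluster_upstream_rq_retry_limit_exceeded  # assumed name
          displayName: Limit exceeded
        - metricName: envoy_cluster_upstream_rq_retry_overflow  # assumed name
          displayName: Overflow
```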

For the record, here's what the metric timeseries look like (this is with bookinfo demo installed):

envoy_cluster_upstream_rq_retry{app="istio-egressgateway", chart="gateways", cluster_name="xds-grpc", heritage="Tiller", install_operator_istio_io_owning_resource="unknown", instance="10.217.0.43:15020", istio="egressgateway", istio_io_rev="default", job="kubernetes-pods", namespace="istio-system", node="crc", operator_istio_io_component="EgressGateways", pod="istio-egressgateway-5dc875ddcf-ckwhr", pod_template_hash="5dc875ddcf", release="istio", service_istio_io_canonical_name="istio-egressgateway", service_istio_io_canonical_revision="latest", sidecar_istio_io_inject="false"}

envoy_cluster_upstream_rq_retry{app="istio-ingressgateway", chart="gateways", cluster_name="xds-grpc", heritage="Tiller", install_operator_istio_io_owning_resource="unknown", instance="10.217.0.41:15020", istio="ingressgateway", istio_io_rev="default", job="kubernetes-pods", namespace="istio-system", node="crc", operator_istio_io_component="IngressGateways", pod="istio-ingressgateway-54cc4c599d-r2kn5", pod_template_hash="54cc4c599d", release="istio", service_istio_io_canonical_name="istio-ingressgateway", service_istio_io_canonical_revision="latest", sidecar_istio_io_inject="false"}

envoy_cluster_upstream_rq_retry{app="details", cluster_name="xds-grpc", instance="10.217.0.157:15020", job="kubernetes-pods", namespace="bookinfo", node="crc", pod="details-v1-cf74bb974-2xst9", pod_template_hash="cf74bb974", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="details", service_istio_io_canonical_revision="v1", version="v1"}

envoy_cluster_upstream_rq_retry{app="kiali-traffic-generator", cluster_name="xds-grpc", instance="10.217.0.163:15020", job="kubernetes-pods", kiali_test="traffic-generator", namespace="bookinfo", node="crc", pod="kiali-traffic-generator-gbcxr", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="kiali-traffic-generator", service_istio_io_canonical_revision="latest"}

envoy_cluster_upstream_rq_retry{app="reviews", cluster_name="xds-grpc", instance="10.217.0.159:15020", job="kubernetes-pods", namespace="bookinfo", node="crc", pod="reviews-v1-5fd6d4f8f8-t78f9", pod_template_hash="5fd6d4f8f8", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="reviews", service_istio_io_canonical_revision="v1", version="v1"}

envoy_cluster_upstream_rq_retry{app="ratings", cluster_name="xds-grpc", instance="10.217.0.158:15020", job="kubernetes-pods", namespace="bookinfo", node="crc", pod="ratings-v1-7c4bbf97db-2jk7h", pod_template_hash="7c4bbf97db", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="ratings", service_istio_io_canonical_revision="v1", version="v1"}

envoy_cluster_upstream_rq_retry{app="reviews", cluster_name="xds-grpc", instance="10.217.0.161:15020", job="kubernetes-pods", namespace="bookinfo", node="crc", pod="reviews-v3-7d99fd7978-lln4g", pod_template_hash="7d99fd7978", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="reviews", service_istio_io_canonical_revision="v3", version="v3"}

envoy_cluster_upstream_rq_retry{app="productpage", cluster_name="xds-grpc", instance="10.217.0.162:15020", job="kubernetes-pods", namespace="bookinfo", node="crc", pod="productpage-v1-87d54dd59-jprr2", pod_template_hash="87d54dd59", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="productpage", service_istio_io_canonical_revision="v1", version="v1"}

envoy_cluster_upstream_rq_retry{app="reviews", cluster_name="xds-grpc", instance="10.217.0.160:15020", job="kubernetes-pods", namespace="bookinfo", node="crc", pod="reviews-v2-6f9b55c5db-rkjct", pod_template_hash="6f9b55c5db", security_istio_io_tlsMode="istio", service_istio_io_canonical_name="reviews", service_istio_io_canonical_revision="v2", version="v2"}
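
Note that every series above carries `cluster_name="xds-grpc"`, which matches the default filtering mentioned earlier in this issue. Once per-service cluster stats are exposed, the raw counter could be turned into a per-workload retry rate with a Prometheus recording rule. This is only a sketch: the group and rule names are hypothetical, and it relies on the `namespace` and `app` labels seen in the series above.

```yaml
# Sketch of a Prometheus rule file; group and record names are hypothetical.
groups:
- name: envoy-retries
  rules:
  # Per-second retry rate over 5 minutes, summed per namespace and app.
  - record: namespace_app:envoy_cluster_upstream_rq_retry:rate5m
    expr: sum by (namespace, app) (rate(envoy_cluster_upstream_rq_retry[5m]))
```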