emissary-ingress / emissary

open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
https://www.getambassador.io
Apache License 2.0
4.32k stars 685 forks source link

Knative service names truncated in metrics #3600

Open sdhoward opened 2 years ago

sdhoward commented 2 years ago

Describe the bug

Emissary is truncating Envoy Proxy cluster names to 32 characters, which interferes with my ability to measure a service's requests per second, which is my goal.

Here are the metrics for two differently-named knative services whose names are longer than 32 characters. (I trimmed the unimportant labels.) How can I tell which one is which?

envoy_cluster_upstream_rq_xx{envoy_cluster_name="cluster_svc45678901234567890123456789012-0", envoy_response_code_class="2"} 6
envoy_cluster_upstream_rq_xx{envoy_cluster_name="cluster_svc45678901234567890123456789012-1", envoy_response_code_class="2"} 7

To Reproduce

  1. Install Knative and ambassador
  2. Create two knative services with similar names. The names should be begin with the same 32 characters, such as svc45678901234567890123456789012 and should be different after that.
  3. Issue requests to both services.
  4. Scrape the Emissary Ingress pod /metrics endpoint or view the envoy_cluster_upstream_rq_xx metric in Prometheus

Expected behavior

I would expect the envoy_cluster_name to be the exact knative service name, longer than 32 characters since that is allowed by knative, and not truncated or prepended with anything extraneous.

Versions (please complete the following information):

Additional context

Knative is passing the service names on to Ambassador without truncating them:

$ kubectl -n my-namespace get ingresses.networking.internal.knative.dev
NAME                                 READY   REASON
svc456789012345678901234567890123    True    
svc4567890123456789012345678901234   True    
alexanderGalushka commented 2 years ago

@khussey @kflynn can you please kindly look into this bug, thank you

Sushma10037017 commented 2 years ago

+1 We are facing a similar issue. The envoy_cluster_name metric value is getting truncated randomly. The -<namespace- are well within 100 characters, and we are using ambassador:1.13.10 version.

Sample metrics : here the ending characters of the namespace is getting truncated! envoy_cluster_upstream_rq_time_bucket{envoy_cluster_name="cluster_ums_authorization_1_umsp_in_qa_a-0",le="0.5"} 10 envoy_cluster_upstream_rq_time_bucket{envoy_cluster_name="cluster_subscription_service_read_1_hspa-0",le="100"} 3781 envoy_cluster_upstream_cx_length_ms_bucket{envoy_cluster_name="cluster_ratelimit_default_ambassador_in_-0",le="0.5"} 0

aallbrig commented 1 year ago

+1 My team and I are seeing this in ambassador version 1.14.3 after an upgrade from a pre-1.x version of ambassador.

Similar to OP we also use telemetry provided by ambassador to get insight into each service's 2x, 3x, 4x, and 5x statuses.

So far the advise I've received is to tune our Prometheus queries using regex to account for the service name truncation. Ideally, there is a control we can toggle to stop the truncation.