envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Redis endpoints success/error/timeout stats are zero #32653

Open zigmund opened 6 months ago

zigmund commented 6 months ago

Title: Redis endpoints success/error/timeout stats are zero

Description: v1.29.0 introduced a new feature, per_endpoint_stats. I've enabled it with Redis upstreams and can see envoy_cluster_endpoint_rq_total increasing, but envoy_cluster_endpoint_rq_success, envoy_cluster_endpoint_rq_error and envoy_cluster_endpoint_rq_timeout are always zero. The /clusters endpoint also shows zero values for these stats.


We're currently using Envoy only with Redis upstreams, so I don't know whether this is specific to Redis.

Expected behavior: The envoy_cluster_endpoint_rq_success, envoy_cluster_endpoint_rq_error and envoy_cluster_endpoint_rq_timeout counters should increase accordingly.

Repro steps: Enable per-endpoint stats in a cluster with Redis upstreams:

          track_cluster_stats:
            per_endpoint_stats: true

Make some requests. Observe that only envoy_cluster_endpoint_rq_total increases; the other rq stats stay at zero.
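
To make the observation above easier to check, here is a minimal Python sketch that sums the envoy_cluster_endpoint_rq_* counters from stats text in Prometheus exposition format (for example, text saved from the admin /stats/prometheus endpoint). The function names and the label in the sample input are hypothetical and only for illustration; labels are treated as opaque, so values are summed across all endpoints.

```python
import re
from collections import defaultdict

# Matches "name{labels} value" or "name value"; labels are not interpreted.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{.*\})?\s+(\S+)')
_PREFIX = "envoy_cluster_endpoint_rq_"


def sum_endpoint_counters(prometheus_text):
    """Sum every envoy_cluster_endpoint_rq_* sample by metric name."""
    totals = defaultdict(float)
    for line in prometheus_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match and match.group(1).startswith(_PREFIX):
            totals[match.group(1)] += float(match.group(2))
    return dict(totals)


def outcomes_missing(totals):
    """True when requests were counted but no success/error/timeout was."""
    total = totals.get(_PREFIX + "total", 0)
    outcomes = sum(
        totals.get(_PREFIX + suffix, 0)
        for suffix in ("success", "error", "timeout")
    )
    return total > 0 and outcomes == 0


if __name__ == "__main__":
    # Hypothetical sample; the label name "host" is an assumption.
    sample = (
        "# TYPE envoy_cluster_endpoint_rq_total counter\n"
        'envoy_cluster_endpoint_rq_total{host="a"} 3\n'
        'envoy_cluster_endpoint_rq_success{host="a"} 0\n'
        'envoy_cluster_endpoint_rq_error{host="a"} 0\n'
        'envoy_cluster_endpoint_rq_timeout{host="a"} 0\n'
    )
    print(outcomes_missing(sum_endpoint_counters(sample)))
```

If outcomes_missing returns True after traffic has been sent, the stats match the behavior described in this report.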

Admin and Stats Output:
/stats: https://gist.github.com/zigmund/642b32394ba87612188e9eff73c605b1
/clusters: https://gist.github.com/zigmund/9eec9c17e933fa52299539dede1f8be5
/routes: 404 Not Found
/server_info: https://gist.github.com/zigmund/df51bd50b7a9267c7a4b2ac82712e70c

Config: https://gist.github.com/zigmund/b0194f01acb427933f8c7e9d8b3b9720

Logs: https://gist.github.com/zigmund/62063e1dff35af0afff585ea481c04cc

mattklein123 commented 6 months ago

Probably just not implemented for redis.

Pawan-Bishnoi commented 6 months ago

True, this doesn't look related to per_endpoint_stats.

Even without it, I see the same issue in the /clusters output:

redis_cluster::127.0.0.1:6379::cx_active::1
redis_cluster::127.0.0.1:6379::cx_connect_fail::0
redis_cluster::127.0.0.1:6379::cx_total::1
redis_cluster::127.0.0.1:6379::rq_active::0
redis_cluster::127.0.0.1:6379::rq_error::0
redis_cluster::127.0.0.1:6379::rq_success::0
redis_cluster::127.0.0.1:6379::rq_timeout::0
redis_cluster::127.0.0.1:6379::rq_total::3
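
A rough Python sketch for spotting this pattern automatically in /clusters output, assuming the cluster::host::stat::value line layout shown above. The function names are hypothetical, and lines that do not fit that layout or carry non-numeric values are simply skipped.

```python
from collections import defaultdict


def parse_host_stats(clusters_text):
    """Group numeric per-host stats by (cluster, host)."""
    hosts = defaultdict(dict)
    for line in clusters_text.splitlines():
        cluster, sep, rest = line.strip().partition("::")
        if not sep:
            continue
        # Split from the right so host addresses containing "::" stay intact.
        pieces = rest.rsplit("::", 2)
        if len(pieces) != 3:
            continue
        host, stat, value = pieces
        if value.isdigit():
            hosts[(cluster, host)][stat] = int(value)
    return dict(hosts)


def hosts_missing_outcomes(hosts):
    """Return hosts with rq_total > 0 but no success, error, or timeout."""
    flagged = []
    for key, stats in hosts.items():
        outcomes = (
            stats.get("rq_success", 0)
            + stats.get("rq_error", 0)
            + stats.get("rq_timeout", 0)
        )
        if stats.get("rq_total", 0) > 0 and outcomes == 0:
            flagged.append(key)
    return flagged


if __name__ == "__main__":
    sample = """\
redis_cluster::127.0.0.1:6379::rq_error::0
redis_cluster::127.0.0.1:6379::rq_success::0
redis_cluster::127.0.0.1:6379::rq_timeout::0
redis_cluster::127.0.0.1:6379::rq_total::3
"""
    print(hosts_missing_outcomes(parse_host_stats(sample)))
    # prints [('redis_cluster', '127.0.0.1:6379')]
```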

pratyushprakash commented 6 months ago

Can I work on this? I see it marked as help wanted

miroswan commented 5 months ago

The referenced per_endpoint_stats PR marked #21685 as closed. In that issue, @vandyvilla mentioned that they needed per-upstream-host stats specifically for Redis cluster. How can this not be implemented for Redis yet if the referenced PR closed an issue requesting these stats specifically for Redis? What outstanding work is required?

Pawan-Bishnoi commented 2 months ago

> Can I work on this? I see it marked as help wanted

I think you can, @pratyushprakash 😄