envoyproxy / gateway

Manages Envoy Proxy as a Standalone or Kubernetes-based Application Gateway
https://gateway.envoyproxy.io
Apache License 2.0
1.54k stars 328 forks

Support for enabling Request Response Size statistics #4234

Open luvk1412 opened 5 days ago

luvk1412 commented 5 days ago

I want to plot request/response size statistics in my EG setup, which requires enabling request_response_sizes in config.cluster.v3.TrackClusterStats.

As far as I know, there is currently no API in EG to do this (please correct me if I am wrong). Enabling it through EnvoyPatchPolicy is not practical either: we have a lot of clusters, there is no way to select them all by regex or any other matcher, and adding a patch for each cluster one by one is not feasible.

Requesting support for this. BackendTrafficPolicy seems like a good place for it, maybe?
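To illustrate why the per-cluster EnvoyPatchPolicy route does not scale, here is a rough Python sketch that generates one JSONPatch entry per cluster. The cluster names and the `build_cluster_patches` helper are made up for illustration, and the entry layout (`type`, `name`, `operation`) is my understanding of the EnvoyPatchPolicy JSONPatch format, so treat it as an assumption rather than a reference.

```python
import json


def build_cluster_patches(cluster_names):
    """Return one JSONPatch entry per cluster name.

    Each entry adds track_cluster_stats with request_response_sizes enabled.
    Note: a JSON Patch "add" on an existing member replaces its value, so this
    would overwrite any track_cluster_stats already set on the cluster.
    """
    return [
        {
            "type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
            "name": name,  # every cluster has to be named explicitly
            "operation": {
                "op": "add",
                "path": "/track_cluster_stats",
                "value": {"request_response_sizes": True},
            },
        }
        for name in cluster_names
    ]


if __name__ == "__main__":
    # Hypothetical cluster names; a real setup would have many more,
    # and the list would need regenerating whenever routes change.
    names = ["cluster-a", "cluster-b", "cluster-c"]
    print(json.dumps(build_cluster_patches(names), indent=2))
```

The patch list grows linearly with the number of clusters and has to be kept in sync with them by hand, which is the maintenance burden described above.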

arkodg commented 5 days ago

This could be made opt-in using https://gateway.envoyproxy.io/docs/api/extension_types/#proxymetrics

luvk1412 commented 4 days ago

Somehow I missed this API in EG, as we are not using it yet. Thanks @arkodg for pointing it out. ProxyMetrics does seem like a good place to make this opt-in. If we add it as a field there, it would be enabled for all clusters, and users would not be able to turn it on for specific clusters, right? I assume the same is true for enablePerEndpointStats today, since it is also part of config.cluster.v3.TrackClusterStats: if it is set to true, it is enabled for all clusters, with no option to enable it for specific clusters? I am fine with this behaviour, at least for my use case, but I just want to clarify.
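To make the "all clusters or none" behaviour I am asking about concrete, here is a minimal Python sketch of how I imagine a ProxyMetrics-level flag would map onto TrackClusterStats. This is not the actual EG translator: `enableRequestResponseSizesStats` is a hypothetical field name for the proposed option, and both helper functions are invented for illustration.

```python
def track_cluster_stats(metrics):
    """Map ProxyMetrics-style flags to TrackClusterStats fields."""
    stats = {}
    if metrics.get("enablePerEndpointStats"):
        stats["per_endpoint_stats"] = True
    # Hypothetical field name for the option requested in this issue.
    if metrics.get("enableRequestResponseSizesStats"):
        stats["request_response_sizes"] = True
    return stats


def apply_to_clusters(clusters, metrics):
    """Return copies of the clusters with the same stats settings on each.

    The flags are global, so there is no per-cluster selection here.
    """
    stats = track_cluster_stats(metrics)
    result = []
    for cluster in clusters:
        updated = dict(cluster)
        if stats:
            updated["track_cluster_stats"] = dict(stats)
        result.append(updated)
    return result


if __name__ == "__main__":
    clusters = [{"name": "cluster-a"}, {"name": "cluster-b"}]
    metrics = {"enableRequestResponseSizesStats": True}
    for cluster in apply_to_clusters(clusters, metrics):
        print(cluster)
```

In this model a single flag changes every cluster at once, which matches how I understand enablePerEndpointStats to behave today.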

Also I am a little confused by the ProxyMetrics api:

[OFF TOPIC] A side-note suggestion for the documentation, in case it seems helpful to you: it would help if we started documenting which Envoy Proxy field each field in the EG API corresponds to. This would be really useful for people who want to dig deeper or understand the behaviour better. I have personally been referring to the EG code to find this out. The most recent example was when I wanted to understand what HTTPRoute's timeouts correspond to, as there are two of them. If this seems reasonable and worth discussing, I can open a separate issue for it as well.