Open wbpcode opened 2 years ago
I have an idea. We can generate more metrics from the metadata of a route/host. And if we use typed metadata, we can also get very good performance.
I would like to see if the community has any suggestions. If the community thinks this makes sense, I will provide a more detailed doc to iterate on this proposal and then try to land it. @mattklein123 @alyssawilk
cc @jmarantz for stats suggestions :-)
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
hi, @jmarantz could you please give some advice?
The main challenge with providing a lot more stats especially per-host is that some systems have a very large number of hosts and the weight of all these stats puts pressure on envoy memory use (~100 bytes per stat assuming names are symbolic), stats-sink overhead, and admin /stats CPU bursts (see description in https://github.com/envoyproxy/envoy/pull/19693 for microbenchmarks) along with pressure on browser memory if you access /stats via a browser.
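To make the memory pressure concrete, here is a back-of-the-envelope sketch. Only the ~100 bytes per stat figure comes from the comment above; the host count and the number of stats per host are hypothetical inputs chosen for illustration.

```python
# Rough memory estimate for per-host stats.
# BYTES_PER_STAT is the approximate figure quoted above (symbolic names);
# everything else is a hypothetical input.
BYTES_PER_STAT = 100


def estimate_stat_memory_bytes(num_hosts, stats_per_host,
                               bytes_per_stat=BYTES_PER_STAT):
    """Approximate memory needed to keep stats_per_host stats for every host."""
    return num_hosts * stats_per_host * bytes_per_stat


# Hypothetical deployment: 50,000 hosts with 20 detailed stats each.
total = estimate_stat_memory_bytes(50_000, 20)
print(f"{total / 1e6:.0f} MB")  # prints "100 MB"
```

This covers only the in-process memory cost; stats-sink overhead and admin /stats CPU bursts come on top of it.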
My general concern with Envoy stats is that we keep a ton of them that are written and never read.
My general suggestion is to consider a holistic approach to configuring them, maybe even dynamically at runtime, e.g. an admin endpoint /stats/enable_host_detail=HOST_PATTERN or similar. WDYT?
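A minimal sketch of what runtime gating by host pattern could look like. This is not Envoy code: the `HostDetailStats` class, the stat naming scheme, and the use of glob patterns are all assumptions made for illustration, and `enable_host_detail` merely stands in for the admin endpoint suggested above, which does not exist today.

```python
from collections import Counter
from fnmatch import fnmatchcase


class HostDetailStats:
    """Keeps per-host counters only for hosts that match an enabled pattern."""

    def __init__(self):
        self.patterns = []
        self.counters = Counter()

    def enable_host_detail(self, pattern):
        # Stand-in for the hypothetical /stats/enable_host_detail=HOST_PATTERN call.
        self.patterns.append(pattern)

    def record(self, host, status):
        if not any(fnmatchcase(host, p) for p in self.patterns):
            return  # no stat is allocated for hosts nobody asked about
        # Hypothetical naming scheme: host.<address>.<status class>
        self.counters[f"host.{host}.{status // 100}xx"] += 1


stats = HostDetailStats()
stats.enable_host_detail("10.0.*")
stats.record("10.0.0.1:80", 503)  # matches the pattern, counted
stats.record("1.2.3.4:80", 404)   # does not match, dropped
print(dict(stats.counters))
```

With this shape, the cost of detailed stats is paid only for hosts that someone has explicitly asked to observe.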
@jmarantz Thanks.
My current design is to use typed metadata + L7 filters to generate some dynamic stats. The related detailed stats will be generated only when metadata under the specified namespace is configured. 🤔
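As a rough illustration of this idea (not the actual design), the sketch below parses the metadata once at config load into a typed object, so the per-request path is just a `None` check plus a counter increment. The namespace `example.metadata_stats`, the `stat_prefix` and `tags` fields, and all class and function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

STATS_NAMESPACE = "example.metadata_stats"  # hypothetical metadata namespace


@dataclass(frozen=True)
class RouteStatsConfig:
    """Typed view of the metadata, built once when the route is loaded."""
    stat_prefix: str
    tags: tuple  # sorted (key, value) pairs attached by the control plane


def parse_stats_metadata(filter_metadata) -> Optional[RouteStatsConfig]:
    """Return a typed config, or None if the namespace is not configured."""
    raw = filter_metadata.get(STATS_NAMESPACE)
    if raw is None:
        return None
    return RouteStatsConfig(raw["stat_prefix"],
                            tuple(sorted(raw.get("tags", {}).items())))


class MetadataStatsFilter:
    """Stand-in for an L7 filter that emits stats only for opted-in routes."""

    def __init__(self):
        self.counters = Counter()

    def on_response(self, config, status):
        if config is None:
            return  # route has no stats metadata, so no stat is created
        self.counters[(f"{config.stat_prefix}.{status // 100}xx", config.tags)] += 1


# Config load time: parse metadata once per route.
route1 = parse_stats_metadata(
    {STATS_NAMESPACE: {"stat_prefix": "route1", "tags": {"team": "payments"}}})
route2 = parse_stats_metadata({})  # no metadata configured

# Request time: only route1 produces stats.
f = MetadataStatsFilter()
f.on_response(route1, 503)
f.on_response(route1, 500)
f.on_response(route2, 500)
print(dict(f.counters))
```

Routes or hosts without the namespace never allocate a stat, which keeps the default footprint unchanged.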
I can provide a more detailed design later.
Sure, looking forward to the design and also exactly what you mean by 'dynamic' :)
@jmarantz Hi, sorry for the delayed updates. Here is a plain design proposal about the metadata stats. If anything is unclear, you can comment in the doc and I will update it.
https://docs.google.com/document/d/16Rh1S-sOg4cWyQNpewsrGNgEY9jBqiRY_1wf45AFc9M/edit?usp=sharing
Envoy currently provides a large number of metrics for listeners and upstream clusters, which helps a lot.
But we still want more detailed request stats per route or per host, for example the number of 5xx responses for `/route1` or the number of 4xx responses for `1.2.3.4:80`, or we want to attach additional labels from the control plane to stats. Today we have `virtual cluster`, which can provide some more stats. But it is hard to create a `virtual cluster` for every route, and a `virtual cluster` also needs extra matching, which degrades performance.