envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

stats for route/virtual host/host or any resource with metadata #19663

Open wbpcode opened 2 years ago

wbpcode commented 2 years ago

Envoy currently provides a large number of metrics for listeners and upstream clusters, which helps a lot.

But we still want more detailed request stats per route or per host, for example the number of 5xx responses for /route1 or the number of 4xx responses from 1.2.3.4:80. We would also like to attach additional labels from the control plane to stats.

We already have virtual clusters, which can provide some additional stats. But it is hard to create a virtual cluster for every route, and virtual clusters also need extra matching, which degrades performance.

wbpcode commented 2 years ago

I have an idea: we can generate more metrics from the metadata of the route/host. If we use typed metadata, we can also get very good performance.

I would like to see if the community has any suggestions. If the community thinks this makes sense, I will provide a more detailed doc to iterate on this proposal and then try to land it. @mattklein123 @alyssawilk

alyssawilk commented 2 years ago

cc @jmarantz for stats suggestions :-)

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

wbpcode commented 2 years ago

Hi @jmarantz, could you please give some advice?

jmarantz commented 2 years ago

The main challenge with providing many more stats, especially per-host, is that some systems have a very large number of hosts. The weight of all these stats puts pressure on Envoy memory use (~100 bytes per stat, assuming names are symbolic), stats-sink overhead, and admin /stats CPU bursts (see the description in https://github.com/envoyproxy/envoy/pull/19693 for microbenchmarks), along with pressure on browser memory if you access /stats via a browser.
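To make the memory pressure concrete, here is a rough back-of-the-envelope estimate in Python. Only the ~100 bytes per stat figure comes from the comment above; the host counts and the number of stats per host are made-up assumptions for illustration, and sink and admin overheads are not modeled.

```python
# Back-of-the-envelope estimate of memory added by per-host stats.
BYTES_PER_STAT = 100  # approximate figure quoted above (symbolic names assumed)


def per_host_stats_bytes(num_hosts, stats_per_host):
    """Approximate extra memory, in bytes, for per-host stats."""
    return num_hosts * stats_per_host * BYTES_PER_STAT


# Hypothetical deployment sizes; stats_per_host=20 is an assumption.
for hosts in (1_000, 100_000, 1_000_000):
    mib = per_host_stats_bytes(hosts, stats_per_host=20) / (1024 * 1024)
    print(f"{hosts:>9} hosts -> ~{mib:.1f} MiB")
```

The cost grows linearly with the host count, so a per-host stat set that is negligible for a small cluster can become significant for very large ones.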

My general concern with Envoy stats is that we keep a ton of them that are written and never read.

My general suggestion is to consider a holistic approach to configuring them, maybe even dynamically at runtime, e.g. an admin endpoint /stats/enable_host_detail=HOST_PATTERN or similar. WDYT?
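A minimal Python sketch of what pattern-based enablement could look like. Nothing here is an Envoy API: the `HostDetailStats` class, the glob-style pattern syntax, and the counter naming are all hypothetical, and the endpoint above is only a suggestion.

```python
import fnmatch
from collections import Counter


class HostDetailStats:
    """Records per-host counters only for hosts matching an enabled pattern."""

    def __init__(self):
        self._patterns = []
        self.counters = Counter()

    def enable_host_detail(self, pattern):
        # Stand-in for the suggested admin endpoint; glob syntax is an assumption.
        self._patterns.append(pattern)

    def on_response(self, host, status):
        if not any(fnmatch.fnmatchcase(host, p) for p in self._patterns):
            return  # no stat is created for hosts nobody asked about
        self.counters[f"host.{host}.{status // 100}xx"] += 1


stats = HostDetailStats()
stats.enable_host_detail("10.0.*")
stats.on_response("10.0.0.7:80", 503)  # matches the pattern, counted
stats.on_response("1.2.3.4:80", 404)   # not enabled, ignored
print(dict(stats.counters))
```

The point of the sketch is that stats for unmatched hosts are never allocated, so the cost is only paid for hosts someone has explicitly asked to inspect.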

wbpcode commented 2 years ago

@jmarantz Thanks.

My current design is to use typed metadata + L7 filters to generate some dynamic stats. The related detailed stats will be generated only when metadata in a specified namespace is configured. 🤔
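A rough Python sketch of the opt-in behaviour described above, not the actual design. The namespace name, the label keys, and the `record_request` helper are hypothetical, and a plain dict stands in for the route metadata.

```python
from collections import Counter

STATS_NAMESPACE = "example.metadata_stats"  # hypothetical namespace name


def record_request(counters, route_metadata, status):
    """Bump a labelled counter only if the route opts in via metadata."""
    labels = route_metadata.get(STATS_NAMESPACE)
    if not labels:
        return  # namespace not configured: no detailed stats for this route
    # Sort labels so the same label set always maps to the same counter key.
    tags = tuple(sorted(labels.items()))
    counters[(tags, f"{status // 100}xx")] += 1


counters = Counter()
opted_in = {STATS_NAMESPACE: {"route": "/route1", "team": "payments"}}
record_request(counters, opted_in, 502)
record_request(counters, opted_in, 500)
record_request(counters, {}, 500)  # route without the namespace is skipped
print(counters)
```

Routes that carry no metadata under the namespace cost nothing extra, which is the property the design relies on to keep the number of generated stats bounded.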

I can provide a more detailed design later.

jmarantz commented 2 years ago

Sure, looking forward to the design and also exactly what you mean by 'dynamic' :)

wbpcode commented 2 years ago

@jmarantz Hi, sorry for the delayed update. Here is a simple design proposal for the metadata stats. If anything is unclear, you can comment in the doc and I will update it.

https://docs.google.com/document/d/16Rh1S-sOg4cWyQNpewsrGNgEY9jBqiRY_1wf45AFc9M/edit?usp=sharing
