newrelic / nri-kubernetes

New Relic integration for Kubernetes
https://docs.newrelic.com/docs/integrations/kubernetes-integration/get-started/introduction-kubernetes-integration
Apache License 2.0

ControlPlane: collect admission controller duration statistics #820

Open Emberwalker opened 1 year ago

Emberwalker commented 1 year ago

Description

Currently, the k8sApiServerSample does not include the API Server's apiserver_admission_controller_admission_duration_seconds metric. This metric is useful for diagnosing API Server latency problems caused by misbehaving admission controllers (for example, one that blocks for an extended time on CREATE requests). Without it, cluster administrators need to run either a regular Prometheus instance or a custom OTel Collector alongside the integration to gather these stats. Would it be possible to include this metric in the default sample?
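To illustrate the kind of diagnosis this metric enables, here is a minimal sketch that computes the mean admission duration per controller and operation from the histogram's _sum and _count series. The sample text and the controller name SlowWebhook are made up for illustration, and the parser is deliberately simplified (it ignores timestamps and escaped quotes in label values), so treat it as a sketch rather than a full exposition-format parser.

```python
import re
from collections import defaultdict

METRIC = "apiserver_admission_controller_admission_duration_seconds"

# Made-up sample in Prometheus text format; values are illustrative only.
SAMPLE = """\
# TYPE apiserver_admission_controller_admission_duration_seconds histogram
apiserver_admission_controller_admission_duration_seconds_sum{name="SlowWebhook",operation="CREATE",rejected="false",type="validate"} 12.5
apiserver_admission_controller_admission_duration_seconds_count{name="SlowWebhook",operation="CREATE",rejected="false",type="validate"} 5
apiserver_admission_controller_admission_duration_seconds_sum{name="LimitRanger",operation="CREATE",rejected="false",type="admit"} 0.02
apiserver_admission_controller_admission_duration_seconds_count{name="LimitRanger",operation="CREATE",rejected="false",type="admit"} 10
"""

# Simplified: one metric name, a label set, and a value (no timestamp).
LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def mean_duration_by_controller(text):
    """Return {(controller, operation): mean seconds} from _sum/_count series."""
    sums = defaultdict(float)
    counts = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skips comments and lines this sketch cannot parse
        series, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels))
        key = (labels.get("name"), labels.get("operation"))
        if series == METRIC + "_sum":
            sums[key] += float(value)
        elif series == METRIC + "_count":
            counts[key] += float(value)
    return {key: sums[key] / counts[key] for key in counts if counts[key] > 0}


if __name__ == "__main__":
    means = mean_duration_by_controller(SAMPLE)
    for (controller, operation), mean in sorted(means.items(), key=lambda kv: -kv[1]):
        print(f"{controller} {operation}: {mean:.4f}s")
```

Sorting by mean duration puts the slowest controller first, which is the signal that is currently unavailable from the default sample.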

Expected Behavior

The API Server apiserver_admission_controller_admission_duration_seconds metric is scraped and stored in the k8sApiServerSample as a new metric (e.g. k8s.apiserver.admission.duration).

Troubleshooting or NR Diag results

N/A

Steps to Reproduce

Compare the metrics reported by the API Server metrics endpoint with the metrics exposed under the k8s.apiserver. prefix in New Relic.
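One way to run this comparison is sketched below: list the metric families declared on the endpoint and report the ones that are not in a known set of collected source metrics. The endpoint output can typically be fetched with kubectl get --raw /metrics. The COLLECTED set here is a hypothetical placeholder, not the integration's actual list, and the sample text is made up.

```python
# Made-up excerpt of API Server /metrics output; only the TYPE lines matter here.
SAMPLE = """\
# HELP apiserver_request_total Counter of apiserver requests.
# TYPE apiserver_request_total counter
apiserver_request_total{verb="GET",code="200"} 42
# TYPE apiserver_admission_controller_admission_duration_seconds histogram
apiserver_admission_controller_admission_duration_seconds_count{name="LimitRanger",operation="CREATE",rejected="false",type="admit"} 10
"""

# Hypothetical: source metrics assumed to be already collected by the integration.
COLLECTED = {"apiserver_request_total"}


def metric_families(text):
    """Return {family name: type} parsed from '# TYPE <name> <type>' lines."""
    families = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            families[parts[2]] = parts[3]
    return families


def uncollected(text, collected):
    """Return sorted family names present on the endpoint but not in `collected`."""
    return sorted(name for name in metric_families(text) if name not in collected)


if __name__ == "__main__":
    for name in uncollected(SAMPLE, COLLECTED):
        print(name)
```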

Your Environment

Kubernetes 1.23 and 1.24, NRI Bundle 4.3.1, NRI-Kubernetes 3.15.1

Additional context

This metric was referenced in issue #445 but was not added as a result of that ticket.

For Maintainers Only or Hero Triaging this bug

Suggested Priority (P1, P2, P3, P4, P5):
Suggested T-Shirt size (S, M, L, XL, Unknown):

workato-integration[bot] commented 1 year ago

https://issues.newrelic.com/browse/NR-147305

mangulonr commented 11 months ago

Hi Ember, thanks for creating this request. It will be taken into account when prioritizing upcoming projects.