Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[Feature] Allow monitoring of Calico service from Prometheus #3703

Open Sturgelose opened 1 year ago

Sturgelose commented 1 year ago

Is your feature request related to a problem? Please describe.

Monitoring Calico and calico-typha is a must: it is a critical service, and if it goes down, it takes down networking for the whole cluster.

Monitoring of the service is disabled by default, and we cannot edit the configuration of the Tigera operator or add annotations so that Prometheus or any other monitoring service can fetch metrics. Any edit to the operator is rolled back almost immediately.

Describe the solution you'd like

It would be really useful to be able to configure some basic aspects of the Calico operator (aside from memory and CPU), such as annotations or whether monitoring is enabled.
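
To make the request concrete, here is a minimal sketch of the kind of annotations meant above. It assumes the default metrics ports I know from Calico (9091 for Felix in calico-node, 9093 for Typha) and the common `prometheus.io/*` annotation convention, which only works if the Prometheus scrape config is set up to honor it. The helper names are made up for illustration, and on AKS today a patch like this would simply be reverted by the operator.

```python
import json

# Assumed defaults: Felix serves metrics on 9091 and Typha on 9093 once enabled.
# The prometheus.io/* keys are a scrape-config convention, not a Kubernetes or AKS API.
ASSUMED_METRICS_PORTS = {"calico-node": 9091, "calico-typha": 9093}


def scrape_annotations(component: str) -> dict:
    """Return the conventional Prometheus scrape annotations for a component."""
    port = ASSUMED_METRICS_PORTS[component]
    return {
        "prometheus.io/scrape": "true",
        "prometheus.io/port": str(port),
    }


def pod_template_patch(component: str) -> str:
    """Build a JSON merge patch that adds the annotations to a pod template."""
    patch = {
        "spec": {
            "template": {
                "metadata": {"annotations": scrape_annotations(component)}
            }
        }
    }
    return json.dumps(patch, indent=2)


if __name__ == "__main__":
    # Print the patch that one would want to apply to the Typha deployment.
    print(pod_template_patch("calico-typha"))
```

Being able to set something equivalent through the managed add-on configuration would be enough for this use case.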

Describe alternatives you've considered

Currently we can only fetch Calico's logs, but that gives far less observability than the metrics Calico exposes on its metrics endpoint.

Otherwise, it would at least be good for these metrics to be exposed in the AKS service in Azure, or made viewable in some other way.

Alternatively, we could install Calico ourselves, but it is pretty awkward to have to do that just to get basic monitoring of a critical service, especially when Azure provides a managed solution/product for customers.
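
As a rough illustration of what the metrics endpoint offers over logs, the sketch below parses a Prometheus text-format payload into a dictionary. The sample payload and its values are invented for the example; the metric names are ones I believe Felix exposes, and the parser is deliberately simplified (it skips labelled series and ignores timestamps).

```python
# Hypothetical sample of what a Felix metrics endpoint might return.
SAMPLE_PAYLOAD = """\
# HELP felix_active_local_endpoints Number of active endpoints on this host.
# TYPE felix_active_local_endpoints gauge
felix_active_local_endpoints 12
# TYPE felix_cluster_num_hosts gauge
felix_cluster_num_hosts 3
"""


def parse_metrics(text: str) -> dict:
    """Parse unlabelled samples from Prometheus text exposition format."""
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, HELP/TYPE comments, and labelled series (simplification).
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        samples[parts[0]] = float(parts[1])
    return samples


if __name__ == "__main__":
    for name, value in parse_metrics(SAMPLE_PAYLOAD).items():
        print(f"{name} = {value}")
```

Gauges like these can be alerted on directly, which is not practical with log lines alone.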

grzesuav commented 5 months ago

I see some options regarding Calico here: https://learn.microsoft.com/en-us/azure/aks/monitor-control-plane-metrics