deckhouse / deckhouse

Kubernetes platform from Flant
https://deckhouse.io

[monitoring] Make robust cloud providers monitoring #3502

Open EvgenySamoylov opened 1 year ago

EvgenySamoylov commented 1 year ago

Use case. Why is this important?

We have a number of known issues caused by the lack of monitoring of the cloud providers, which are a core part of the system. They should be addressed in this epic.

Proposed Solution

nabokihms commented 1 year ago

The plan is:

  1. Add PodMonitors/ServiceMonitors to collect metrics from the cloud-controller-managers and CSI drivers.
  2. Add alerts based on these metrics for reconciliation loop errors and for errors connecting to the cloud.
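
To make the two steps above more concrete, here is a rough sketch of what the resulting objects might look like, written as a small Python generator that emits JSON manifests. It assumes the Prometheus Operator CRDs (`ServiceMonitor`, `PrometheusRule`) are what we end up using. The namespace, the `app` labels, the `metrics` port name, the `job` label values and the `cloud_api_request_errors_total` metric are all placeholders invented for illustration. `workqueue_retries_total` is the standard client-go workqueue metric, but whether each component actually exposes it still has to be checked.

```python
import json


def service_monitor(name, namespace, match_labels, port="metrics", interval="30s"):
    """Build a Prometheus Operator ServiceMonitor manifest as a plain dict."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Select the Service that fronts the component's metrics endpoint.
            "selector": {"matchLabels": dict(match_labels)},
            "namespaceSelector": {"matchNames": [namespace]},
            "endpoints": [{"port": port, "path": "/metrics", "interval": interval}],
        },
    }


def alert_rule(alert, expr, duration, severity, summary):
    """Build one alerting rule entry for a PrometheusRule group."""
    return {
        "alert": alert,
        "expr": expr,
        "for": duration,
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }


def prometheus_rule(name, namespace, rules):
    """Wrap alerting rules into a PrometheusRule manifest with a single group."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"groups": [{"name": name, "rules": list(rules)}]},
    }


def build_manifests(namespace="d8-cloud-provider-example"):
    """Return monitors for CCM and CSI plus one rule object (all names hypothetical)."""
    rules = [
        # Step 2a: reconciliation loop errors, approximated by workqueue retries.
        alert_rule(
            "CloudControllerManagerReconcileRetries",
            'sum by (name) (rate(workqueue_retries_total{job="cloud-controller-manager"}[5m])) > 0',
            "15m",
            "warning",
            "cloud-controller-manager keeps retrying reconciliation",
        ),
        # Step 2b: cloud connection errors; the metric name is a placeholder.
        alert_rule(
            "CloudAPIRequestErrors",
            "sum(rate(cloud_api_request_errors_total[5m])) > 0",
            "10m",
            "warning",
            "requests to the cloud API are failing",
        ),
    ]
    return [
        service_monitor("cloud-controller-manager", namespace, {"app": "cloud-controller-manager"}),
        service_monitor("csi-controller", namespace, {"app": "csi-controller"}),
        prometheus_rule("cloud-provider-alerts", namespace, rules),
    ]


if __name__ == "__main__":
    # kubectl accepts JSON, so each object can be saved to a file and applied.
    for manifest in build_manifests():
        print(json.dumps(manifest, indent=2))
```

Thresholds and `for` durations here are guesses; the real values should come from what the components report once the metrics are actually being scraped.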