canonical / charm-openstack-service-checks

Collection of Nagios checks and other utilities that can be used to verify the operation of an OpenStack cluster

octavia loadbalancer check is overly sensitive; k8s LBs will always trigger alert #127

Closed sudeephb closed 6 months ago

sudeephb commented 6 months ago

In a Charmed Kubernetes environment deployed on OpenStack, loadbalancers are not created with health monitoring [0]. While the Provisioning Status for new K8s LBs is Active, the lack of health monitoring causes the Operating Status to be Offline. Therefore, all K8s LBs trigger the alert for the octavia_loadbalancers check. Today, we work around this by adding every K8s LB to the ignore list. However, this is not sustainable: when the K8s cluster is heavily used and K8s LBs are constantly being created and deleted, it leads to a lot of noisy alerts.

I'm not sure of the best way to address this problem, but here are some implementation ideas:

[0] LP#1853668


Imported from Launchpad using lp2gh.

sudeephb commented 6 months ago

(by peppepetra) Also, the charm alerts for all LBs with operating status != "ONLINE", while "DRAINING" and "NO_MONITOR" should be considered OK as well.
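
To illustrate the suggestion above, here is a minimal sketch of a status filter that treats "ONLINE", "DRAINING" and "NO_MONITOR" as OK. This is not the charm's actual code: the function name `find_alerting_lbs`, the `ignored_ids` parameter, and the assumption that each loadbalancer is a plain dict with `id` and `operating_status` keys are all hypothetical. Note that it would still alert on LBs reported as "OFFLINE", so it does not by itself cover the K8s case described in the original report.

```python
# Hypothetical sketch; names and input shape are assumptions, not the charm's code.

# Operating statuses that should not raise an alert, per the comment above.
OK_OPERATING_STATUSES = {"ONLINE", "DRAINING", "NO_MONITOR"}


def find_alerting_lbs(loadbalancers, ignored_ids=()):
    """Return (id, operating_status) pairs for loadbalancers that should alert.

    loadbalancers: iterable of dicts with "id" and "operating_status" keys
                   (assumed shape for this example).
    ignored_ids:   ids to skip entirely, mirroring the existing ignore list.
    """
    ignored = set(ignored_ids)
    alerts = []
    for lb in loadbalancers:
        if lb["id"] in ignored:
            continue
        # A missing status is treated as not OK, so it is reported.
        status = lb.get("operating_status", "").upper()
        if status not in OK_OPERATING_STATUSES:
            alerts.append((lb["id"], status))
    return alerts


if __name__ == "__main__":
    sample = [
        {"id": "lb-1", "operating_status": "ONLINE"},
        {"id": "lb-2", "operating_status": "NO_MONITOR"},
        {"id": "lb-3", "operating_status": "OFFLINE"},
    ]
    for lb_id, status in find_alerting_lbs(sample):
        print(f"CRITICAL: loadbalancer {lb_id} operating_status={status}")
```

Keeping the accepted statuses in one set makes it easy to adjust which states count as healthy without touching the loop.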