canonical / charm-openstack-service-checks

Collection of Nagios checks and other utilities that can be used to verify the operation of an OpenStack cluster

Add check which audits for Allocations against resource providers which are invalid #66

Closed sudeephb closed 7 months ago

sudeephb commented 7 months ago

When VM deployments or migrations fail, stale resource allocations are sometimes left against resource providers in the placement/nova_api database. These stale allocations cause "ghost" contention when scheduling resources.

This charm should audit all allocations against each resource provider and ensure that the instance behind each allocation (1) is not deleted or shelved and (2) exists on the hypervisor matching the resource provider.
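The two conditions above could be sketched roughly as follows. This is only an illustration of the audit logic, not the charm's implementation: the `Server` record, the `find_stale_allocations` helper, the shape of the input dictionaries, and the set of statuses treated as "deleted or shelved" are all assumptions, and fetching the data from placement and nova is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


# Hypothetical record type; the field names are assumptions for illustration,
# not the schema returned by the nova or placement APIs.
@dataclass
class Server:
    uuid: str
    status: str           # e.g. "ACTIVE", "SHELVED_OFFLOADED", "DELETED"
    host: Optional[str]   # hypervisor the instance currently lives on, if any


# Assumption: these statuses are what the issue means by "deleted or shelved".
INVALID_STATUSES = {"DELETED", "SOFT_DELETED", "SHELVED", "SHELVED_OFFLOADED"}


def find_stale_allocations(
    provider_hosts: Dict[str, str],
    allocations: Dict[str, List[str]],
    servers: Dict[str, Server],
) -> List[Tuple[str, str, str]]:
    """Return (provider_uuid, consumer_uuid, reason) for each suspect allocation.

    provider_hosts: resource provider UUID -> hypervisor hostname
    allocations:    resource provider UUID -> consumer (instance) UUIDs
    servers:        instance UUID -> Server record
    """
    stale = []
    for provider_uuid, consumers in allocations.items():
        expected_host = provider_hosts.get(provider_uuid)
        for consumer_uuid in consumers:
            server = servers.get(consumer_uuid)
            if server is None:
                reason = "no matching instance"
            elif server.status.upper() in INVALID_STATUSES:
                reason = "instance is " + server.status
            elif server.host != expected_host:
                reason = "instance is on host %s" % server.host
            else:
                continue  # allocation looks valid
            stale.append((provider_uuid, consumer_uuid, reason))
    return stale
```

A consumer flagged here is a candidate for manual review rather than proof of a stale allocation, since the inputs are collected at slightly different times.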


Imported from Launchpad using lp2gh.

sudeephb commented 7 months ago

(by afreiberger) One quick-win check for this could be ensuring that the allocation count for a given provider matches the number of VMs/servers scheduled to the host:

$ os resource provider show $server1_resource_provider_uuid --allocations --format yaml | grep resources | wc -l
37
$ os server list --all --host $server1_fqdn -fvalue | wc -l
28
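As a rough sketch of how that comparison might be turned into a Nagios-style result: the function name `check_allocation_count`, the message format, and the choice of CRITICAL for any mismatch are assumptions for illustration. The two counts are expected to be gathered separately, for example with the commands shown in this comment.

```python
from typing import Tuple

# Standard Nagios plugin exit codes.
OK = 0
WARNING = 1
CRITICAL = 2


def check_allocation_count(
    host: str, allocation_count: int, server_count: int
) -> Tuple[int, str]:
    """Compare a provider's allocation count with the servers on its host.

    Returns a (status_code, message) pair in the usual Nagios style.
    Treating any mismatch as CRITICAL is an assumption; a real check
    might use WARNING or a configurable threshold instead.
    """
    if allocation_count == server_count:
        return OK, "OK: %s has %d allocations for %d servers" % (
            host, allocation_count, server_count)
    return CRITICAL, (
        "CRITICAL: %s has %d allocations but %d servers (difference %d)"
        % (host, allocation_count, server_count,
           allocation_count - server_count))
```

With the counts from the example, `check_allocation_count("server1", 37, 28)` returns status 2 and reports a difference of 9.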