canonical / charm-openstack-service-checks

Collection of Nagios checks and other utilities that can be used to verify the operation of an OpenStack cluster

check_cinder_services check fails when trying to create "volume" client #149

Closed sudeephb closed 6 months ago

sudeephb commented 6 months ago

Hello,

We upgraded the openstack-service-checks charm last night, and check_cinder_services started to fail with:
"""
$ sudo python3 /usr/local/lib/nagios/plugins/check_cinder_services.py
<function check_cinder_services at 0x7fd93a857620> raised unknown exception '<class 'keystoneauth1.exceptions.catalog.EndpointNotFound'>'

Traceback (most recent call last):
  File "/usr/local/lib/nagios/plugins/nagios_plugin3.py", line 37, in try_check
    function(*args, **kwargs)
  File "/usr/local/lib/nagios/plugins/check_cinder_services.py", line 35, in check_cinder_services
    services = cinder.get("/os-services").json()["services"]
  File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 386, in get
    return self.request(url, 'GET', **kwargs)
  File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request
    return self.session.request(url, method, **kwargs)
  File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 794, in request
    endpoint_filter)
  File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1225, in get_endpoint
    return auth.get_endpoint(self, **kwargs)
  File "/usr/lib/python3/dist-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint
    allow_version_hack=allow_version_hack, **kwargs)
  File "/usr/lib/python3/dist-packages/keystoneauth1/identity/base.py", line 279, in get_endpoint_data
    service_name=service_name)
  File "/usr/lib/python3/dist-packages/keystoneauth1/access/service_catalog.py", line 462, in endpoint_data_for
    raise exceptions.EndpointNotFound(msg)
keystoneauth1.exceptions.catalog.EndpointNotFound: public endpoint for volume service in TRA region not found
"""

In the output of the "OpenStack catalog list", Cinder is presented with type "volumev3":
"""
$ os catalog show cinderv3 | grep type
| type | volumev3
"""

So if I change "volume" to "volumev3" in the check code, it works OK:
"""
$ sudo diff --suppress-common-lines -y /usr/local/lib/nagios/plugins/check_cinder_services.py /usr/local/lib/nagios/plugins/check_cinder_services.py.bak
cinder = os_client_config.session_client("volume", cloud= | cinder = os_client_config.session_client("volumev3", clou
$ sudo python3 /usr/local/lib/nagios/plugins/check_cinder_services.py.bak
OK: All cinder services happy
"""
On other clouds where Cinder is registered in the OpenStack catalog under the "volumev3" type, this check works OK.
stable 9 - works
stable 4 - works
stable 22 - fails
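The manual edit above hard-codes "volumev3", which would presumably fail on a cloud that registers only "volume". A minimal sketch of a more tolerant lookup, assuming the check can obtain the list of service types from the catalog: try a few candidate types in order and use the first one that is registered. The helper name `pick_volume_service_type` and the candidate order are hypothetical and not part of the charm.

```python
# Hypothetical helper; not part of check_cinder_services.py.
CANDIDATE_TYPES = ("volumev3", "volumev2", "volume")


def pick_volume_service_type(catalog_types, candidates=CANDIDATE_TYPES):
    """Return the first candidate service type present in the catalog.

    catalog_types: iterable of service type strings, e.g. the "type"
    column of the catalog listing.
    Raises LookupError when none of the candidates is registered.
    """
    available = set(catalog_types)
    for service_type in candidates:
        if service_type in available:
            return service_type
    raise LookupError(
        "no volume service type found; looked for: " + ", ".join(candidates)
    )


if __name__ == "__main__":
    # Catalog resembling the affected cloud: only "volumev3" is registered.
    print(pick_volume_service_type(["identity", "compute", "volumev3"]))
```

The returned string could then be passed where the check currently passes the literal "volume".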

Please let me know if I can provide more information.

Best Regards, Alex.


Imported from Launchpad using lp2gh.

sudeephb commented 6 months ago

(by przemeklal) Looked into this: in the affected environment, /var/lib/nagios/nagios.novarc contained an empty assignment: OS_VOLUME_API_VERSION=

We should investigate why /var/lib/nagios/nagios.novarc wasn't rendered properly: ./templates/nagios.novarc:7:export OS_VOLUME_API_VERSION={{ volume_api_version }}
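One way the empty assignment could arise, assuming the file is rendered with Jinja2-style defaults and `volume_api_version` is missing from the render context: an undefined variable renders as an empty string rather than raising an error. The sketch below uses a small regex-based stand-in for the substitution (it is not Jinja2 and not the charm's rendering code) to show the effect on the template line quoted above.

```python
import re

# The template line quoted above.
TEMPLATE_LINE = "export OS_VOLUME_API_VERSION={{ volume_api_version }}"


def render(template, context):
    """Minimal stand-in for Jinja2 variable substitution (not Jinja2 itself).

    As with Jinja2's default behaviour, a variable that is missing from
    the context renders as an empty string instead of raising an error.
    """
    def substitute(match):
        return str(context.get(match.group(1), ""))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


if __name__ == "__main__":
    # Value present in the context: the assignment is complete.
    print(render(TEMPLATE_LINE, {"volume_api_version": 3}))
    # Value missing from the context: the assignment is left empty.
    print(render(TEMPLATE_LINE, {}))
```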

sudeephb commented 6 months ago

(by przemeklal) Another workaround is to manually set OS_VOLUME_API_VERSION=3 in /var/lib/nagios/nagios.novarc.
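A minimal sketch of that manual edit as a text transformation, assuming the file consists of shell-style `export NAME=value` lines. The function name `set_export` and the sample file content are hypothetical; the sketch only rewrites a string and leaves reading and writing /var/lib/nagios/nagios.novarc to the caller.

```python
import re


def set_export(novarc_text, name, value):
    """Return novarc_text with `export NAME=value` set.

    The first existing assignment of NAME (with or without the `export`
    keyword) is replaced in place; if there is none, a line is appended.
    """
    pattern = re.compile(
        r"^(?:export\s+)?" + re.escape(name) + r"=.*$", re.MULTILINE
    )
    new_line = "export {}={}".format(name, value)
    if pattern.search(novarc_text):
        # A callable replacement avoids backslash handling in the value.
        return pattern.sub(lambda match: new_line, novarc_text, count=1)
    if novarc_text and not novarc_text.endswith("\n"):
        novarc_text += "\n"
    return novarc_text + new_line + "\n"


if __name__ == "__main__":
    # Illustrative content only; the real file has more variables.
    sample = "export OS_VOLUME_API_VERSION=\n"
    print(set_export(sample, "OS_VOLUME_API_VERSION", "3"), end="")
```

A later comment in this thread reports that the charm reverted such a manual change after a reboot, so this is a stopgap rather than a fix.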

sudeephb commented 6 months ago

(by valexby) Hello,

Tested this on OpenStack Yoga with osc revision 22. The template did not provide a value for OS_VOLUME_API_VERSION=, but check_cinder_services works fine.

Best Regards, Alex.

sudeephb commented 6 months ago

(by valexby) Tried like this:

  1. Generated model via OpenStack model generation script[1]
  2. Deployed OpenStack bundle[2], waited until it stabilized, unsealed vault
  3. juju deploy --revision=22 --channel=stable/latest openstack-service-checks
  4. juju deploy nagios
  5. juju deploy nrpe
  6. juju add-relation ceph-osd nrpe
  7. juju add-relation nrpe:monitors nagios:monitors
  8. juju add-relation openstack-service-checks nrpe
  9. juju add-relation openstack-service-checks:identity-credentials keystone:identity-credentials
  10. juju run-action --wait vault/0 get-root-ca --format json | jq -r '."unit-vault-0".results.output' | base64 -w 0 | xargs -I {} juju config openstack-service-checks trusted_ssl_ca={}
  11. Logged into the osc unit and ran $ sudo python3 /usr/local/lib/nagios/plugins/check_cinder_services.py

[1] https://github.com/openstack-charmers/charm-test-infra/blob/master/juju-openstack-controller-example.sh
[2] https://github.com/openstack-charmers/openstack-bundles.git

sudeephb commented 6 months ago

(by valexby) Hello, I was able to reproduce this in almost the same way as in the steps above, except that I used the bionic-queens version of OpenStack and the bionic series of osc. The bionic series of osc alone is probably enough. I used this bundle to deploy OpenStack, only changing the URLs from "cs:~openstack-charmers-next" to Charmhub by removing every "cs:~openstack-charmers-next" from the bundle config.

sudeephb commented 6 months ago

(by valexby) So I was able to reproduce this on the stable OpenStack bundle with the "bionic" osc:

  1. Generated model via OpenStack model generation script[1]
  2. Deployed OpenStack bundle[2], waited until it stabilized, unsealed vault
  3. juju deploy --revision=22 --series="bionic" --channel=stable/latest openstack-service-checks
  4. juju deploy nagios
  5. juju deploy nrpe
  6. juju add-relation ceph-osd nrpe
  7. juju add-relation nrpe:monitors nagios:monitors
  8. juju add-relation openstack-service-checks nrpe
  9. juju add-relation openstack-service-checks:identity-credentials keystone:identity-credentials
  10. juju run-action --wait vault/0 get-root-ca --format json | jq -r '."unit-vault-0".results.output' | base64 -w 0 | xargs -I {} juju config openstack-service-checks trusted_ssl_ca={}
  11. Logged into the osc unit and ran $ sudo python3 /usr/local/lib/nagios/plugins/check_cinder_services.py

[1] https://github.com/openstack-charmers/charm-test-infra/blob/master/juju-openstack-controller-example.sh
[2] https://github.com/openstack-charmers/openstack-bundles.git

sudeephb commented 6 months ago

(by dash3) Hello,

root@juju-b55047-3-lxd-10:~# sudo python3 /usr/local/lib/nagios/plugins/check_cinder_services.py
<function check_cinder_services at 0x7f656a8bcb80> raised unknown exception '<class 'keystoneauth1.exceptions.catalog.EndpointNotFound'>'

Traceback (most recent call last):
  File "/usr/local/lib/nagios/plugins/nagios_plugin3.py", line 37, in try_check
    function(*args, **kwargs)
  File "/usr/local/lib/nagios/plugins/check_cinder_services.py", line 35, in check_cinder_services
    services = cinder.get("/os-services").json()["services"]
  File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 395, in get
    return self.request(url, 'GET', **kwargs)
  File "/usr/lib/python3/dist-packages/openstack/proxy.py", line 97, in request
    response = super(Proxy, self).request(
  File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 257, in request
    return self.session.request(url, method, **kwargs)
  File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 815, in request
    raise exceptions.EndpointNotFound()
keystoneauth1.exceptions.catalog.EndpointNotFound: Could not find requested endpoint in Service Catalog.

Workaround is to manually add export OS_VOLUME_API_VERSION=3 to /var/lib/nagios/nagios.novarc
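A possible explanation of why this workaround helps, consistent with two observations in the thread: the first traceback looks up a "volume" endpoint, while the catalog only lists "volumev3". The sketch below models the assumption that the client derives the catalog service type from the volume API version. It is an illustration of that assumption, not code taken from os_client_config or keystoneauth1, and it does not account for the Yoga report above where an empty value still worked.

```python
def volume_service_type(api_version):
    """Model (an assumption, not library code) of how the catalog service
    type might be derived from OS_VOLUME_API_VERSION.

    A version starting with "3" maps to "volumev3", one starting with "2"
    maps to "volumev2", and an empty or missing value falls back to the
    unversioned "volume" type.
    """
    version = (api_version or "").strip()
    if version.startswith("3"):
        return "volumev3"
    if version.startswith("2"):
        return "volumev2"
    return "volume"


if __name__ == "__main__":
    # Empty value, as found in the affected nagios.novarc.
    print(volume_service_type(""))
    # Value set by the workaround.
    print(volume_service_type("3"))
```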

sudeephb commented 6 months ago

(by gabrielcocenza) Interesting.

Can you provide the logs of the charm? Thanks

sudeephb commented 6 months ago

(by tyldum) Hitting same as #7, revision 42:

Done: juju config openstack-service-checks "os-credentials=volume_api_version=3" && check_cinder_service.py
Result: keystoneauth1.exceptions.catalog.EndpointNotFound: Could not find requested endpoint in Service Catalog.

Done: manually set OS_VOLUME_API_VERSION=3 && reboot && check_cinder_service.py
Result: Change reverted by charm upon reboot, no change: keystoneauth1.exceptions.catalog.EndpointNotFound: Could not find requested endpoint in Service Catalog.

Done: manually set OS_VOLUME_API_VERSION=3 && check_cinder_service.py (no reboot)
Result: OK: All cinder services happy

Charm logs only indicate:
2023-09-22 11:25:08 WARNING unit.openstack-service-checks/0.config-changed logger.go:60 warnings.warn('Using keystoneclient sessions has been deprecated. '
2023-09-22 11:25:08 WARNING unit.openstack-service-checks/0.config-changed logger.go:60 /usr/lib/python3/dist-packages/keystoneauth1/adapter.py:235: UserWarning: Using keystoneclient sessions has been deprecated. Please update your software to use keystoneauth1.