canonical / charm-openstack-service-checks

Collection of Nagios checks and other utilities that can be used to verify the operation of an OpenStack cluster

OSC shows ssl error with relations to keystone on identity-credentials #63

Closed sudeephb closed 7 months ago

sudeephb commented 7 months ago

During deployments of OpenStack with OSC, the charm goes into a blocked state after it is added, reporting an SSL error when connecting to Keystone.

Our LMA bundle has these relations to keystone:

But the charm still shows:

2021-04-25 03:59:29 ERROR juju-log Failed to create endpoint checks due issue communicating with Keystone. Error: Keystone ssl error when listing SSL exception connecting to https://keystone-internal.prodymcprodface.solutionsqa:35357/v3/auth/tokens: HTTPSConnectionPool(host='keystone-internal.prodymcprodface.solutionsqa', port=35357): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)'))): endpoints
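The `unable to get local issuer certificate` part of this error generally means the client has no CA certificate that chains to the one Keystone presents. A minimal sketch for checking that from the affected unit, assuming `openssl` is installed and the CA has been saved to a local PEM file. The helper names `endpoint_from_url` and `check_keystone_tls` and the path `./root-ca.pem` are hypothetical and not part of the charm:

```shell
# Sketch only: both helpers are hypothetical and not shipped with the charm.

# Reduce a URL such as https://host:port/path to host:port.
endpoint_from_url() {
    local url="$1"
    local rest="${url#*://}"      # drop the scheme
    printf '%s\n' "${rest%%/*}"   # drop the path, if any
}

# Return 0 if the TLS handshake with host:port verifies against the CA file.
check_keystone_tls() {
    local endpoint="$1" ca_file="$2"
    # -verify_return_error aborts the handshake when verification fails,
    # so the exit status of openssl reflects the verification result.
    openssl s_client -connect "$endpoint" -CAfile "$ca_file" \
        -verify_return_error </dev/null >/dev/null 2>&1
}

# Example usage (assumes ./root-ca.pem holds the CA that signed Keystone's certificate):
#   ep="$(endpoint_from_url 'https://keystone-internal.prodymcprodface.solutionsqa:35357/v3/auth/tokens')"
#   check_keystone_tls "$ep" ./root-ca.pem && echo "verified" || echo "verification failed"
```

If the check only succeeds once the right CA file is supplied, the failure comes from a missing trust anchor on the client side rather than from Keystone itself.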

The testrun artifacts can be found at: https://oil-jenkins.canonical.com/artifacts/20bb9614-219b-432e-9107-ebfa2a744af7/index.html
The crashdump with OSC can be found at: https://oil-jenkins.canonical.com/artifacts/20bb9614-219b-432e-9107-ebfa2a744af7/generated/generated/lmacmr/juju-crashdump-openstack-2021-04-25-03.59.27.tar.gz
The bundle deploying OSC can be found at: https://oil-jenkins.canonical.com/artifacts/20bb9614-219b-432e-9107-ebfa2a744af7/config/config/overlay_openstack-saas.yaml


Imported from Launchpad using lp2gh.

sudeephb commented 7 months ago

(by alexdodson) This happens for me on cs:~llama-charmers-next/openstack-service-checks-10 but not on cs:~llama-charmers-next/openstack-service-checks-9.

However, on version 10, which showed this error, I was able to source nagios.novarc from /var/lib/nagios and query the endpoints without problems.

sudeephb commented 7 months ago

(by valexby) The bundles referred to in the bug are no longer available, but I was able to reproduce the issue:

  1. Generated a model via the OpenStack model generation script[1]
  2. Deployed the OpenStack bundle[2], waited until it stabilized, and unsealed vault
  3. juju deploy cs:~canonical-bootstack/openstack-service-checks
  4. juju deploy nagios
  5. juju deploy nrpe
  6. juju add-relation ceph-osd nrpe
  7. juju add-relation nrpe:monitors nagios:monitors
  8. juju add-relation openstack-service-checks nrpe
  9. juju add-relation openstack-service-checks:identity-credentials keystone:identity-credentials

I got the same error as reported: https://pastebin.canonical.com/p/MMR2PT4v3q/

[1] https://github.com/openstack-charmers/charm-test-infra/blob/master/juju-openstack-controller-example.sh
[2] https://github.com/openstack-charmers/openstack-bundles.git

I am going to look into the cause of the bug.

sudeephb commented 7 months ago

(by valexby) Basically, there are two options for authorizing openstack-service-checks to use the OpenStack API: user/password authentication, or a relation to keystone:identity-credentials. If the second option is chosen, then the TLS certificate needs to be provided, like:

juju run-action  \
    --wait vault/0 get-root-ca --format json \
    | jq -r '."unit-vault-0".results.output' \
    | base64 -w 0 \
    | xargs -I {} juju config openstack-service-checks trusted_ssl_ca={}
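
The one-liner above can also be broken into steps so the certificate can be inspected before it is applied. A minimal sketch, assuming the root CA has already been saved to a PEM file. The file name `root-ca.pem` and the helper `encode_ca_file` are hypothetical and not part of the charm:

```shell
# Sketch only: encode_ca_file is a hypothetical helper, not part of the charm.

# Print a PEM file as single-line base64, the form passed to
# trusted_ssl_ca in the command above.
encode_ca_file() {
    local pem_file="$1"
    # Refuse input that does not contain a PEM certificate header.
    if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$pem_file"; then
        echo "error: $pem_file does not look like a PEM certificate" >&2
        return 1
    fi
    base64 -w 0 "$pem_file"
}

# Example usage (assumes the root CA was saved to ./root-ca.pem beforehand):
#   juju config openstack-service-checks trusted_ssl_ca="$(encode_ca_file ./root-ca.pem)"
```

Checking for the PEM header first catches the case where the action output was empty or had already been base64-encoded.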

So it is expected behavior that openstack-service-checks goes into an error state while waiting for the TLS certificate to be provided. I am going to open a "wishlist" bug to implement the "certificates" interface for the openstack-service-checks charm, so that it will be possible to simply add a relation with vault, without manual configuration. Alexander, do you think there is something else we could do in this bug besides the workaround provided above? Many thanks in advance!

Best Regards, Alex.

sudeephb commented 7 months ago

(by valexby) The "Wishlist" bug to address the root cause of the issue is here: https://bugs.launchpad.net/charm-openstack-service-checks/+bug/1999507

Best, Alex.