rackerlabs / genestack

Where Flex cloud brings infrastructures to where you are.
https://docs.rackspacecloud.com/
Apache License 2.0

Cluster name applied to kubernetes but not applied to any of the overrides #77

Open aedan opened 8 months ago

aedan commented 8 months ago

Describe the bug During the kubespray run, the cluster name from the inventory file is applied to Kubernetes, but when you deploy MySQL and the OpenStack services, they still refer to the cluster as cluster.local. The wiki also makes no mention of this change.

To Reproduce Steps to reproduce the behavior:

  1. Deploy genestack. You will notice

Expected behavior All affected files need to have default cluster name changed to match the one in the inventory file.

Additional context Here is a list of the files that have to be corrected:

kustomize/mariadb-operator/kustomization.yaml - clusterName needs to be changed
/opt/genestack/helm-configs/keystone/keystone-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/glance/glance-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/heat/heat-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/cinder/cinder-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/octavia/octavia-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/neutron/neutron-helm-overrides - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/nova/nova-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/placement/placement-helm-overrides.yaml - cluster_domain_suffix needs to be changed
/opt/genestack/helm-configs/horizon/horizon-helm-overrides.yaml - cluster_domain_suffix needs to be changed
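One way to apply the change to every override file at once is a small helper like the sketch below. It is only a sketch: the function name set_cluster_domain_suffix is hypothetical, and it assumes each file contains a line of the form "cluster_domain_suffix: cluster.local" and is named *-helm-overrides.yaml. Files that do not match that naming, and the clusterName value in the mariadb-operator kustomization, would still need to be edited by hand.

```shell
# Hypothetical helper (not part of genestack): replace the default
# cluster_domain_suffix in every *-helm-overrides.yaml file under a directory.
# Assumes the value appears as "cluster_domain_suffix: cluster.local".
set_cluster_domain_suffix() {
    config_dir="$1"   # e.g. /opt/genestack/helm-configs
    new_suffix="$2"   # e.g. the cluster name from the inventory file

    find "$config_dir" -type f -name '*-helm-overrides.yaml' |
    while IFS= read -r file; do
        # Keep a .bak copy of each file so the change can be reverted.
        sed -i.bak \
            "s/\(cluster_domain_suffix:[[:space:]]*\)cluster\.local/\1${new_suffix}/" \
            "$file"
    done
}
```

Calling set_cluster_domain_suffix /opt/genestack/helm-configs followed by the cluster name from the inventory would rewrite the files in place and leave the originals next to them with a .bak extension.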

For skyline the secrets creation has to be corrected:

kubectl --namespace openstack \
    create secret generic skyline-apiserver-secrets \
    --type Opaque \
    --from-literal=service-username="skyline" \
    --from-literal=service-password="$(< /dev/urandom tr -dc _A-Za-z0-9 | head -c${1:-32};echo;)" \
    --from-literal=service-domain="service" \
    --from-literal=service-project="service" \
    --from-literal=service-project-domain="service" \
    --from-literal=db-endpoint="mariadb-galera-primary.openstack.svc.<__FIX_ME__>" \
    --from-literal=db-name="skyline" \
    --from-literal=db-username="skyline" \
    --from-literal=db-password="$(< /dev/urandom tr -dc _A-Za-z0-9 | head -c${1:-32};echo;)" \
    --from-literal=secret-key="$(< /dev/urandom tr -dc _A-Za-z0-9 | head -c${1:-32};echo;)" \
    --from-literal=keystone-endpoint="http://keystone-api.openstack.svc.<__FIX_ME__>:5000" \
    --from-literal=default-region="RegionOne"

This file also needs to be corrected for skyline:

/opt/genestack/kustomize/skyline/base/ingress-apiserver.yaml - host definition at bottom
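To avoid typing the cluster domain twice in the skyline secret, the two endpoint values could be derived from a single argument. The sketch below is illustrative only: skyline_endpoints is a hypothetical helper, and the service names are the ones from the command above with the <__FIX_ME__> placeholder replaced by the argument.

```shell
# Hypothetical helper: print the two skyline endpoint values for a given
# cluster domain suffix, defaulting to cluster.local when none is passed.
skyline_endpoints() {
    suffix="${1:-cluster.local}"
    printf 'db-endpoint=mariadb-galera-primary.openstack.svc.%s\n' "$suffix"
    printf 'keystone-endpoint=http://keystone-api.openstack.svc.%s:5000\n' "$suffix"
}
```

The printed values can then be pasted into the matching --from-literal arguments.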

LukeRepko commented 8 months ago

There are comments in some or all of the override files (e.g. helm-configs/nova/nova-helm-overrides.yaml) that suggest cluster_domain_suffix can be overridden by environment variables. At first glance, I don't see how or where that would be done. There is k8s_cluster[vars][cluster_name] in the inventory, but I think that's only tangentially related. This might explain why cluster.local still exists in the SJC cluster despite our using specified FQDNs everywhere we thought we needed to.
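A quick way to see which configuration files still carry the default is a recursive search for the literal string. This is a minimal sketch; list_default_suffix_refs is a hypothetical name, and the directory to search is whatever holds the override and kustomize files.

```shell
# Hypothetical helper: list every line under a directory that still
# mentions the default cluster domain, with file name and line number.
list_default_suffix_refs() {
    search_dir="$1"   # e.g. /opt/genestack
    # -F treats the pattern as a fixed string, so the dot is not a wildcard.
    # "|| true" keeps the function from failing when nothing matches.
    grep -rnF 'cluster.local' "$search_dir" || true
}
```

An empty result would suggest the files on disk are clean, although it says nothing about values already rendered into a running cluster.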

awfabian-rs commented 8 months ago

I managed to deploy keystone on my test cluster as below. I copied the command from the genestack wiki, and de-indented the line I added:

helm upgrade --install keystone ./keystone \
    --namespace=openstack \
    --wait \
    --timeout 120m \
    -f /opt/genestack/helm-configs/keystone/keystone-helm-overrides.yaml \
--set endpoints.cluster_domain_suffix=testenv4.flex.awfabian.com \
    --set endpoints.identity.auth.admin.password="$(kubectl --namespace openstack get secret keystone-admin -o jsonpath='{.data.password}' | base64 -d)" \
    --set endpoints.oslo_db.auth.admin.password="$(kubectl --namespace openstack get secret mariadb -o jsonpath='{.data.root-password}' | base64 -d)" \
    --set endpoints.oslo_db.auth.keystone.password="$(kubectl --namespace openstack get secret keystone-db-password -o jsonpath='{.data.password}' | base64 -d)" \
    --set endpoints.oslo_messaging.auth.admin.password="$(kubectl --namespace openstack get secret rabbitmq-default-user -o jsonpath='{.data.password}' | base64 -d)" \
    --set endpoints.oslo_messaging.auth.keystone.password="$(kubectl --namespace openstack get secret keystone-rabbitmq-password -o jsonpath='{.data.password}' | base64 -d)" \
    --post-renderer /opt/genestack/kustomize/kustomize.sh \
    --post-renderer-args keystone/base

keystone failed without that line because my cluster name is not the default. The OpenStack charts generally all appear to keep cluster_domain_suffix at endpoints.cluster_domain_suffix.

I think the wiki should apply a second override file on the command for each OpenStack chart deployment, like keystone above: add -f /opt/genestack/helm-configs/prod-example-openstack-overrides.yaml after the first -f file, so that the second file overrides any values from the first. That example file would then set default.cluster_domain_suffix: cluster.local, uncommented, with most of its other lines commented out. Generally, I would like the documentation to pass this file with -f everywhere it makes sense, with as many reasonable defaults already uncommented as possible and the rest of the file commented out, so that it actually works.
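A minimal sketch of the second-file idea follows. The helper name write_cluster_overrides is hypothetical, and the key is written under endpoints because that is where the charts appear to read it, which is an assumption rather than something confirmed for every chart.

```shell
# Hypothetical helper: write a small override file that only sets the
# cluster domain suffix (assumed to live at endpoints.cluster_domain_suffix).
write_cluster_overrides() {
    out_file="$1"
    suffix="${2:-cluster.local}"
    cat > "$out_file" <<EOF
endpoints:
  cluster_domain_suffix: ${suffix}
EOF
}

# The file would then be passed after the chart-specific overrides, e.g.:
#   helm upgrade --install keystone ./keystone \
#       -f /opt/genestack/helm-configs/keystone/keystone-helm-overrides.yaml \
#       -f /opt/genestack/helm-configs/prod-example-openstack-overrides.yaml \
#       ...
```

Helm gives priority to the right-most -f file, so values in the second file win over the same keys in the first.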

We could also direct users to set CLUSTER_NAME at the top of that wiki page and put --set endpoints.cluster_domain_suffix=${CLUSTER_NAME:-cluster.local} on each OpenStack chart deployment command, but that seems a little more hacky than using -f with a file that contains the cluster name, and nothing in the wiki uses environment variables this way so far.
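For reference, the ${CLUSTER_NAME:-cluster.local} expansion falls back to cluster.local whenever CLUSTER_NAME is unset or empty. The sketch below only builds and prints the flag so the behavior can be checked without running helm; suffix_flag is a hypothetical name.

```shell
# Hypothetical helper: print the --set flag that would be added to each
# chart deployment command, using CLUSTER_NAME when it is set and non-empty.
suffix_flag() {
    printf '%s\n' "--set endpoints.cluster_domain_suffix=${CLUSTER_NAME:-cluster.local}"
}
```

With CLUSTER_NAME unset, the flag carries cluster.local, so existing default deployments would behave as before.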