ansible / awx-operator

An Ansible AWX operator for Kubernetes built with Operator SDK and Ansible. 🤖
https://www.github.com/ansible/awx
Apache License 2.0

Config changes from AWX Custom Resource do not translate to AWX Deployment #1275

Open danielmorillas opened 1 year ago

danielmorillas commented 1 year ago

Bug Summary

This is a continuation of https://github.com/ansible/awx-operator/issues/1239. I tested the new version, v1.3.0, following the same steps and got the same result.

In the end, helm upgrade does not detect any update to the AWX kind object. Changes are only picked up if you manually delete the awx deployment, which then respins the awx pod with the new changes. This should happen automatically whenever the AWX object changes.

I am not adding more info because it is exactly the same issue as in the link above. I am opening a new issue, as @rooftopcellist suggested I do if their patch did not work.

AWX Operator version

1.3.0

AWX version

21.13.0

Kubernetes platform

kubernetes

Kubernetes/Platform version

1.23 kind v0.11.0 go1.16.4 linux/amd64

Modifications

no

Steps to reproduce

Same as in https://github.com/ansible/awx-operator/issues/1239

Expected results

Same as in https://github.com/ansible/awx-operator/issues/1239

Actual results

Same as in https://github.com/ansible/awx-operator/issues/1239

Additional information

https://github.com/ansible/awx-operator/issues/1239

Operator Logs

https://github.com/ansible/awx-operator/issues/1239

rooftopcellist commented 1 year ago

@danielmorillas I tested this out locally deploying with the 1.2.0 awx-operator helm chart, then upgrading to 1.3.0, and it looks like the AWX object's extra_settings does get updated. Here are the steps I took. Note that AWX.enabled must be set to true or else it's not going to be templated (I ran into that headache...).

Create a myvalues.yaml file:

$ cat myvalues.yaml
AWX:
  # enable use of awx-deploy template
  enabled: true
  name: awx
  spec:
    admin_user: admin
    extra_settings:
    - setting: MAX_PAGE_SIZE
      value: "222"

Install an older version of the awx-operator:

helm install my-awx-operator awx-operator/awx-operator -n awx --create-namespace -f myvalues.yaml --version 1.2.0

Wait for AWX to fully deploy.

Check to see that MAX_PAGE_SIZE is set correctly on the AWX CR's spec.

$ oc get awx -o yaml | grep MAX_PAGE_SIZE -A1
    - setting: MAX_PAGE_SIZE
      value: "222"

Modify myvalues.yaml so the MAX_PAGE_SIZE setting has a value of "333"

helm upgrade my-awx-operator awx-operator/awx-operator -n awx --create-namespace -f myvalues.yaml --version 1.3.0

Now check to see that the value was updated on the spec:

$ oc get awx -o yaml | grep MAX_PAGE_SIZE -A1
    - setting: MAX_PAGE_SIZE
      value: "333"

danielmorillas commented 1 year ago

Hey,

The AWX object is certainly updated: if you run helm upgrade with the "--debug" option, you will see at the end that the AWX object includes the new changes. However, helm upgrade does not update any template (--debug shows this as well).

Instead of that change, I suggest testing with a change that you can then see in the UI: add the parameters image (quay.io/ansible/awx) and image_version (21.12.0) to values.yaml in place of the default image version (21.13.0). Your deployment won't pick up that change until you manually delete the deployment, at which point a new pod respins with the change. helm upgrade should do this automatically.

Note: If you don't set the "image" parameter explicitly (rather than relying on its default value), the "image_version" parameter won't be updated. I tested this some time ago and had to dig into the awx templates; both parameters are needed in values.yaml.
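
As a rough sketch, the values file for this test could be written as follows. The file name is a hypothetical choice, and the layout under AWX is assumed to mirror the values file shown earlier in this thread; only the image and image_version values come from the comment above.

```shell
# Write a values file for the image / image_version test.
# VALUES_FILE is a hypothetical name; override it if you prefer another path.
VALUES_FILE="${VALUES_FILE:-image-test-values.yaml}"

cat > "$VALUES_FILE" <<'EOF'
AWX:
  # enable use of awx-deploy template
  enabled: true
  name: awx
  spec:
    # both keys are set explicitly, as the note above recommends
    image: quay.io/ansible/awx
    image_version: "21.12.0"
EOF

# Then upgrade with it, for example (release and chart names taken from the
# earlier comment, adjust to your setup):
#   helm upgrade my-awx-operator awx-operator/awx-operator -n awx -f "$VALUES_FILE"
```

If the operator rolled the change out, the running pod would be expected to use the 21.12.0 image after the upgrade.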

The same happens when updating, for example, LDAP values: the AWX object is updated, but its configmap is not, and no change reaches the AWX instance.

Note: I run this deployment in a GitOps environment using fluxCD. After committing, I noticed that none of the changes I wanted were applied, and after digging into it I realized that the problem was in the helm upgrade command that fluxCD executes to apply changes. For that reason I ran helm upgrade directly to see what happens. Helm upgrade checks whether there are changes in its templates. For example:

root@dmorillas-dev:~/dmorillas/awx-iac-gitops# helm upgrade awx -n dev-ops-playground helm-chart-awx-operator/ -f helm-chart-awx-operator/values.yaml --debug
upgrade.go:142: [debug] preparing upgrade for awx
upgrade.go:150: [debug] performing update for awx
upgrade.go:322: [debug] creating upgraded release for awx
client.go:218: [debug] checking 12 resources for changes
client.go:501: [debug] Looks like there are no changes for ServiceAccount "awx-operator-controller-manager"
client.go:501: [debug] Looks like there are no changes for ConfigMap "awx-operator-awx-manager-config"
client.go:501: [debug] Looks like there are no changes for ClusterRole "awx-operator-metrics-reader"
client.go:501: [debug] Looks like there are no changes for ClusterRole "awx-operator-proxy-role"
client.go:501: [debug] Looks like there are no changes for ClusterRoleBinding "awx-operator-proxy-rolebinding"
client.go:510: [debug] Patch Role "awx-operator-awx-manager-role" in namespace dev-ops-playground
client.go:501: [debug] Looks like there are no changes for Role "awx-operator-leader-election-role"
client.go:501: [debug] Looks like there are no changes for RoleBinding "awx-operator-awx-manager-rolebinding"
client.go:501: [debug] Looks like there are no changes for RoleBinding "awx-operator-leader-election-rolebinding"
client.go:501: [debug] Looks like there are no changes for Service "awx-operator-controller-manager-metrics-service"
client.go:510: [debug] Patch Deployment "awx-operator-controller-manager" in namespace dev-ops-playground
client.go:510: [debug] Patch AWX "awx" in namespace dev-ops-playground
upgrade.go:157: [debug] updating status for upgraded release for awx
Release "awx" has been upgraded. Happy Helming!
...
# At the end of this output, the --debug option shows how the AWX object was updated

So it is checking 12 resources for changes... If I check the templates, I see 13 templates (awx-deploy.yaml is the only one that is not being checked by the helm upgrade command above; could that be the issue?):

root@dmorillas-dev:~/dmorillas/awx-iac-gitops/helm-chart-awx-operator/templates# ls -l
total 60
-rw-r--r-- 1 root root   53 Mar  6 19:51 NOTES.txt
-rw-r--r-- 1 root root  230 Mar  6 19:51 _helpers.tpl
-rw-r--r-- 1 root root  756 Mar  6 19:51 awx-deploy.yaml
-rw-r--r-- 1 root root  215 Mar  6 19:51 clusterrole-awx-operator-metrics-reader.yaml
-rw-r--r-- 1 root root  371 Mar  6 19:51 clusterrole-awx-operator-proxy-role.yaml
-rw-r--r-- 1 root root  375 Mar  6 19:51 clusterrolebinding-awx-operator-proxy-rolebinding.yaml
-rw-r--r-- 1 root root 1138 Mar  6 19:51 configmap-awx-operator-awx-manager-config.yaml
-rw-r--r-- 1 root root 2529 Mar  7 08:09 deployment-awx-operator-controller-manager.yaml
-rw-r--r-- 1 root root  445 Mar  6 19:51 postgres-config.yaml
-rw-r--r-- 1 root root 1805 Mar  6 19:51 role-awx-operator-awx-manager-role.yaml
-rw-r--r-- 1 root root  600 Mar  6 19:51 role-awx-operator-leader-election-role.yaml
-rw-r--r-- 1 root root  373 Mar  6 19:51 rolebinding-awx-operator-awx-manager-rolebinding.yaml
-rw-r--r-- 1 root root  381 Mar  6 19:51 rolebinding-awx-operator-leader-election-rolebinding.yaml
-rw-r--r-- 1 root root  351 Mar  6 19:51 service-awx-operator-controller-manager-metrics-service.yaml
-rw-r--r-- 1 root root  128 Mar  6 19:51 serviceaccount-awx-operator-controller-manager.yaml

So the AWX object is updated, but this does not trigger any change in the currently deployed AWX instance.
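
To make it easier to compare the 12 checked resources against the template files, the debug log can be reduced to one line per resource. The helper below is only a sketch with a hypothetical name (list_checked); it assumes the log lines keep the two shapes seen in the output above. Note that the log above does contain a Patch AWX "awx" line, so the AWX resource itself is among the 12; the template with no matching line appears to be postgres-config.yaml.

```shell
# list_checked: read `helm upgrade --debug` output on stdin and print one line
# per resource, as "unchanged Kind/name" or "patched Kind/name".
# Lines that match neither shape are dropped.
list_checked() {
  sed -n \
    -e 's|.*no changes for \([A-Za-z]*\) "\([^"]*\)".*|unchanged \1/\2|p' \
    -e 's|.* Patch \([A-Za-z]*\) "\([^"]*\)".*|patched \1/\2|p'
}

# Example with three lines copied from the log above.
list_checked <<'EOF'
client.go:501: [debug] Looks like there are no changes for ServiceAccount "awx-operator-controller-manager"
client.go:510: [debug] Patch Deployment "awx-operator-controller-manager" in namespace dev-ops-playground
client.go:510: [debug] Patch AWX "awx" in namespace dev-ops-playground
EOF
```

Piping the full output of helm upgrade ... --debug through list_checked (helm may print debug lines on stderr, so 2>&1 may be needed) gives a list that can be matched by hand against the files in templates/.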

I hope this clarifies my issue :)

erlisb commented 1 year ago

@danielmorillas, @rooftopcellist this issue still persists in an OpenShift environment, with the AWX Operator installed through Operator Hub.

I am trying to add new organizations using the following LDAP config. The cm is updated, but a new Deployment is not rolled out:

    - setting: AUTH_LDAP_ORGANIZATION_MAP
      value: {
        "ORG1": {
          "users": ["CN=XXXXX,OU=APP,OU=Administration,OU=XXXX,OU=Company,DC=xxx,DC=xxx,DC=xx"],
          "admins": ["CN=XXXXX,OU=APP,OU=Administration,OU=XXXX,OU=Company,DC=xxx,DC=xxx,DC=xx"],
          "remove_admins": true
        }
      }

miles-w-3 commented 1 year ago

I think it's important to separate out helm's role here versus the operator itself. The helm awx-deploy template is responsible only for creating the actual resource; it doesn't do anything special to update the Deployment, which is managed by the controller. I'm not sure about your second output, where helm isn't explicitly listing the awx-deploy resource, but am I correct that you and @rooftopcellist agree that the AWX object itself is updated, even if you don't see the changes propagating to the resources managed by the operator?

danielmorillas commented 1 year ago

Hey, yeah, you are right. I think it is updated as expected, but there is no rollout from the controller, and I guess there should be one once the AWX resource is updated. The second output only shows the current templates for the helm chart.

2and3makes23 commented 1 year ago

We have the same issue, running Operator 1.4.0 in a Kubernetes 1.24.6 cluster.

Rename issue / Open new issue?

We are using kustomize instead of helm, so I assume it has nothing to do with either of them. Maybe it would make sense to rename this issue? (Or should I open a new one?)

Reproduce problem

At the moment we deploy our AWX instances (running in an OpenShift cluster) via

oc kustomize our_awx_configs/some_env | oc apply -f -

All resources are deployed as expected. When updating values like

spec:
  redis_resource_requirements:
    limits:
      cpu: 107m # new incremented value
      memory: 208Mi # new incremented value

and repeating our ... | oc apply -f -, we noticed that those new values are updated within the awx custom resource but are not propagated to the deployment resource:

# Custom resource is configured properly
$ oc get awx awx -oyaml | grep -E '(106m|107m|207Mi|208Mi)$'
      cpu: 107m
      memory: 208Mi
# Deployment does not get incremented values
$ oc get deployment awx -oyaml | grep -E '(106m|107m|207Mi|208Mi)$'
            cpu: 106m
            memory: 207Mi

As long as the deployment resource remains untouched by our settings update, of course there is no trigger for a rolling update of pods.
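
The manual grep comparison above could be wrapped in a small check. This is a naive sketch with hypothetical helper names (first_value, check_drift): it compares only the first occurrence of a key in each saved YAML dump, so it assumes that the first cpu: or memory: entry in each file is the one of interest. A real check would address the exact field path instead.

```shell
# first_value KEY: print the value of the first "KEY: value" line on stdin.
first_value() {
  awk -v key="$1:" '$1 == key { print $2; exit }'
}

# check_drift KEY CR_FILE DEPLOY_FILE: compare the first KEY value found in a
# saved custom resource dump with the one in a saved Deployment dump.
# Returns non-zero when the two values differ.
check_drift() {
  cr=$(first_value "$1" < "$2")
  dep=$(first_value "$1" < "$3")
  if [ "$cr" = "$dep" ]; then
    echo "$1 in sync: $cr"
  else
    echo "$1 drift: CR=$cr Deployment=$dep"
    return 1
  fi
}

# Possible usage, with the resource names from the commands above:
#   oc get awx awx -oyaml > cr.yaml
#   oc get deployment awx -oyaml > deploy.yaml
#   check_drift cpu cr.yaml deploy.yaml
#   check_drift memory cr.yaml deploy.yaml
```

With the values shown above, the cpu check would report CR=107m against Deployment=106m.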

Our messy workaround

We have been relying on forcing the recreation of most resources to make sure every new setting really is propagated to our AWX pods. For obvious reasons, we want to switch to a more Kubernetes-native way of updating resource properties (rolling update strategy). So we are very interested in a solution and are happy to provide more info if needed.

2and3makes23 commented 1 year ago

@danielmorillas thanks for your thumbsup - if you agree, could you rename this issue to 'Config changes from AWX Custom Resource do not translate to AWX Deployment' (or something similar), so the title really says what this is about? :)

danielmorillas commented 1 year ago

Hey,

Is there any news about this issue, or does anyone know whether this is something that is going to be added?

danielmorillas commented 1 year ago

Hey,

Is anyone working on this? We are stuck because of it, and it would be great to know the future plans.

Thanks!

2and3makes23 commented 1 year ago

We are still hoping to get this feature as well. :-/