[X] I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.
Bug Summary
New deployment of AWX Operator version 2.16.1 using kustomize on an existing EKS cluster.
The exact same deployment with version 2.10.0 works perfectly.
The deployment is stuck with the awx-web container in CrashLoopBackOff.
$ kubectl describe pods -n awx awx-dev-web-c48c45544-ffqkw
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 23m default-scheduler Successfully assigned awx/awx-dev-web-c48c45544-ffqkw to ip-10-167-0-76.ec2.internal
Normal Pulled 23m kubelet Container image "quay.io/ansible/awx-ee:24.3.1" already present on machine
Normal Created 23m kubelet Created container init
Normal Started 23m kubelet Started container init
Normal Pulled 23m kubelet Container image "quay.io/centos/centos:stream9" already present on machine
Normal Created 23m kubelet Created container init-projects
Normal Started 23m kubelet Started container init-projects
Normal Created 23m kubelet Created container redis
Normal Pulled 23m kubelet Container image "docker.io/redis:7" already present on machine
Normal Started 23m kubelet Started container redis
Normal Pulled 23m kubelet Container image "quay.io/ansible/awx:24.3.1" already present on machine
Normal Created 23m kubelet Created container awx-dev-rsyslog
Normal Started 23m kubelet Started container awx-dev-rsyslog
Normal Created 22m (x3 over 23m) kubelet Created container awx-dev-web
Normal Started 22m (x3 over 23m) kubelet Started container awx-dev-web
Normal Pulled 21m (x4 over 23m) kubelet Container image "quay.io/ansible/awx:24.3.1" already present on machine
Warning BackOff 3m35s (x75 over 22m) kubelet Back-off restarting failed container awx-dev-web in pod awx-dev-web-c48c45544-ffqkw_awx(6bf702c0-0617-48ed-b3dc-a9adb1d2ff46)
In the operator logs I get this message:
...
TASK [installer : Get the new resource pod information after updating resource.] ***
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:258
skipping: [localhost] => {\"changed\": false, \"false_condition\": \"this_deployment_result.changed\", \"skip_reason\": \"Conditional result was False\"}
TASK [installer : Update new resource pod as a variable.] **********************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:275
skipping: [localhost] => {\"changed\": false, \"false_condition\": \"this_deployment_result.changed\", \"skip_reason\": \"Conditional result was False\"}
TASK [installer : Update new resource pod name as a variable.] *****************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:283
skipping: [localhost] => {\"changed\": false, \"false_condition\": \"this_deployment_result.changed\", \"skip_reason\": \"Conditional result was False\"}
TASK [installer : Verify the resource pod name is populated.] ******************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:289
ok: [localhost] => {
\"changed\": false,
\"msg\": \"All assertions passed\"
}
TASK [installer : Migrate database to the latest schema] ***********************
task path: /opt/ansible/roles/installer/tasks/install.yml:97
included: /opt/ansible/roles/installer/tasks/migrate_schema.yml for localhost
TASK [installer : Check for pending migrations] ********************************
task path: /opt/ansible/roles/installer/tasks/migrate_schema.yml:3
fatal: [localhost]: FAILED! => {\"changed\": false, \"msg\": \"Failed to execute on pod awx-dev-web-c48c45544-ffqkw due to : (0)\
Reason: Handshake status 500 Internal Server Error -+-+- {'content-length': '35', 'content-type': 'text/plain; charset=utf-8', 'date': 'Tue, 21 May 2024 15:07:16 GMT'} -+-+- b'container not found ("awx-dev-web")'\
\"}
PLAY RECAP *********************************************************************
localhost : ok=71 changed=0 unreachable=0 failed=1 skipped=68 rescued=0 ignored=0
","job":"6881205681729212860","name":"awx-dev","namespace":"awx","error":"exit status 2","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/ansible/runner.(*runner).Run.func1
\tansible-operator-plugins/internal/ansible/runner/runner.go:269"
----- Ansible Task Status Event StdOut (awx.ansible.com/v1beta1, Kind=AWX, awx-dev/awx) -----
PLAY RECAP *********************************************************************
localhost : ok=71 changed=0 unreachable=0 failed=1 skipped=68 rescued=0 ignored=0
----------
{"level":"error","ts":"2024-05-21T15:07:16Z","msg":"Reconciler error","controller":"awx-controller","object":{"name":"awx-dev","namespace":"awx"},"namespace":"awx","name":"awx-dev","reconcileID":"76492afe-ee5f-46c4-9f6d-0f7a821852a4","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-05-21T15:07:17Z","logger":"logging_event_handler","msg":"[playbook task start]","name":"awx-dev","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"3631807449646318833","EventData.Name":"Verify imagePullSecrets"}
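The `container not found ("awx-dev-web")` failure in the operator log is most likely a symptom rather than the cause: the `Check for pending migrations` task appears to exec into the web pod, and that container is in back-off at the time. The web container's own output from its last crashed run would be the next thing to look at. Below is a minimal sketch of how that could be collected. It is not part of the original report. The function names `previous_logs` and `last_state` are made up for illustration, and only standard `kubectl logs` and `kubectl get -o jsonpath` options are used.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a crash-looping container.
# Arguments for both functions: namespace, pod name, container name.

# Print the tail of the logs from the previous (crashed) container instance.
previous_logs() {
  local ns="$1" pod="$2" container="$3"
  kubectl logs -n "$ns" "$pod" -c "$container" --previous --tail=200
}

# Print the recorded termination details (exit code, reason) of the last run.
last_state() {
  local ns="$1" pod="$2" container="$3"
  kubectl get pod -n "$ns" "$pod" \
    -o jsonpath="{.status.containerStatuses[?(@.name==\"$container\")].lastState.terminated}"
}

# Example with the names from this report:
#   previous_logs awx awx-dev-web-c48c45544-ffqkw awx-dev-web
#   last_state    awx awx-dev-web-c48c45544-ffqkw awx-dev-web
```

The pod name changes whenever the deployment is rolled, so it would need to be replaced with the current one from `kubectl get pods -n awx`.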
AWX Operator version
2.16.1
AWX version
24.3.1
Kubernetes platform
kubernetes
Kubernetes/Platform version
1.29
Modifications
yes
Steps to reproduce
An EKS cluster already exists with the CSI drivers for EFS and EBS, and ALB.
All components are updated to their latest versions.
All EFS filesystems have been cleared of any data.
$ kubectl apply -k .
Expected results
Running AWX with web access to console.
Actual results
The deployment is stuck with the awx-web container in CrashLoopBackOff.
$ kubectl describe pods -n awx awx-dev-web-c48c45544-ffqkw
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 23m default-scheduler Successfully assigned awx/awx-dev-web-c48c45544-ffqkw to ip-10-167-0-76.ec2.internal
Normal Pulled 23m kubelet Container image "quay.io/ansible/awx-ee:24.3.1" already present on machine
Normal Created 23m kubelet Created container init
Normal Started 23m kubelet Started container init
Normal Pulled 23m kubelet Container image "quay.io/centos/centos:stream9" already present on machine
Normal Created 23m kubelet Created container init-projects
Normal Started 23m kubelet Started container init-projects
Normal Created 23m kubelet Created container redis
Normal Pulled 23m kubelet Container image "docker.io/redis:7" already present on machine
Normal Started 23m kubelet Started container redis
Normal Pulled 23m kubelet Container image "quay.io/ansible/awx:24.3.1" already present on machine
Normal Created 23m kubelet Created container awx-dev-rsyslog
Normal Started 23m kubelet Started container awx-dev-rsyslog
Normal Created 22m (x3 over 23m) kubelet Created container awx-dev-web
Normal Started 22m (x3 over 23m) kubelet Started container awx-dev-web
Normal Pulled 21m (x4 over 23m) kubelet Container image "quay.io/ansible/awx:24.3.1" already present on machine
Warning BackOff 3m35s (x75 over 22m) kubelet Back-off restarting failed container awx-dev-web in pod awx-dev-web-c48c45544-ffqkw_awx(6bf702c0-0617-48ed-b3dc-a9adb1d2ff46)
Additional information
Customized awx-ee:
Customized resources:
Operator Logs