Open Arjunasvr opened 1 week ago
To confirm, you have the operator running within the same namespace, right?
Yes it is.
@Arjunasvr Could you share the events for the pod related to awx-task-XXXXXXXX?
kubectl -n awx describe pod awx-task-XXXXXXXX
I suspect your PVC is pointing to an un-shareable volume and getting deleted.
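A minimal sketch of how the PVC suspicion above could be checked even when no awx-task pod exists to describe. The function names `pvc_access_modes` and `recent_events` are hypothetical helpers, and the `awx` namespace default is taken from the commands in this thread; only standard `kubectl get` options are used.

```shell
# Hypothetical helpers (not part of awx-operator or kubectl).

# List every PVC in the namespace with its access modes, bound volume and phase.
# A ReadWriteOnce volume can only be mounted read-write by a single node.
pvc_access_modes() {
  ns="${1:-awx}"
  kubectl -n "$ns" get pvc \
    -o 'custom-columns=NAME:.metadata.name,MODES:.spec.accessModes[*],VOLUME:.spec.volumeName,STATUS:.status.phase'
}

# Show namespace events ordered by time; useful when the pod itself is gone.
recent_events() {
  ns="${1:-awx}"
  kubectl -n "$ns" get events --sort-by=.lastTimestamp
}

# Example usage (run manually against your cluster):
#   pvc_access_modes awx
#   recent_events awx
```

If the MODES column shows only ReadWriteOnce for a claim that several pods need, that would be consistent with the un-shareable volume theory, though it does not prove it.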
Hey @Arjunasvr ,
we encountered an issue that could help you. In our scenario the configs (CRDs) weren't updated and 'web_manage_replicas' was undefined. The operator logs during the upgrade show this error:
TASK [Apply deployment resources] ******************************** fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'web_manage_replicas' is undefined. 'web_manage_replicas' is undefined. 'web_manage_replicas' is undefined. 'web_manage_replicas' is undefined\n\nThe error appears to be in '/opt/ansible/roles/installer/tasks/resources_configuration.yml': line 248, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Apply deployment resources\n ^ here\n"}
After we executed:
kubectl apply --server-side -k "github.com/ansible/awx-operator/config/crd?ref=2.19.0"
The migration started right away.
See also:
https://github.com/ansible/awx-operator/commit/8ead140541622f67bd2d44a3c76bb05739cdebb6#diff-8230d07440a5d33c9608211b63791ef41f935652ca8b8ec3d9f3c68b5ed8cc98
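As a rough sketch, one could check whether the installed CRD already knows about `web_manage_replicas` before re-applying it. The helper name `crd_has_field` is hypothetical, and the CRD name `awxs.awx.ansible.com` is an assumption inferred from the `awx.ansible.com/v1beta1, Kind=AWX` resource in the operator logs; verify it with `kubectl get crd` first.

```shell
# Hypothetical helper (not part of awx-operator or kubectl).
# Succeeds if the CRD's YAML contains a key with the given field name.
# Default CRD name is an assumption; pass another name as the second argument.
crd_has_field() {
  field="$1"
  crd="${2:-awxs.awx.ansible.com}"
  kubectl get crd "$crd" -o yaml | grep -Eq "^[[:space:]]*${field}:"
}

# Example usage (run manually against your cluster):
#   crd_has_field web_manage_replicas \
#     || kubectl apply --server-side -k "github.com/ansible/awx-operator/config/crd?ref=2.19.0"
```

This only tells you whether the field name appears in the CRD schema; it does not confirm that the operator has picked up the change.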
I'm sorry, I can't do this because there is no awx-task pod.
Hi, I tried this and it didn't work. I checked the logs from the operator pod and saw this error:
5788921982606687203","name":"awx-server","namespace":"awx","error":"exit status 2","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/ansible/runner.(*runner).Run.func1\n\tansible-operator-plugins/internal/ansible/runner/runner.go:269"}
And also I saw this:
TASK [installer : Stream backup from pg_dump to the new postgresql container] ***
task path: /opt/ansible/roles/installer/tasks/upgrade_postgres.yml:99
{"level":"info","ts":"2024-07-02T06:55:23Z","logger":"logging_event_handler","msg":"[playbook task start]","name":"awx-server","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"231178893729865755","EventData.Name":"installer : Stream backup from pg_dump to the new postgresql container"}
{"level":"info","ts":"2024-07-02T06:55:23Z","logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/api/v1/namespaces/awx/pods/awx-server-postgres-15-0","Verb":"get","APIPrefix":"api","APIGroup":"","APIVersion":"v1","Namespace":"awx","Resource":"pods","Subresource":"","Name":"awx-server-postgres-15-0","Parts":["pods","awx-server-postgres-15-0"]}}
--------------------------- Ansible Task StdOut -------------------------------
TASK [Stream backup from pg_dump to the new postgresql container] **** fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}
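Since the operator log is noisy, a small filter can help pull out only the failing Ansible tasks. This is a sketch, not an official tool: `failed_tasks` and `operator_failures` are hypothetical names, and the deployment name `awx-operator-controller-manager` is an assumption about a default install, so adjust it to whatever `kubectl -n awx get deploy` shows.

```shell
# Hypothetical helpers (not part of awx-operator or kubectl).

# Read log text on stdin; print each "fatal:" line preceded by the most
# recent "TASK [...]" header (unless both are on the same line).
failed_tasks() {
  awk '
    /TASK \[/ { task = $0 }
    /fatal: / {
      if (task != "" && task != $0) print task
      print $0
      task = ""
    }
  '
}

# Fetch recent operator logs and keep only the failing tasks.
# The deployment name below is an assumption; change it if yours differs.
operator_failures() {
  ns="${1:-awx}"
  kubectl -n "$ns" logs deployment/awx-operator-controller-manager \
    --all-containers --tail=2000 | failed_tasks
}

# Example usage (run manually against your cluster):
#   operator_failures awx
```

With `no_log: true` in effect the fatal line will still be censored, so this narrows down which task failed rather than why.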
Does anyone have a new idea?
Was it on Minikube, or a limited hardware setup?
In my experience, even on normal (not minimal) hardware with k8s, it usually takes between 40 and 60 minutes for the awx-task-XXXXXXXXX pods to appear. Feel free to try on other hardware. Good luck.
It was on minikube indeed. Normally the awx-task-xxx pod spins up in 5 to 10 minutes. I even left the upgrade running for more than 2 days, and even then the task and web pods wouldn't show up when I ran kubectl get pods -n awx
@Arjunasvr Can you set no_log: False in your AWX spec? That way the operator shows more details about what is failing.
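For anyone following along, here is one possible way to apply that setting without editing the manifest by hand. It is only a sketch: `no_log_patch` and `set_no_log` are hypothetical helpers, the resource name `awx-server` is taken from the operator logs earlier in this thread, and whether `no_log` expects a boolean or a string may depend on the operator version, so check your CRD.

```shell
# Hypothetical helpers (not part of awx-operator or kubectl).

# Print a JSON merge patch that sets spec.no_log to the given value
# (default: false). The value is inserted unquoted, i.e. as a JSON boolean.
no_log_patch() {
  printf '{"spec":{"no_log":%s}}\n' "${1:-false}"
}

# Apply the patch to the named AWX custom resource.
set_no_log() {
  name="$1"
  ns="${2:-awx}"
  value="${3:-false}"
  kubectl -n "$ns" patch awx "$name" --type merge -p "$(no_log_patch "$value")"
}

# Example usage (run manually against your cluster):
#   set_no_log awx-server awx false
```

After the patch, the operator should re-run its reconcile loop, and the previously censored task output should appear in the operator log if the setting took effect.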
Hi @fosterseth, I did, but there is no change in the pod log. I'm still getting the same errors.
Bug Summary
I tried upgrading to version 2.19.0, but the task and web pods no longer exist. I can no longer access the web UI. In minikube I cannot see the pods running; they just vanished. Also, when I try to downgrade to 2.12.0, the task container no longer works. Can someone please help me get AWX up and running again?
AWX version
operator 2.19.0
Select the relevant components
Installation method
minikube
Modifications
no
Ansible version
No response
Operating system
ubuntu 22.04 lts
Web browser
Firefox, Chrome, Safari, Edge
Steps to reproduce
upgrade to awx 2.19.0 and wait
Expected results
The AWX UI is shown and containers such as task and web are running.
Actual results
The task and web containers are not running and do not show up among the pods in the namespace.
Additional information
No response