Project Sync fails on the default Control Plane Execution Environment after reapplying the AWX Operator and AWX CRD in the cluster, while other tasks such as ping or win_ping still work #1882
[X] I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.
Bug Summary
I recently reapplied the AWX Operator and the AWX CRD in a K3s cluster (without any configuration changes). Since then, Project Sync fails on the Control Plane Execution Environment, while other tasks such as ping or win_ping still work correctly. This setup had been working fine for the last few months.
For example, whenever I run an AWX project sync or a Demo Project sync, it fails with the following error:
"module_stdout": "",
"module_stderr": "",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 137,
"_ansible_no_log": false,
"changed": false
}
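A note on reading `"rc": 137`: by shell convention, a return code above 128 means the process was terminated by signal `rc - 128`, so 137 corresponds to signal 9 (SIGKILL). This only shows that the module process was killed from outside, which is consistent with the empty `module_stdout` and `module_stderr`; it does not show what killed it. The helper below is a hypothetical sketch (the name `describe_rc` is not part of AWX or Ansible) that decodes such a return code:

```python
import signal


def describe_rc(rc: int) -> str:
    """Interpret a return code using the shell convention 128 + signal number.

    Hypothetical helper for illustration only; not part of AWX or Ansible.
    """
    if rc > 128:
        signum = rc - 128
        try:
            # Resolve the platform's name for this signal number, if it has one.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name} (signal {signum})"
    return f"exited with status {rc}"


if __name__ == "__main__":
    print(describe_rc(137))
```

A common cause of SIGKILL inside a container is the kernel's out-of-memory killer, but that is an assumption here and would need to be confirmed against the pod's status and events.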
Following is the complete output response for project sync job:
Log in to the AWX web UI as the admin user and run a Demo Project sync (which by default runs on the Control Plane Execution Environment, as shown in the job details).
Expected results
The Demo Project sync job completes successfully in AWX.
Actual results
The Demo Project sync job fails in AWX with the following results:
Job execution result
PLAY [Update source tree if necessary] *****************************************
TASK [Update project using git] ************************************************
task path: /tmp/awx_207403_ytruvr2q/project/project_update.yml:41
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: 1000
<127.0.0.1> EXEC /bin/sh -c 'echo ~1000 && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /runner/.ansible/tmp `"&& mkdir "` echo /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406 `" && echo ansible-tmp-1716955683.6936572-188-118617953611406="` echo /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406 `" ) && sleep 0'
Using module file /usr/local/lib/python3.9/site-packages/ansible/modules/git.py
<127.0.0.1> PUT /runner/.ansible/tmp/ansible-local-1849x1mepar/tmpyl6_jvm_ TO /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406/AnsiballZ_git.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406/ /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406/AnsiballZ_git.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3 /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406/AnsiballZ_git.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /runner/.ansible/tmp/ansible-tmp-1716955683.6936572-188-118617953611406/ > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {
"changed": false,
"module_stderr": "",
"module_stdout": "",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 137
}
PLAY RECAP *********************************************************************
localhost : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
**Job execution result when Clean and Delete options are enabled in the Project**
fatal: [localhost]: FAILED! => {
"changed": false,
"changed_when_result": "The conditional check 'reg.stdout_lines | length > 1' failed. The error was: error while evaluating conditional (reg.stdout_lines | length > 1): 'dict object' has no attribute 'stdout_lines'. 'dict object' has no attribute 'stdout_lines'",
"module_stderr": "",
"module_stdout": "",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 137
}
Additional information
No response
Operator Logs
AWX task pod logs
2024-05-29 05:50:44,905 DEBUG [f45900d372a74811b3839991111a3484] awx.main.scheduler Finished dependency_manager Scheduler, timing data:
{'get_tasks_seconds': 0.03475628200249048, 'generate_dependencies_seconds': 0, '_schedule_seconds': 0.034768084005918354, '_schedule_calls': 0, 'recorded_timestamp': 0, 'pending_processed': 0}
2024-05-29 05:50:45,809 DEBUG [-] awx.main.wsrelay Web host abc-awx-web-5996c54f9b-bk8kh (10.42.1.9) online heartbeat received.
2024-05-29 05:50:45,812 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch.periodic scheduler found k8s_reaper to run, 0.003035306930541992 seconds after target
2024-05-29 05:50:45,813 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch.periodic Scheduler next run is receptor_reaper in 1.9963040351867676 seconds
2024-05-29 05:50:45,814 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch task 28e72180-d9fb-435c-8398-c15a162980ea starting awx.main.tasks.system.awx_k8s_reaper(*[])
2024-05-29 05:50:45,855 DEBUG [f45900d372a74811b3839991111a3484] awx.main.tasks.system Checking for orphaned k8s pods for default-3.
2024-05-29 05:50:45,914 DEBUG [f45900d372a74811b3839991111a3484] awx.main.tasks.system Checking for orphaned k8s pods for intelligenibots-askml-4.
2024-05-29 05:50:47,820 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch.periodic scheduler found receptor_reaper to run, 0.011101245880126953 seconds after target
2024-05-29 05:50:47,821 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch.periodic Scheduler next run is send_subsystem_metrics in 0.987910270690918 seconds
2024-05-29 05:50:47,823 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch task 5c6ba994-14c1-41fe-b834-f04606c0756c starting awx.main.tasks.system.awx_receptor_workunit_reaper(*[])
2024-05-29 05:50:47,825 DEBUG [f45900d372a74811b3839991111a3484] awx.main.tasks.system Checking for unreleased receptor work units
2024-05-29 05:50:48,815 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch.periodic scheduler found send_subsystem_metrics to run, 0.0057506561279296875 seconds after target
2024-05-29 05:50:48,815 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch.periodic Scheduler next run is pool_cleanup in 5.993627071380615 seconds
2024-05-29 05:50:48,817 DEBUG [f45900d372a74811b3839991111a3484] awx.main.dispatch task 9f1930ad-3501-40fe-8a79-ee80670a6444 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(*[])
2024-05-29 05:50:50,296 INFO [f45900d372a74811b3839991111a3484] awx.main.commands.run_callback_receiver Starting EOF event processing for Job 207512
2024-05-29 05:50:50,303 DEBUG [f45900d372a74811b3839991111a3484] awx.main.tasks.jobs project_update 207512 (running) finished running, producing 41 events.
2024-05-29 05:50:50,306 INFO [f45900d372a74811b3839991111a3484] awx.analytics.job_lifecycle projectupdate-207512 post run {"type": "projectupdate", "task_id": 207512, "state": "post_run", "work_unit_id": "4aYaAoAG", "task_name": "Microbots"}
2024-05-29 05:50:50,533 INFO [f45900d372a74811b3839991111a3484] awx.analytics.job_lifecycle projectupdate-207512 finalize run {"type": "projectupdate", "task_id": 207512, "state": "finalize_run", "work_unit_id": "4aYaAoAG", "task_name": "Microbots"}
2024-05-29 05:50:50,541 WARNING [f45900d372a74811b3839991111a3484] awx.main.dispatch project_update 207512 (failed) encountered an error (rc=None), please see task stdout for details.
2024-05-29 05:50:51,306 INFO [-] awx.analytics.job_lifecycle projectupdate-207512 stats wrapup finished {"type": "projectupdate", "task_id": 207512, "state": "stats_wrapup_finished", "work_unit_id": "4aYaAoAG", "task_name": "Microbots"}
awx-operator pod logs
TASK [Remove ownerReferences reference] ********************************
ok: [localhost] => (item=None) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}
-------------------------------------------------------------------------------
{"level":"info","ts":"2024-05-29T03:06:23Z","logger":"logging_event_handler","msg":"[playbook task start]","name":"abc-awx","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"2090315677743390150","EventData.Name":"installer : Start installation if auto_upgrade is false and deployment is missing"}
--------------------------- Ansible Task StdOut -------------------------------
TASK [installer : Start installation if auto_upgrade is false and deployment is missing] ***
task path: /opt/ansible/roles/installer/tasks/main.yml:31
-------------------------------------------------------------------------------
{"level":"info","ts":"2024-05-29T03:06:23Z","logger":"runner","msg":"Ansible-runner exited successfully","job":"2090315677743390150","name":"abc-awx","namespace":"awx"}
----- Ansible Task Status Event StdOut (awx.ansible.com/v1beta1, Kind=AWX, abc-awx/awx) -----
PLAY RECAP *********************************************************************
localhost : ok=89 changed=0 unreachable=0 failed=0 skipped=83 rescued=0 ignored=1
AWX Operator version
2.15.0
AWX version
24.2.0
Kubernetes platform
kubernetes
Kubernetes/Platform version
k3s version v1.25.5+k3s1 (48e5d2af)
Modifications
no
Steps to reproduce
The installation was done with a kustomization.yaml file and an external (unmanaged) PostgreSQL database, as follows:
awx-deploy.yaml manifest