Open mick1627 opened 2 years ago
We think this bug was fixed in receptor 1.2.3, please make sure this receptor version is present in your awx ee image
We think this bug was fixed in receptor 1.2.3, please make sure this receptor version is present in your awx ee image
I have quay.io/ansible/receptor:devel in my ee repo, and the receptor version is 1.2.0+gce6c620. I will update it to use quay.io/ansible/receptor:latest to have version 1.2.3.
I'm closing this issue and reopen it if I still have this issue with receptor 1.2.3.
I'm reopening this issue, I still have it after the update of AWX and our custom EE :
I'm using AWX 21.1.0 Custom EE with receptor 1.2.3.
receptor --version
1.2.3
Did you run the above command inside of the control plane EE or inside of a pod for a running job?
I am using the same docker image for the control plane EE and the pod where the job is running
Can you share any information about the workload? How many hosts, how many tasks, how much stdout, are you gathering a lot of facts? Anything you can share might be helpful.
It happens randomly in different kind of job template. For the last issue, it happens on a job part of a workflow. The job was launch on 1 host, about 15 tasks and no gather facts.
This could be https://github.com/ansible/ansible-runner/issues/998
This might have been fixed with https://github.com/ansible/ansible-runner/pull/1161
We still see this error quite a lot on our setup.
2023-11-30 12:37:28,292 ERROR [fa7348566d2e4873932a160255a1a0cd] awx.main.tasks.jobs job 161809 (running) Exception occurred while running task
Traceback (most recent call last):
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/jobs.py", line 605, in run
res = receptor_job.run()
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/receptor.py", line 318, in run
res = self._run_internal(receptor_ctl)
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/receptor.py", line 393, in _run_internal
res = processor_future.result()
File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 446, in result
return self.__get_result()
File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/utils/common.py", line 1199, in wrapper_cleanup_new_process
return func(*args, **kwargs)
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/awx/main/tasks/receptor.py", line 461, in processor
return ansible_runner.interface.run(
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/ansible_runner/interface.py", line 213, in run
r.run()
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/ansible_runner/streaming.py", line 361, in run
self.artifacts_callback(data)
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/ansible_runner/streaming.py", line 335, in artifacts_callback
unstream_dir(self._input, length, self.artifact_dir)
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/ansible_runner/utils/streaming.py", line 78, in unstream_dir
with zipfile.ZipFile(tmp.name, "r") as archive:
File "/usr/lib64/python3.9/zipfile.py", line 1266, in __init__
self._RealGetContents()
File "/usr/lib64/python3.9/zipfile.py", line 1333, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
ansible-runner --version 2.3.3
The code from https://github.com/ansible/ansible-runner/pull/1162/files is there, but we still get this error.
same problem came up in our setup: AWX 23.5.1 ansible_runner 2.3.4 k3s: v1.28.4+k3s2
Is there any suggestion to fix this issue, because we get this quite often:
/jobs/174889 2024-02-07T05:10:17.720127Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174888 2024-02-07T05:10:17.631557Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174887 2024-02-07T05:00:27.010690Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174879 2024-02-07T04:40:26.852807Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174872 2024-02-07T04:20:26.397516Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174871 2024-02-07T04:20:26.111226Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174867 2024-02-07T04:00:26.542494Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174864 2024-02-07T03:55:41.692424Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174857 2024-02-07T03:20:26.090661Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174856 2024-02-07T03:20:25.799524Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174854 2024-02-07T03:00:26.420650Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174851 2024-02-07T02:59:55.172001Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174847 2024-02-07T02:40:25.808439Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174845 2024-02-07T02:20:26.445563Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174841 2024-02-07T02:00:27.005234Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174840 2024-02-07T02:00:26.750985Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174839 2024-02-07T02:00:26.401924Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174802 2024-02-06T14:08:10.237412Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174778 2024-02-06T12:57:13.975563Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174745 2024-02-06T11:32:42.871325Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174665 2024-02-06T08:26:04.317739Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174653 2024-02-06T08:04:31.230036Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174619 2024-02-06T08:04:16.185793Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174585 2024-02-06T05:00:26.304256Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174563 2024-02-06T03:00:27.215729Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174559 2024-02-06T02:40:26.593133Z error (`zipfile.BadZipFile: File is not a zip file`)
/jobs/174552 2024-02-06T02:00:34.138571Z error (`zipfile.BadZipFile: File is not a zip file`)
Same issue on latest version -
BadZipFile("File is not a zip file") - on random job runs. We run the same job on 300+ hosts and it may occur once in 300 times...
Any devs got time to check this?
@blundey
usually this relates to issues around log rotation
you can bump up your max log container size for your cluster
you should also enable reconnect
Please confirm the following
Summary
From time to time some jobs failed with the following error :
The job has the ID 818890, here the logs os awx_task pod
In the UI the last modified is 11:00:20 AM (CET) or 9:00:20 UTC
but we can see in the logs of awx_task that the job finished at 09:01:07.
AWX version
19.5.0
Select the relevant components
Installation method
kubernetes
Modifications
yes
Ansible version
2.12
Operating system
No response
Web browser
No response
Steps to reproduce
Launch job template in AWX, from time to time it failed with this zipfile.BadZipFile: File is not a zip file error
Expected results
Job launch correctly
Actual results
job failed with zipfile.BadZipFile error
Additional information
WE use custom awx EE.