ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

Job launch fails with File Exists on git_repo.create_head() #7809

Closed mkayontour closed 4 years ago

mkayontour commented 4 years ago
ISSUE TYPE
SUMMARY

We are getting intermittent errors when starting jobs, mostly when they are launched from workflows.

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1355, in run
    self.pre_run_hook(self.instance, private_data_dir)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1959, in pre_run_hook
    job.project.scm_type, job_revision
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 2340, in make_local_copy
    source_branch = git_repo.create_head(tmp_branch_name, scm_revision)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/git/repo/base.py", line 386, in create_head
    return Head.create(self, path, commit, force, logmsg)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/git/refs/symbolic.py", line 543, in create
    return cls._create(repo, path, cls._resolve_ref_on_create, reference, force, logmsg)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/git/refs/symbolic.py", line 510, in _create
    ref.set_reference(target, logmsg)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/git/refs/symbolic.py", line 326, in set_reference
    assure_directory_exists(fpath, is_file=True)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/git/util.py", line 177, in assure_directory_exists
    os.makedirs(path)
  File "/var/lib/awx/venv/awx/lib64/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/var/lib/awx/projects/_41__basis_git/.git/refs/heads/awx_internal'

We don't make any changes locally, and the same job templates worked just a minute earlier. These job templates allow concurrent jobs; that is the only similarity I can spot besides the standard options.

ENVIRONMENT
STEPS TO REPRODUCE

Can't reproduce it reliably; it occurs randomly.

wenottingham commented 4 years ago

Likely a git-python bug, but some questions:

mkayontour commented 4 years ago

Yes, it's a standard git project, without a requirements.yml in the roles directory. The roles are hardcoded into the roles folder. No submodules and no tags are used.

We are using a few branches, but only for users to test some new features. The branch to run on is the master branch.

mkayontour commented 4 years ago

I've updated the containers to version 13.0.0 and the issue still exists. It happens regularly but unpredictably: sometimes the template fails, sometimes it doesn't.

ryanpetrello commented 4 years ago

It might be worth trying this out after we upgrade git-python; it's possible it's the result of a known issue that got fixed:

https://github.com/ansible/awx/pull/7860

ryanpetrello commented 4 years ago

Actually, I'm almost certain this is a duplicate of https://github.com/ansible/awx/issues/6119; what do you think, @AlanCoding?

AlanCoding commented 4 years ago

A FileExistsError from a makedirs call in assure_directory_exists? Yeah, that's way too specific to be anything else.
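
For readers following the thread, here is a minimal sketch of the kind of race the traceback points to. It assumes the helper checks for the directory and then calls os.makedirs(path) without exist_ok, which matches the last frames above but is not copied from git-python. The function names and the between_check_and_create hook are hypothetical; the hook stands in for a second concurrent job creating the same directory.

```python
import os
import tempfile


def ensure_dir_racy(path, between_check_and_create=None):
    """Check-then-create: the check and the mkdir are not atomic."""
    if not os.path.isdir(path):
        if between_check_and_create is not None:
            # Hypothetical hook: simulates another job winning the race.
            between_check_and_create()
        # Raises FileExistsError if the directory appeared after the check.
        os.makedirs(path)


def ensure_dir_safe(path):
    """Idempotent variant: succeeds whether or not the directory exists."""
    os.makedirs(path, exist_ok=True)


def demo():
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "refs", "heads", "awx_internal")
        try:
            ensure_dir_racy(target, lambda: os.makedirs(target))
            outcome = "no error"
        except FileExistsError:
            outcome = "FileExistsError"
        # The safe variant tolerates the directory already being there.
        ensure_dir_safe(target)
        return outcome, os.path.isdir(target)


if __name__ == "__main__":
    print(demo())
```

With concurrent jobs sharing one project checkout, two workers could plausibly hit the window between the check and the create, which would fit the intermittent failures described above.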

devsadds commented 3 years ago

The error happens if you choose the HTTP protocol (not git) in the AWX project.