ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

sourced script inventory failure, no retry or useful info #5942

Open dancn opened 4 years ago

dancn commented 4 years ago
ISSUE TYPE
COMPONENT NAME
SUMMARY

Updating the inventory fails for a sourced script, without a retry and without a useful message.

ENVIRONMENT
STEPS TO REPRODUCE
EXPECTED RESULTS

The inventory should be updated; if it cannot be, the update should be retried a few times and then display a useful message about the cause of the failure.
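To make the expected behaviour concrete, here is a minimal sketch of a bounded retry that reports every cause when it finally gives up. It is not AWX code: `run_with_retry`, `RetryError`, and their parameters are hypothetical names invented for illustration, and a real implementation would catch narrower exceptions.

```python
import time


class RetryError(Exception):
    """Raised when every attempt failed; keeps the cause of each attempt."""

    def __init__(self, name, causes):
        self.name = name
        self.causes = causes
        details = "; ".join(
            f"attempt {i}: {type(c).__name__}: {c}" for i, c in enumerate(causes, 1)
        )
        super().__init__(f"{name} failed after {len(causes)} attempts ({details})")


def run_with_retry(action, name, attempts=3, delay=0.0, sleep=time.sleep):
    """Call action() up to `attempts` times and return the first successful result."""
    causes = []
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # assumption: real code would catch specific errors
            causes.append(exc)
            if attempt < attempts:
                sleep(delay)  # wait before the next attempt, not after the last one
    raise RetryError(name, causes)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays; the final error message lists each attempt's cause instead of only the last return code.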

ACTUAL RESULTS
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1243, in run
    self.pre_run_hook(self.instance, private_data_dir)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 2532, in pre_run_hook
    sync_task.run(local_project_sync.id)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 683, in _wrapped
    return f(self, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1408, in run
    raise AwxTaskError.TaskError(self.instance, rc)
Exception: project_update 508 (failed) encountered an error (rc=254), please see task stdout for details.
ADDITIONAL INFORMATION

The job template and inventory are custom for a project, but I can create a dummy project to reproduce this bug. Please give me some pointers on best practices for integrating a test case.

AlanCoding commented 4 years ago

Could you get the standard out of the project update referenced in that message? Type a modified version of this URL into your browser address bar:

https://your.awx.host.com/api/v2/project_updates/508/stdout/
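The same request can be scripted instead of typed into a browser. The following is a rough sketch using only the standard library; the host is the placeholder from the URL above, the function names are made up for illustration, and it assumes HTTP basic authentication is enabled on the AWX instance (use whatever authentication your installation actually accepts).

```python
import base64
import urllib.request


def stdout_url(base_url, project_update_id):
    """Build the stdout endpoint URL for a given project update id."""
    return f"{base_url.rstrip('/')}/api/v2/project_updates/{int(project_update_id)}/stdout/"


def fetch_stdout(base_url, project_update_id, username, password, timeout=30):
    """Return the response body as text.

    Raises urllib.error.HTTPError for non-2xx responses such as 404 Not Found.
    Assumption: the server accepts HTTP basic authentication.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        stdout_url(base_url, project_update_id),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Only builds the URL; no network access happens here.
print(stdout_url("https://your.awx.host.com", 508))
```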
dancn commented 4 years ago

Could you get the standard out of the project update referenced in that message? Type a modified version of this URL into your browser address bar:

https://your.awx.host.com/api/v2/project_updates/508/stdout/

At the url you suggest I see:

Not Found
The requested resource could not be found.

I noticed that the jobs (SCM updates?) that are spawned (my guess) by an "inventory update" are not listed in the UI: for every "inventory update" X, job id X+1 is missing from the list, even when logged in as admin.

AlanCoding commented 4 years ago

I noticed that the jobs (SCM updates?) that are spawned (my guess) by an "inventory update" are not listed in the UI: for every "inventory update" X, job id X+1 is missing from the list, even when logged in as admin.

That's right, it's not shown in the JOBS list in the UI. There should be a link from the job API details, and from the job UI page too.

dancn commented 4 years ago

I saw the option "Run Project Updates With Higher Verbosity" in the settings. It increases the verbosity for successful project updates, but in our case the output is still empty.

A copy of the parent https://your.awx.host.com/api/v2/project_updates/508/ follows.

It seems the job times out before any output is produced. Is there a way to debug the job start?

HTTP 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx
X-API-Time: 0.068s

{
    "id": 633,
    "type": "project_update",
    "url": "/api/v2/project_updates/633/",
    "related": {
        "credential": "/api/v2/credentials/12/",
        "unified_job_template": "/api/v2/projects/29/",
        "stdout": "/api/v2/project_updates/633/stdout/",
        "project": "/api/v2/projects/29/",
        "cancel": "/api/v2/project_updates/633/cancel/",
        "scm_inventory_updates": "/api/v2/project_updates/633/scm_inventory_updates/",
        "notifications": "/api/v2/project_updates/633/notifications/",
        "events": "/api/v2/project_updates/633/events/"
    },
    "summary_fields": {
        "project": {
            "id": 29,
            "name": "Project environments for Something_B",
            "description": "Project for environments for Something_B",
            "status": "successful",
            "scm_type": "git"
        },
        "credential": {
            "id": 12,
            "name": "SCM Credential for environments Something_B",
            "description": "",
            "kind": "scm",
            "cloud": false,
            "kubernetes": false,
            "credential_type_id": 2
        },
        "unified_job_template": {
            "id": 29,
            "name": "Project environments for Something_B",
            "description": "Project for environments for Something_B",
            "unified_job_type": "project_update"
        },
        "instance_group": {
            "id": 1,
            "name": "tower",
            "is_containerized": false
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        }
    },
    "created": "2020-02-17T14:40:05.839876Z",
    "modified": "2020-02-17T14:40:05.839910Z",
    "name": "Project environments for Something_B",
    "description": "Project for environments for Something_B",
    "local_path": "_29__project_environments_for_something_b",
    "scm_type": "git",
    "scm_url": "https://REDACTED",
    "scm_branch": "master",
    "scm_refspec": "",
    "scm_clean": false,
    "scm_delete_on_update": false,
    "credential": 12,
    "timeout": 60,
    "scm_revision": "",
    "unified_job_template": 29,
    "launch_type": "sync",
    "status": "failed",
    "failed": true,
    "started": "2020-02-17T14:40:05.839195Z",
    "finished": "2020-02-17T14:41:06.909771Z",
    "elapsed": 61.071,
    "job_args": "[\"ansible-playbook\", \"-vvv\", \"-t\", \"update_git,install_collections\", \"-i\", \"/tmp/awx_633_pfk5vi30/inventory/hosts\", \"-e\", \"@/tmp/awx_633_pfk5vi30/env/extravars\", \"project_update.yml\"]",
    "job_cwd": "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/playbooks",
    "job_env": {
        "LC_ALL": "en_US.UTF-8",
        "LANG": "en_US.UTF-8",
        "HOSTNAME": "awx",
        "PWD": "/home/awx",
        "HOME": "/var/lib/awx",
        "SHLVL": "2",
        "LANGUAGE": "en_US.UTF-8",
        "PATH": "/var/lib/awx/venv/ansible/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "_": "/usr/local/bin/supervisord",
        "SUPERVISOR_ENABLED": "1",
        "SUPERVISOR_SERVER_URL": "unix:///tmp/supervisor.sock",
        "SUPERVISOR_PROCESS_NAME": "dispatcher",
        "SUPERVISOR_GROUP_NAME": "tower-processes",
        "LC_CTYPE": "en_US.UTF-8",
        "DJANGO_SETTINGS_MODULE": "awx.settings.production",
        "DJANGO_LIVE_TEST_SERVER_ADDRESS": "localhost:9013-9199",
        "TZ": "UTC",
        "ANSIBLE_FACT_CACHE_TIMEOUT": "0",
        "ANSIBLE_FORCE_COLOR": "True",
        "ANSIBLE_HOST_KEY_CHECKING": "False",
        "ANSIBLE_INVENTORY_UNPARSED_FAILED": "True",
        "ANSIBLE_PARAMIKO_RECORD_HOST_KEYS": "False",
        "ANSIBLE_VENV_PATH": "/var/lib/awx/venv/ansible",
        "AWX_PRIVATE_DATA_DIR": "/tmp/awx_633_pfk5vi30",
        "VIRTUAL_ENV": "/var/lib/awx/venv/ansible",
        "PYTHONPATH": "/var/lib/awx/venv/ansible/lib/python3.6/site-packages:",
        "ANSIBLE_RETRY_FILES_ENABLED": "False",
        "ANSIBLE_ASK_PASS": "False",
        "ANSIBLE_BECOME_ASK_PASS": "False",
        "DISPLAY": "",
        "TMP": "/tmp",
        "PROJECT_UPDATE_ID": "633",
        "ANSIBLE_CALLBACK_PLUGINS": "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/plugins/callback",
        "ANSIBLE_GALAXY_SERVER_GALAXY_URL": "https://galaxy.ansible.com",
        "ANSIBLE_GALAXY_SERVER_LIST": "galaxy",
        "ANSIBLE_STDOUT_CALLBACK": "awx_display",
        "AWX_ISOLATED_DATA_DIR": "/tmp/awx_633_pfk5vi30/artifacts/633",
        "RUNNER_OMIT_EVENTS": "False",
        "RUNNER_ONLY_FAILED_EVENTS": "False"
    },
    "job_explanation": "Job terminated due to timeout",
    "execution_node": "awx",
    "result_traceback": "",
    "event_processing_finished": true,
    "project": 29,
    "job_type": "run",
    "job_tags": "update_git,install_collections",
    "host_status_counts": {},
    "playbook_counts": {
        "play_count": 0,
        "task_count": 0
    }
}
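A few fields in this record already narrow down the failure: `status`, `job_explanation`, `elapsed` compared with `timeout`, and the empty `playbook_counts`. As a small sketch, the helper below (a hypothetical name, not part of AWX) condenses them into one line; the sample dictionary copies only the relevant values from the response above.

```python
def summarize_project_update(record):
    """Return a one-line diagnosis from a project_update API record (a dict)."""
    parts = [f"project_update {record['id']} status={record['status']}"]
    explanation = record.get("job_explanation")
    if explanation:
        parts.append(f"explanation={explanation!r}")
    timeout = record.get("timeout") or 0
    elapsed = record.get("elapsed") or 0.0
    # A timeout of 0 is treated here as "no limit configured" (an assumption).
    if timeout and elapsed >= timeout:
        parts.append(f"ran {elapsed:.1f}s against a {timeout}s timeout")
    counts = record.get("playbook_counts") or {}
    if counts.get("play_count", 0) == 0:
        parts.append("no plays were recorded")
    return ", ".join(parts)


# Subset of the fields shown in the response above.
record = {
    "id": 633,
    "status": "failed",
    "timeout": 60,
    "elapsed": 61.071,
    "job_explanation": "Job terminated due to timeout",
    "playbook_counts": {"play_count": 0, "task_count": 0},
}
print(summarize_project_update(record))
```

For this record the summary shows that the run lasted just over the configured 60 seconds and that no play ever started, which matches the empty stdout.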
dancn commented 4 years ago

Additional info: the git/http server does not log anything for a failed job like this one. The real job is either never started or fails early.

AlanCoding commented 4 years ago

It seems the job times out before any output is produced. Is there a way to debug the job start?

You mean that the link there /api/v2/project_updates/633/stdout/ shows nothing?

That is concerning. Could you share the details of the SCM credential you're using? I mean, what fields are shown in inputs in /api/v2/credentials/12/? And do you have unicode in any of the values there, or anything like that?

It's probably getting into the ansible-runner context, and could be hanging trying to spawn its subprocess.
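One way to check for a hang outside of AWX is to replay the recorded command by hand under a timeout. The sketch below decodes the JSON-encoded `job_args` string from the API response into an argument list and runs a command with a time limit, returning any partial output if it hangs. The function names are invented for illustration, and replaying the recorded arguments verbatim may not work as-is, because they reference paths under a per-job temporary directory (`/tmp/awx_633_...`) that may no longer exist.

```python
import json
import subprocess
import sys


def parse_job_args(job_args):
    """Decode the JSON-encoded `job_args` string into an argv list."""
    argv = json.loads(job_args)
    if not isinstance(argv, list) or not all(isinstance(a, str) for a in argv):
        raise ValueError("job_args is not a JSON list of strings")
    return argv


def run_with_timeout(argv, timeout, cwd=None):
    """Run argv and return (returncode, output).

    On timeout the child is killed and (None, partial output) is returned.
    """
    try:
        done = subprocess.run(
            argv,
            cwd=cwd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # interleave stderr with stdout
            timeout=timeout,
        )
        return done.returncode, done.stdout.decode(errors="replace")
    except subprocess.TimeoutExpired as exc:
        return None, (exc.output or b"").decode(errors="replace")


# Harmless demonstration with the current Python interpreter.
argv = parse_job_args('["ansible-playbook", "-vvv", "project_update.yml"]')
rc, out = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=30)
print(argv[0], rc, out.strip())
```

A `None` return code with empty output would suggest the process hung before printing anything, which is the situation described here.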

dancn commented 4 years ago

You mean that the link there /api/v2/project_updates/633/stdout/ shows nothing?

Yes.

That is concerning. Could you share the details of the SCM credential you're using? I mean, what fields are shown in inputs in /api/v2/credentials/12/? And do you have unicode in any of the values there, or anything like that?

Well, I am not sure about the encoding, but both the username and the password use only a subset of ASCII; it is token-like auth.

    "credential_type": 2,
    "inputs": {
        "password": "$encrypted$",
        "username": "REDACTED"
    },

It's probably getting into the ansible-runner context, and could be hanging trying to spawn its subprocess.

Feel free to point me to any log or command.

Thanks!

sandeepsisodiya commented 3 years ago

Any update on this? I'm facing the same issue in AWX 15.0.1.

"job_explanation": "Job terminated due to timeout",

Project Update seems to be working fine:

GET /api/v2/project_updates/1/stdout/

HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: text/plain ;utf-8
Vary: Accept
X-API-Node: awx
X-API-Product-Name: AWX
X-API-Product-Version: 15.0.1
X-API-Time: 0.020s

PLAY [Update source tree if necessary] *****

TASK [update project using git] ****
changed: [localhost]

TASK [Set the git repository version] **
ok: [localhost]

TASK [Repository Version] **
ok: [localhost] => {
    "msg": "Repository Version 347e44fea036c94d5f60e544de006453ee5c71ad"
}

PLAY [Install content with ansible-galaxy command if necessary] ****

TASK [detect roles/requirements.(yml/yaml)] ****
ok: [localhost] => (item={'ext': '.yml'})
ok: [localhost] => (item={'ext': '.yaml'})

TASK [fetch galaxy roles from requirements.(yml/yaml)] *****
skipping: [localhost] => (item={'changed': False, 'stat': {'exists': False}, 'invocation': {'module_args': {'path': '/var/lib/awx/projects/_6demo_project/roles/requirements.yml', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': {'ext': '.yml'}, 'ansible_loop_var': 'item'})
skipping: [localhost] => (item={'changed': False, 'stat': {'exists': False}, 'invocation': {'module_args': {'path': '/var/lib/awx/projects/_6demo_project/roles/requirements.yaml', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': {'ext': '.yaml'}, 'ansible_loop_var': 'item'})

TASK [detect collections/requirements.(yml/yaml)] **
ok: [localhost] => (item={'ext': '.yml'})
ok: [localhost] => (item={'ext': '.yaml'})

TASK [fetch galaxy collections from collections/requirements.(yml/yaml)] ***
skipping: [localhost] => (item={'changed': False, 'stat': {'exists': False}, 'invocation': {'module_args': {'path': '/var/lib/awx/projects/_6demo_project/collections/requirements.yml', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': {'ext': '.yml'}, 'ansible_loop_var': 'item'})
skipping: [localhost] => (item={'changed': False, 'stat': {'exists': False}, 'invocation': {'module_args': {'path': '/var/lib/awx/projects/_6demo_project/collections/requirements.yaml', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': {'ext': '.yaml'}, 'ansible_loop_var': 'item'})

PLAY RECAP *****
localhost : ok=5 changed=1 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0

AlanCoding commented 3 years ago

@sandeepsisodiya is that the same as https://github.com/ansible/awx/issues/8977?