ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

Job run fails with traceback if project update fails #6199

Open anxstj opened 4 years ago

anxstj commented 4 years ago
ISSUE TYPE
SUMMARY

If a job fails to update its project (or the project's dependencies), the error details show a raw traceback instead of an error message.


ENVIRONMENT
STEPS TO REPRODUCE
EXPECTED RESULTS

The job should fail with a meaningful error message. (e.g. "The project update has failed. See job id 12345 for details.")

ACTUAL RESULTS

The job fails with a stack trace.

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1275, in run
    self.pre_run_hook(self.instance, private_data_dir)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1858, in pre_run_hook
    sync_task.run(local_project_sync.id)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 698, in _wrapped
    return f(self, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py", line 1440, in run
    raise AwxTaskError.TaskError(self.instance, rc)
Exception: project_update 29219 (failed) encountered an error (rc=2), please see task stdout for details.

ADDITIONAL INFORMATION

api/v2/jobs/29218/:

{
    "id": 29218,
    "type": "job",
    "url": "/api/v2/jobs/29218/",
    "related": {
        "created_by": "/api/v2/users/2/",
        "labels": "/api/v2/jobs/29218/labels/",
        "inventory": "/api/v2/inventories/223/",
        "project": "/api/v2/projects/560/",
        "extra_credentials": "/api/v2/jobs/29218/extra_credentials/",
        "credentials": "/api/v2/jobs/29218/credentials/",
        "unified_job_template": "/api/v2/job_templates/566/",
        "stdout": "/api/v2/jobs/29218/stdout/",
        "job_events": "/api/v2/jobs/29218/job_events/",
        "job_host_summaries": "/api/v2/jobs/29218/job_host_summaries/",
        "activity_stream": "/api/v2/jobs/29218/activity_stream/",
        "notifications": "/api/v2/jobs/29218/notifications/",
        "create_schedule": "/api/v2/jobs/29218/create_schedule/",
        "job_template": "/api/v2/job_templates/566/",
        "cancel": "/api/v2/jobs/29218/cancel/",
        "project_update": "/api/v2/project_updates/29219/",
        "relaunch": "/api/v2/jobs/29218/relaunch/"
    },
    "summary_fields": {
        "inventory": {
            "id": 223,
            "name": "nventory",
            "description": "",
            "has_active_failures": true,
            "total_hosts": 12,
            "hosts_with_active_failures": 2,
            "total_groups": 9,
            "has_inventory_sources": true,
            "total_inventory_sources": 1,
            "inventory_sources_with_failures": 0,
            "organization_id": 32,
            "kind": ""
        },
        "project": {
            "id": 560,
            "name": "website",
            "description": "",
            "status": "successful",
            "scm_type": "git"
        },
        "project_update": {
            "id": 29219,
            "name": "website",
            "description": "",
            "status": "failed",
            "failed": true
        },
        "job_template": {
            "id": 566,
            "name": "Playbook: test.yml",
            "description": ""
        },
        "unified_job_template": {
            "id": 566,
            "name": "Playbook: test.yml",
            "description": "",
            "unified_job_type": "job"
        },
        "instance_group": {
            "id": 1,
            "name": "tower",
            "is_containerized": false
        },
        "created_by": {
            "id": 2,
            "username": "foobar",
            "first_name": "foo",
            "last_name": "bar"
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        },
        "labels": {
            "count": 0,
            "results": []
        },
        "extra_credentials": [],
        "credentials": [
            {
                "id": 86,
                "name": "root",
                "description": "host access after bootstrapping",
                "kind": "ssh",
                "cloud": false
            }
        ]
    },
    "created": "2020-03-06T08:58:10.932797Z",
    "modified": "2020-03-06T08:58:11.215703Z",
    "name": "Playbook: test.yml",
    "description": "",
    "job_type": "run",
    "inventory": 223,
    "project": 560,
    "playbook": "test.yml",
    "scm_branch": "",
    "forks": 0,
    "limit": "",
    "verbosity": 0,
    "extra_vars": "{}",
    "job_tags": "",
    "force_handlers": false,
    "skip_tags": "",
    "start_at_task": "",
    "timeout": 0,
    "use_fact_cache": false,
    "unified_job_template": 566,
    "launch_type": "relaunch",
    "status": "error",
    "failed": true,
    "started": "2020-03-06T08:58:11.324897Z",
    "finished": "2020-03-06T08:58:16.725877Z",
    "canceled_on": null,
    "elapsed": 0.0,
    "job_args": "",
    "job_cwd": "",
    "job_env": {},
    "job_explanation": "Previous Task Failed: {\"job_type\": \"project_update\", \"job_name\": \"website\", \"job_id\": \"29219\"}",
    "execution_node": "awx",
    "controller_node": "",
    "result_traceback": "Traceback (most recent call last):\n  File \"/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py\", line 1275, in run\n    self.pre_run_hook(self.instance, private_data_dir)\n  File \"/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py\", line 1858, in pre_run_hook\n    sync_task.run(local_project_sync.id)\n  File \"/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py\", line 698, in _wrapped\n    return f(self, *args, **kwargs)\n  File \"/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/tasks.py\", line 1440, in run\n    raise AwxTaskError.TaskError(self.instance, rc)\nException: project_update 29219 (failed) encountered an error (rc=2), please see task stdout for details.\n",
    "event_processing_finished": true,
    "job_template": 566,
    "passwords_needed_to_start": [],
    "allow_simultaneous": false,
    "artifacts": {},
    "scm_revision": "",
    "instance_group": 1,
    "diff_mode": false,
    "job_slice_number": 0,
    "job_slice_count": 1,
    "webhook_service": "",
    "webhook_credential": null,
    "webhook_guid": "",
    "host_status_counts": {},
    "playbook_counts": {
        "play_count": 0,
        "task_count": 0
    },
    "custom_virtualenv": null
}
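
The job_explanation field in the response above already names the failed dependency, so a client can build the kind of message asked for under EXPECTED RESULTS without touching result_traceback. The following is a minimal sketch, assuming the "Previous Task Failed: {...}" format seen in this one response; the function name describe_failed_dependency is hypothetical and not part of AWX.

```python
import json

# Prefix observed in the job_explanation of the job shown above.
PREFIX = "Previous Task Failed: "


def describe_failed_dependency(job_explanation):
    """Return a short message naming the failed dependency, or None.

    Assumes the explanation is the prefix followed by a JSON object with
    "job_type" and "job_id" keys, as in the sample response.
    """
    if not job_explanation.startswith(PREFIX):
        return None
    try:
        info = json.loads(job_explanation[len(PREFIX):])
    except json.JSONDecodeError:
        return None
    if not isinstance(info, dict):
        return None
    job_type = str(info.get("job_type", "dependency")).replace("_", " ")
    return "The {} has failed. See job id {} for details.".format(
        job_type, info.get("job_id")
    )


explanation = (
    'Previous Task Failed: {"job_type": "project_update", '
    '"job_name": "website", "job_id": "29219"}'
)
print(describe_failed_dependency(explanation))
```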

api/v2/project_updates/29219/:

{
    "id": 29219,
    "type": "project_update",
    "url": "/api/v2/project_updates/29219/",
    "related": {
        "credential": "/api/v2/credentials/122/",
        "unified_job_template": "/api/v2/projects/560/",
        "stdout": "/api/v2/project_updates/29219/stdout/",
        "project": "/api/v2/projects/560/",
        "cancel": "/api/v2/project_updates/29219/cancel/",
        "scm_inventory_updates": "/api/v2/project_updates/29219/scm_inventory_updates/",
        "notifications": "/api/v2/project_updates/29219/notifications/",
        "events": "/api/v2/project_updates/29219/events/"
    },
    "summary_fields": {
        "project": {
            "id": 560,
            "name": "website",
            "description": "",
            "status": "successful",
            "scm_type": "git"
        },
        "credential": {
            "id": 122,
            "name": "website deploy key (ro)",
            "description": "",
            "kind": "scm",
            "cloud": false,
            "kubernetes": false,
            "credential_type_id": 2
        },
        "unified_job_template": {
            "id": 560,
            "name": "website",
            "description": "",
            "unified_job_type": "project_update"
        },
        "instance_group": {
            "id": 1,
            "name": "tower",
            "is_containerized": false
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        }
    },
    "created": "2020-03-06T08:58:11.456097Z",
    "modified": "2020-03-06T08:58:11.456111Z",
    "name": "website",
    "description": "",
    "local_path": "_560__ansible",
    "scm_type": "git",
    "scm_url": "...website.git",
    "scm_branch": "",
    "scm_refspec": "",
    "scm_clean": true,
    "scm_delete_on_update": true,
    "credential": 122,
    "timeout": 0,
    "scm_revision": "e138ee5b70aece5c25d5a277e5528044fe8b68cc",
    "unified_job_template": 560,
    "launch_type": "sync",
    "status": "failed",
    "failed": true,
    "started": "2020-03-06T08:58:11.455444Z",
    "finished": "2020-03-06T08:58:16.690922Z",
    "canceled_on": null,
    "elapsed": 5.235,
    "job_args": "[\"ansible-playbook\", \"-t\", \"install_roles\", \"-i\", \"/tmp/awx_29219_jlk1b4l9/inventory/hosts\", \"-e\", \"@/tmp/awx_29219_jlk1b4l9/env/extravars\", \"project_update.yml\"]",
    "job_cwd": "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/playbooks",
    "job_env": {
        "LC_ALL": "en_US.UTF-8",
        "LANG": "en_US.UTF-8",
        "HOSTNAME": "awx-task-a8n",
        "PWD": "/home/awx",
        "HOME": "/var/lib/awx",
        "SHLVL": "2",
        "LANGUAGE": "en_US.UTF-8",
        "PATH": "/var/lib/awx/venv/ansible/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "_": "/usr/local/bin/supervisord",
        "SUPERVISOR_ENABLED": "1",
        "SUPERVISOR_SERVER_URL": "unix:///tmp/supervisor.sock",
        "SUPERVISOR_PROCESS_NAME": "dispatcher",
        "SUPERVISOR_GROUP_NAME": "tower-processes",
        "LC_CTYPE": "en_US.UTF-8",
        "DJANGO_SETTINGS_MODULE": "awx.settings.production",
        "DJANGO_LIVE_TEST_SERVER_ADDRESS": "localhost:9013-9199",
        "TZ": "UTC",
        "ANSIBLE_FORCE_COLOR": "True",
        "ANSIBLE_HOST_KEY_CHECKING": "False",
        "ANSIBLE_INVENTORY_UNPARSED_FAILED": "True",
        "ANSIBLE_PARAMIKO_RECORD_HOST_KEYS": "False",
        "ANSIBLE_VENV_PATH": "/var/lib/awx/venv/ansible",
        "AWX_PRIVATE_DATA_DIR": "/tmp/awx_29219_jlk1b4l9",
        "VIRTUAL_ENV": "/var/lib/awx/venv/ansible",
        "PYTHONPATH": "/var/lib/awx/venv/ansible/lib/python3.6/site-packages:",
        "ANSIBLE_RETRY_FILES_ENABLED": "False",
        "ANSIBLE_ASK_PASS": "False",
        "ANSIBLE_BECOME_ASK_PASS": "False",
        "DISPLAY": "",
        "TMP": "/tmp",
        "PROJECT_UPDATE_ID": "29219",
        "ANSIBLE_CALLBACK_PLUGINS": "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/plugins/callback",
        "ANSIBLE_GALAXY_SERVER_GALAXY_URL": "https://galaxy.ansible.com",
        "ANSIBLE_GALAXY_SERVER_LIST": "galaxy",
        "ANSIBLE_STDOUT_CALLBACK": "awx_display",
        "AWX_ISOLATED_DATA_DIR": "/tmp/awx_29219_jlk1b4l9/artifacts/29219",
        "RUNNER_OMIT_EVENTS": "False",
        "RUNNER_ONLY_FAILED_EVENTS": "False"
    },
    "job_explanation": "",
    "execution_node": "awx",
    "result_traceback": "",
    "event_processing_finished": true,
    "project": 560,
    "job_type": "run",
    "job_tags": "install_roles",
    "host_status_counts": {
        "failures": 1
    },
    "playbook_counts": {
        "play_count": 2,
        "task_count": 2
    }
}

api/v2/project_updates/29219/stdout/?format=txt:

PLAY [Update source tree if necessary] *****************************************

PLAY [Install content with ansible-galaxy command if necessary] ****************

TASK [detect requirements.yml] *************************************************
ok: [localhost]

TASK [fetch galaxy roles from requirements.yml] ********************************
fatal: [localhost]: FAILED! => {"changed": false, "cmd": ["ansible-galaxy", "install", "-r", "requirements.yml", "-p", "/tmp/awx_29218_3nyoftwk/requirements_roles"], "delta": "0:00:00.634267", "end": "2020-03-06 08:58:16.252183", "msg": "non-zero return code", "rc": 1, "start": "2020-03-06 08:58:15.617916", "stderr": "[WARNING]: - myrole was NOT installed successfully: -\\ncommand /usr/bin/git clone git@example.com:myrole.git myrole failed in\\ndirectory /var/lib/awx/.ansible/tmp/ansible-local-16920s4fktw85/tmp_q2x71sw\\n(rc=128)\\nERROR! - you can use --ignore-errors to skip failed roles and finish processing the list.", "stderr_lines": ["[WARNING]: - myrole was NOT installed successfully: -", "command /usr/bin/git clone git@example.com:myrole.git myrole failed in", "directory /var/lib/awx/.ansible/tmp/ansible-local-16920s4fktw85/tmp_q2x71sw", "(rc=128)", "ERROR! - you can use --ignore-errors to skip failed roles and finish processing the list."], "stdout": "", "stdout_lines": []}

PLAY RECAP *********************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0 

api/v2/projects/560/:

{
    "id": 560,
    "type": "project",
    "url": "/api/v2/projects/560/",
    [...]
    "name": "website",
    "description": "",
    "local_path": "_560__ansible",
    "scm_type": "git",
    "scm_url": "git@example.com:website.git",
    "scm_branch": "",
    "scm_refspec": "",
    "scm_clean": true,
    "scm_delete_on_update": true,
    "credential": 109,
    "timeout": 0,
    "scm_revision": "cb76d3c3009f1240bf8ca607bcbea7e019ca6fcf",
    "last_job_run": "2020-03-06T09:09:15.029262Z",
    "last_job_failed": false,
    "next_job_run": null,
    "status": "successful",
    "organization": 32,
    "scm_update_on_launch": false,
    "scm_update_cache_timeout": 0,
    "allow_override": true,
    "custom_virtualenv": "",
    "last_update_failed": false,
    "last_updated": "2020-03-06T09:09:15.029262Z"
}
Juludut commented 4 years ago

Hello, same problem here: jobs failed with a traceback and no notification because requirements.yml referenced a repository that was not accessible. (Thanks for the bug report, by the way; it helped us find the issue.)

AlanCoding commented 2 years ago

I am going to submit this as a technical improvement item to be escalated, because our exception handling in tasks could strongly benefit from some more eyes on it.

We have a small set of internal exception types, but we're apparently not using them right. In this case, we should be specifically catching this AwxTaskError. It doesn't make any sense to surface that in a traceback, because the point in code where the exception is raised is merely passing on the information that the dependency failed. What the user should be interested in is:
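
As an illustration of the handling this comment argues for, here is a minimal sketch of catching a dedicated exception type and recording an explanation instead of a traceback. It is not AWX's actual code: the TaskError class below is a simplified stand-in for AwxTaskError.TaskError, and run_with_dependency_handling, its result dict, and the choice of the "failed" status are all assumptions made for the example.

```python
import traceback


class TaskError(Exception):
    """Hypothetical stand-in for a dependency failure such as AwxTaskError.TaskError."""

    def __init__(self, job_type, job_id, rc):
        super().__init__(
            "{} {} (failed) encountered an error (rc={})".format(job_type, job_id, rc)
        )
        self.job_type = job_type
        self.job_id = job_id
        self.rc = rc


def run_with_dependency_handling(pre_run_hook):
    """Call pre_run_hook and report the outcome as a dict.

    A TaskError is an expected failure of a dependency, so only an
    explanation is recorded. Any other exception is unexpected, so the
    full traceback is kept for debugging.
    """
    result = {"status": "successful", "job_explanation": "", "result_traceback": ""}
    try:
        pre_run_hook()
    except TaskError as exc:
        # Assumption: treat a failed dependency as an ordinary failure.
        result["status"] = "failed"
        result["job_explanation"] = (
            "The {} has failed. See job id {} for details.".format(
                exc.job_type.replace("_", " "), exc.job_id
            )
        )
    except Exception:
        result["status"] = "error"
        result["result_traceback"] = traceback.format_exc()
    return result


def failing_sync():
    # Mimics the project sync failure reported in this issue.
    raise TaskError("project_update", 29219, 2)


print(run_with_dependency_handling(failing_sync))
```

The point of the two except branches is that only genuinely unexpected errors populate result_traceback.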

debianchim27 commented 2 years ago

Hi all,

This is still the case, and I found a workaround thanks to this issue.

Go to /api/v2/jobs/YOURJOBNUMBER/ and follow the link under the "project_update" key (in "related").

On the project update page, follow the "stdout" link to find the root cause.
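
The same workaround can be scripted. This is a minimal sketch, assuming the response layout shown earlier in this issue (the link sits under related.project_update in the job, and under related.stdout in the project update). The helper names http_get and project_update_stdout, the base URL, and the bearer-token header are assumptions for illustration; adjust authentication to your installation.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def http_get(base_url, token):
    """Build a fetch(path) function that returns the response body as text.

    Assumption: the server accepts an OAuth2 token as a Bearer header.
    """
    def fetch(path):
        req = Request(
            urljoin(base_url, path),
            headers={"Authorization": "Bearer " + token},
        )
        with urlopen(req) as resp:
            return resp.read().decode("utf-8")
    return fetch


def project_update_stdout(fetch, job_id):
    """Follow job -> related.project_update -> related.stdout.

    Returns the project update's stdout as plain text, or None if the
    job has no related project update.
    """
    job = json.loads(fetch("/api/v2/jobs/{}/".format(job_id)))
    update_url = job.get("related", {}).get("project_update")
    if update_url is None:
        return None
    update = json.loads(fetch(update_url))
    return fetch(update["related"]["stdout"] + "?format=txt")


# Example usage (hypothetical host and token, not run here):
# fetch = http_get("https://awx.example.com", "YOUR_TOKEN")
# print(project_update_stdout(fetch, 29218))
```

Passing fetch in as a parameter keeps the link-following logic separate from the HTTP and authentication details, which vary between installations.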