ansible / ansible

Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy and maintain. Automate everything from code deployment to network configuration to cloud management, in a language that approaches plain English, using SSH, with no agents to install on remote systems. https://docs.ansible.com.
https://www.ansible.com/
GNU General Public License v3.0
62.71k stars 23.87k forks source link

callback: An include_role task failure triggers a v2_runner_on_ok hook instead of v2_runner_on_failed #77336

Closed dmsimard closed 9 months ago

dmsimard commented 2 years ago

Summary

When include_role fails -- for example, when it can't find the role it is meant to include -- it does not seem to be considered a failure. The playbook stats summary shows failed=0 and the task fires v2_runner_on_ok instead of v2_runner_on_failed.

The ansible-playbook command exits with exit code 2, correctly detecting the failure.

Issue Type

Bug Report

Component Name

include_role

Ansible Version

$ ansible --version
ansible [core 2.12.2]
  config file = None
  configured module search path = ['/home/dmsimard/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/dmsimard/dev/virtualenvs/ansible/lib64/python3.10/site-packages/ansible
  ansible collection location = /home/dmsimard/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/dmsimard/dev/virtualenvs/ansible/bin/ansible
  python version = 3.10.2 (main, Jan 17 2022, 00:00:00) [GCC 11.2.1 20211203 (Red Hat 11.2.1-7)]
  jinja version = 3.0.3
  libyaml = True

Configuration

$ ansible-config dump --only-changed
ANSIBLE_NOCOWS(env: ANSIBLE_NOCOWS) = True
DEFAULT_CALLBACK_PLUGIN_PATH(env: ANSIBLE_CALLBACK_PLUGINS) = ['/home/dmsimard/dev/virtualenvs/ansible/lib64/python3.10/site-packages/ara/plugins/callback']

OS / Environment

Fedora 35, though it should not be relevant to the issue.

Steps to Reproduce

I originally bumped into the issue in ara integration tests with a regular playbook (https://github.com/ansible-community/ara/issues/346) but I've included two reproducers in this gist: https://gist.github.com/dmsimard/8c77ea6db405fbf7b11e8f78f8818afb

Expected Results

The include_role failure should be interpreted as a failure (v2_runner_on_failed), NOT as 'ok' via v2_runner_on_ok.

This is my naive view from the perspective of the callback interface, I have not yet investigated why this is happening.

Actual Results

$ ansible-playbook successful.yml || echo $?
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
2022-03-22 20:38:43,314 DEBUG ara.plugins.callback.default: v2_playbook_on_start

PLAY [Reproducer that does not fail] ********************************************************************************************************************************************************************************************************************
2022-03-22 20:38:43,357 DEBUG ara.plugins.callback.default: v2_playbook_on_play_start

TASK [A successful task] ********************************************************************************************************************************************************************************************************************************
2022-03-22 20:38:43,448 DEBUG ara.plugins.callback.default: v2_playbook_on_task_start
2022-03-22 20:38:43,469 DEBUG ara.plugins.callback.default: v2_runner_on_start
ok: [localhost] => {
    "msg": "Hello o/"
}
2022-03-22 20:38:43,480 DEBUG ara.plugins.callback.default: v2_runner_on_ok

TASK [Include a role that does not exist] ***************************************************************************************************************************************************************************************************************
2022-03-22 20:38:43,566 DEBUG ara.plugins.callback.default: v2_playbook_on_task_start
2022-03-22 20:38:43,609 DEBUG ara.plugins.callback.default: v2_runner_on_start
2022-03-22 20:38:43,617 DEBUG ara.plugins.callback.default: v2_runner_on_ok   # <-- should be v2_runner_on_failed
ERROR! the role 'does_not_exist' was not found in /home/dmsimard/dev/sandbox/include_role/roles:/home/dmsimard/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles:/home/dmsimard/dev/sandbox/include_role

The error appears to be in '/home/dmsimard/dev/sandbox/include_role/successful.yml': line 11, column 15, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

      include_role:
        name: does_not_exist
              ^ here

PLAY RECAP **********************************************************************************************************************************************************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
# failed should be 1 above

2022-03-22 20:38:43,693 DEBUG ara.plugins.callback.default: v2_playbook_on_stats
2 # <-- Exit code 2 (non-zero)

Code of Conduct

ansibot commented 2 years ago

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.

click here for bot help

mkrizek commented 2 years ago

This seems to be what @s-hertel noticed when reviewing https://github.com/ansible/ansible/pull/76928. Which is that only _load_included_file sends callbacks for include_tasks, the if branch for include_role does not:

https://github.com/ansible/ansible/blob/14098d9b045f4048e540560d3bee808cbb87e2fa/lib/ansible/plugins/strategy/linear.py#L353-L362

I started playing with it and this is what I had in `git stash`: ```diff diff --git a/lib/ansible/plugins/strategy/__init__.py b/lib/ansible/plugins/strategy/__init__.py index 9c9846a997f..85344e9f288 100644 --- a/lib/ansible/plugins/strategy/__init__.py +++ b/lib/ansible/plugins/strategy/__init__.py @@ -1026,6 +1026,45 @@ class StrategyBase: display.debug("done processing included file") return block_list + def _load_included_role(self, included_file, iterator): + """Wrapper around getting a block list from a role that increments + Play stats and sends callbacks for an include role. + + Raises AnsibleError exception in case of a failure during including a role, + in such case the caller is responsible for marking the host(s) as failed + using PlayIterator.mark_host_failed(). + + FIXME: duplicates some of the code from _load_included_file + """ + try: + new_ir = self._copy_included_file(included_file) + + block_list, dummy = new_ir.get_block_list( + play=iterator._play, + variable_manager=self._variable_manager, + loader=self._loader, + ) + + for host in included_file._hosts: + self._tqm._stats.increment('ok', host.name) + except AnsibleParserError: + raise + except AnsibleError as e: + if isinstance(e, AnsibleFileNotFound): + reason = "Could not find or access '%s' on the Ansible Controller." % to_text(e.file_name) + else: + reason = to_text(e) + + for host in included_file._hosts: + tr = TaskResult(host=host, task=included_file._task, return_data=dict(failed=True, reason=reason)) + self._tqm._stats.increment('failures', host.name) + self._tqm.send_callback('v2_runner_on_failed', tr) + raise AnsibleError(reason) from e + + self._tqm.send_callback('v2_playbook_on_include', included_file) + display.debug("done processing included file") + return block_list + def run_handlers(self, iterator, play_context): ''' Runs handlers on those hosts which have been notified. diff --git a/lib/ansible/plugins/strategy/free.py b/lib/ansible/plugins/strategy/free.py index 475b7efcf4a..0418da86714 100644 --- a/lib/ansible/plugins/strategy/free.py +++ b/lib/ansible/plugins/strategy/free.py @@ -248,18 +248,13 @@ class StrategyModule(StrategyBase): display.debug("collecting new blocks for %s" % included_file) try: if included_file._is_role: - new_ir = self._copy_included_file(included_file) - - new_blocks, handler_blocks = new_ir.get_block_list( - play=iterator._play, - variable_manager=self._variable_manager, - loader=self._loader, - ) + new_blocks = self._load_included_role(included_file, iterator=iterator) else: new_blocks = self._load_included_file(included_file, iterator=iterator) except AnsibleParserError: raise except AnsibleError as e: + display.error(to_text(e), wrap_text=False) for r in included_file._results: r._result['failed'] = True diff --git a/lib/ansible/plugins/strategy/linear.py b/lib/ansible/plugins/strategy/linear.py index d90d347d3e3..cd7426686d4 100644 --- a/lib/ansible/plugins/strategy/linear.py +++ b/lib/ansible/plugins/strategy/linear.py @@ -351,13 +351,7 @@ class StrategyModule(StrategyBase): # list of noop tasks, to make sure that they continue running in lock-step try: if included_file._is_role: - new_ir = self._copy_included_file(included_file) - - new_blocks, handler_blocks = new_ir.get_block_list( - play=iterator._play, - variable_manager=self._variable_manager, - loader=self._loader, - ) + new_blocks = self._load_included_role(included_file, iterator=iterator) else: new_blocks = self._load_included_file(included_file, iterator=iterator) @@ -384,6 +378,7 @@ class StrategyModule(StrategyBase): except AnsibleParserError: raise except AnsibleError as e: + display.error(to_text(e), wrap_text=False) for r in included_file._results: r._result['failed'] = True ```
MaxBidlingmaier commented 10 months ago

Still an issue as off ansible-core 2.16.1. A failed ansible.builtin.include_tasks (due to target file not existing):

So this results in two log entries that should be one. So +1 here