ansible-collections / ansible.windows

Windows core collection for Ansible
https://galaxy.ansible.com/ansible/windows
GNU General Public License v3.0

Not able to run the win_update task on Windows server when Trellix Threat prevention enabled with Ansible version 2.16.6 #635

Closed rajmartha26 closed 2 weeks ago

rajmartha26 commented 1 month ago
SUMMARY

We are experiencing an issue where the win_updates task fails to run on Windows servers when Trellix Threat Prevention is enabled. The issue started after upgrading to Ansible version 2.16.6. The same playbook worked fine with Ansible version 2.9.27.

ISSUE TYPE
COMPONENT NAME

win_updates

ANSIBLE VERSION

Works with the Ansible version below:

ansible 2.9.27
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.18 (default, Dec 18 2023, 22:08:43) [GCC 7.3.1 20180712 (Red Hat 7.3.1-17)]

Does not work with the Ansible version below:

ansible [core 2.16.2]
  config file = None
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /root/.local/lib/python3.11/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /root/.local/bin/ansible
  python version = 3.11.7 (main, Jan 26 2024, 15:26:41) [GCC 8.5.0 20210514 (Red Hat 8.5.0-21)] (/usr/bin/python)
  jinja version = 3.1.2
  libyaml = True
COLLECTION VERSION
Collection      Version
--------------- -------
ansible.windows 2.2.0
CONFIGURATION

N/A

OS / ENVIRONMENT

Windows server OS: Windows Server 2022
Trellix agent and Threat Prevention installed: Yes

STEPS TO REPRODUCE
---
- hosts: all, localhost
  gather_facts: yes
  vars:
    http_port: 80
    https_port: 443
    max_clients: 200
    ansible_python_interpreter: auto_silent
  tasks:    
    - name: Print Info
      debug:
        msg:
          - "The operating system family is {{ ansible_os_family }}"
          - "The operating system is {{ ansible_distribution }}"
          - "The operating system major version is {{ ansible_distribution_major_version }}"
    - name: Install all security, critical, and rollup updates without a scheduled task
      win_updates:
        category_names:
          - SecurityUpdates
          - CriticalUpdates
          - UpdateRollups

Run the above playbook from Ansible version 2.16.2 against a Windows Server 2022 host with the Trellix agent installed and Threat Prevention enabled. The playbook fails with the error below:

PLAY RECAP *********************************************************************
windows 2022 server : ok=2 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

I added the code below as per the Ansible documentation, but I still get the same error:

become: true
become_method: runas
become_user: SYSTEM
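
For reference, a minimal sketch of where these keywords would sit, assuming they were set on the win_updates task itself (the report does not say whether they were applied at task or play level). The task name is only illustrative.

```yaml
# Sketch only: assumes the become keywords were added at task level.
- name: Install all security, critical, and rollup updates as SYSTEM
  win_updates:
    category_names:
      - SecurityUpdates
      - CriticalUpdates
      - UpdateRollups
  become: true            # enable privilege escalation for this task
  become_method: runas    # Windows become method
  become_user: SYSTEM     # run the module as the local SYSTEM account
```
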
EXPECTED RESULTS

The playbook should complete successfully.

ACTUAL RESULTS
The full traceback is:
Traceback (most recent call last):
  File "/somefolder/.local/.local/lib/python3.11/site-packages/ansible_collections/ansible/windows/plugins/action/win_updates.py", line 679, in run
    result = self._run_sync(task_vars, module_options, reboot, reboot_timeout)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/somefolder/.local/.local/lib/python3.11/site-packages/ansible_collections/ansible/windows/plugins/action/win_updates.py", line 752, in _run_sync
    update_result = self._run_updates(task_vars, module_options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/somefolder/.local/.local/lib/python3.11/site-packages/ansible_collections/ansible/windows/plugins/action/win_updates.py", line 848, in _run_updates
    start_result = self._execute_win_updates(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/somefolder/.local/.local/lib/python3.11/site-packages/ansible_collections/ansible/windows/plugins/action/win_updates.py", line 946, in _execute_win_updates
    raise _ReturnResultException(msg, exception=result.get('exception', None), **extra_result)
ansible_collections.ansible.windows.plugins.action.win_updates._ReturnResultException: MODULE FAILURE
See stdout/stderr for the exact error
fatal: [windows 2022 server]: FAILED! => {
    "changed": false,
    "failed_update_count": 0,
    "filtered_updates": {},
    "found_update_count": 0,
    "installed_update_count": 0,
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 0,
    "updates": {}
}

PLAY RECAP *********************************************************************
windows 2022 server : ok=2    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

rajmartha26 commented 1 month ago

I also posted this here:

https://github.com/ansible/ansible/issues/83625

rajmartha26 commented 1 month ago

Anyone? Any update on this issue?

rajmartha26 commented 1 month ago

@jborean93 can you look at this issue when you get a chance?

jborean93 commented 1 month ago

Unfortunately if your AV is blocking our process from running there's not much we can do about it. We rely on being able to start the process in the background to keep the updates installing in case there are network reboots and to get the intermediate output back. You will most likely have to look into Trellix to see how it can be configured to get this to run properly.

rajmartha26 commented 1 month ago

@jborean93 But the same script works with 2.9.27 while the AV is running on the target server, so what changed between 2.9.27 and 2.16.6? It seems that in 2.16.6 the script sends something new to the target Windows server, which causes the AV to block it from running.

s-hertel commented 1 month ago

I see you have a config file for 2.9 but not 2.16, have you tried copying that? This seems environment-specific, but you could debug possible differences between versions by running your reproducer against some ansible-core versions between 2.9 and 2.16 to narrow down a specific ansible-core version. Then you could check the release notes for any related changes, or try debugging potential differences by increasing the verbosity -vvvv and enabling https://docs.ansible.com/ansible/latest/reference_appendices/config.html#default-debug (very noisy) to compare diffs with the version directly before the change.

rajmartha26 commented 1 month ago

@s-hertel We are using the same config file for both 2.9 and 2.16. I tried debugging in every way I could, but I still see the same issue.

jborean93 commented 1 month ago

But same script works with the 2.9.27 while the AV running on the target server, so what is changed from 2.9.27 to 2.16.6

The way the module was executed was changed in some older version of this collection which came out after 2.9. The change was made to fix a few issues like

The change means that Ansible starts a task a bit differently from how it worked in Ansible 2.9 and unfortunately in your case it's doing it in a way that your AV doesn't like. We aren't doing anything problematic, just kicking off a scheduled task and trying to run a child process off that.

Unfortunately I don't really have much advice for you here as we are at the mercy of what your AV is doing. If it kills our task we don't have any control over that.

There are some things you can try:

Running with async still uses the new code but it doesn't run it as a background process which Ansible polls for updates. This might be enough to satisfy your AV but I cannot guarantee that.

If the 2.9 code works for you then you can still take a copy of both the action plugin and module code and rename it to something else. By calling this you'll go back to the old behaviour but keep in mind it will have no more updates from us.
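
As a rough illustration of the async suggestion, the task from the reproducer could be run along these lines. This is a sketch, not a confirmed fix: the `async` and `poll` values are assumptions chosen for illustration, and the task name is made up.

```yaml
# Sketch only: timing values are assumed and should be tuned per environment.
- name: Install security, critical, and rollup updates using async
  ansible.windows.win_updates:
    category_names:
      - SecurityUpdates
      - CriticalUpdates
      - UpdateRollups
  async: 7200   # assumed maximum runtime in seconds before Ansible gives up
  poll: 30      # assumed interval in seconds between status checks
```

Whether this changes what the AV sees on the target is environment-specific, so it is worth comparing the `-vvvv` output of this task against the non-async run.
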

rajmartha26 commented 1 month ago

Hi, I tried with async and still see the same behaviour.

jborean93 commented 1 month ago

Unfortunately there is little else I can do here; we are running code that your antivirus doesn't like, and there are legitimate reasons why we run it this way.

jborean93 commented 2 weeks ago

Closing as per the above.