ansible-collections / cisco.ios

Ansible Network Collection for Cisco IOS
GNU General Public License v3.0

set_fact module breaks for Cisco n/w switches after upgrading ansible-core from 2.14.0 to 2.14.12 #1013

Closed shaaga closed 4 months ago

shaaga commented 4 months ago
SUMMARY

We run AWX v17.1.0. To address a security finding, we recently upgraded ansible-core from 2.14.0 to 2.14.12 in the Ansible venv inside the AWX containers, at /var/lib/awx/venv/ansible. After the upgrade, existing job templates that connect to Cisco switches started failing with the error shown under ACTUAL RESULTS.

We use v2.7.1 of the cisco_nxos Ansible collection.

ISSUE TYPE
COMPONENT NAME

cisco_nxos

ANSIBLE VERSION
ansible [core 2.14.12]
  config file = None
  configured module search path = ['/var/lib/awx/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /var/lib/awx/venv/ansible/lib/python3.9/site-packages/ansible
  ansible collection location = /var/lib/awx/.ansible/collections:/usr/share/ansible/collections
  executable location = /var/lib/awx/venv/ansible/bin/ansible
  python version = 3.9.18 (main, Aug 25 2023, 13:20:14) [GCC 11.4.0] (/var/lib/awx/venv/ansible/bin/python3)
  jinja version = 3.1.3
  libyaml = True
COLLECTION VERSION
cisco_nxos v2.7.1
STEPS TO REPRODUCE

In the AWX container, upgrade ansible-core in the /var/lib/awx/venv/ansible virtual environment from 2.14.0 to 2.14.12, keep cisco_nxos at v2.7.1, and run a job template with a playbook like the one below:

- hosts: "elf"
  gather_facts: no
  connection: network_cli
  vars:
    ansible_user: "{{ network_user }}"
    ansible_password: "{{ network_password }}"
    ansible_become_password: "{{ network_password }}"
    ansible_command_timeout: 600
    ansible_connect_timeout: 600
    operation: "{{ action }}"
    tenant_type: "{{ 'stretched-cluster' if features.get('stretched_cluster') else 'traditional' }}"
  tasks:
    - name: Setting releases as fact
      set_fact:
        releases: "{{ releases }}"
EXPECTED RESULTS


ACTUAL RESULTS



{
  "msg": "Unexpected failure during module execution: maximum recursion depth exceeded while getting the str of an object",
  "exception": "Traceback (most recent call last):\n File \"/var/lib/awx/venv/ansible/lib/python3.9/site-packages/ansible/executor/task_executor.py\", line 158, in run\n res = self._execute()\n File \"/var/lib/awx/venv/ansible/lib/python3.9/site-packages/ansible/executor/task_executor.py\", line 580, in _execute\n socket_path = start_connection(self._play_context, options, self._task._uuid)\n File \"/var/lib/awx/venv/ansible/lib/python3.9/site-packages/ansible/executor/task_executor.py\", line 1203, in start_connection\n write_to_file_descriptor(master, options)\n File \"/var/lib/awx/venv/ansible/lib/python3.9/site-packages/ansible/module_utils/connection.py\", line 58, in write_to_file_descriptor\n src = cPickle.dumps(obj, protocol=0)\n File \"/var/lib/awx/venv/ansible/lib/python3.9/copyreg.py\", line 71, in _reduce_ex\n state = base(self)\nRecursionError: maximum recursion depth exceeded while getting the str of an object\n",
  "stdout": "",
  "_ansible_no_log": false
}
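
The "exception" field packs the whole traceback into one string with \n separators, which is hard to read. Below is a minimal Python sketch for splitting it back into lines. It assumes the task result is available as valid JSON, with the quotes inside the traceback escaped. The helper name traceback_lines and the SAMPLE payload are illustrative only: SAMPLE is an abbreviated copy of the output above that keeps just the last pickling frame.

```python
import json


def traceback_lines(result_json):
    """Parse a task-result JSON string and return its traceback as a list of lines."""
    result = json.loads(result_json)
    # A result without an "exception" key yields an empty list.
    return result.get("exception", "").splitlines()


# Abbreviated sample modelled on the failure above (earlier frames omitted).
SAMPLE = json.dumps({
    "msg": "Unexpected failure during module execution: maximum recursion "
           "depth exceeded while getting the str of an object",
    "exception": (
        "Traceback (most recent call last):\n"
        ' File "/var/lib/awx/venv/ansible/lib/python3.9/site-packages/'
        'ansible/module_utils/connection.py", line 58, in write_to_file_descriptor\n'
        " src = cPickle.dumps(obj, protocol=0)\n"
        "RecursionError: maximum recursion depth exceeded while getting "
        "the str of an object\n"
    ),
})

if __name__ == "__main__":
    for line in traceback_lines(SAMPLE):
        print(line)
```

Read this way, the last frames show the failure is raised while start_connection pickles the connection options in write_to_file_descriptor, before any command reaches the switch.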
shaaga commented 4 months ago

This looks similar to the issue linked below. However, the fix suggested there is to downgrade the execution environment in AWX, and my AWX version, 17.1.0, has no concept of execution environments.

https://github.com/ansible/awx/issues/14689 @gundalow , @Dustin-Wi @ZachHoiberg

Dustin-Wi commented 4 months ago

I believe the underlying cause of my issue was that the execution environment I was using had a version of Ansible installed with this bug, and a new execution environment was created with a working version of ansible.

Do you have to use Ansible version 2.14.12? Can you try a newer version?

shaaga commented 4 months ago

I started seeing the issue after upgrading ansible-core from 2.14.0 to 2.14.12. After also upgrading the ansible pip package from 7.0.0 to 8.5.0, the issue is resolved.
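
Since the fix depended on which ansible and ansible-core packages were installed together, it can help to confirm what a given venv actually contains. The following is a rough sketch, not an official tool: the function name installed_versions is made up for illustration, and it only reports versions for the interpreter that runs it, so it would need to be run with the venv's own Python (for example /var/lib/awx/venv/ansible/bin/python3).

```python
from importlib import metadata


def installed_versions(packages=("ansible", "ansible-core")):
    """Return {package name: installed version, or None if absent}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the two values before and after a pip upgrade shows whether ansible-core was moved on its own or together with the ansible package.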