ansible-collections / ansible.netcommon

Ansible Network Collection for Common Code
GNU General Public License v3.0

Broadcast messages cause cli_command to time out. #589

Open ptoal opened 9 months ago

ptoal commented 9 months ago
SUMMARY

When executing a command that triggers a broadcast message on Cisco IOS devices, the cli_command module will sometimes fail with a timeout. One such command is reload in 5. When executed on a Cisco IOS device, it causes a broadcast message to be issued, e.g.:

labsw1#reload in 5
Reload scheduled in 5 minutes by ptoal on vty0 (192.168.1.121)
Proceed with reload? [confirm]y
labsw1#

***
*** --- SHUTDOWN in 0:05:00 ---
***
ISSUE TYPE
COMPONENT NAME

ansible.netcommon.cli_command

ANSIBLE VERSION
ansible [core 2.14.5]
  config file = /home/user/.ansible.cfg
  configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/user/.local/lib/python3.11/site-packages/ansible
  ansible collection location = /home/ptoal/collections:/usr/share/ansible/collections/ansible_collections:/usr/share/automation-controller/collections/ansible_collections:/home/collections
  executable location = /home/user/.local/bin/ansible
  python version = 3.11.5 (main, Aug 28 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] (/usr/bin/python3)
  jinja version = 3.0.3
  libyaml = True
COLLECTION VERSION
Collection        Version
----------------- -------
ansible.netcommon 4.1.0  
CONFIGURATION
ANSIBLE_NOCOWS(/home/user/.ansible.cfg) = True
ANSIBLE_PIPELINING(/home/user/.ansible.cfg) = True
COLLECTIONS_PATHS(/home/user/.ansible.cfg) = ['/home/user/collections', '/usr/share/ansible/collections/ansible_collections', '/usr/share/automation-controller/collections/ansible_collections', '/home/collections']
CONFIG_FILE() = /home/user/.ansible.cfg
DEFAULT_FORKS(/home/user/.ansible.cfg) = 50
DEFAULT_HOST_LIST(/home/user/.ansible.cfg) = ['/home/user/inventory']
DEFAULT_POLL_INTERVAL(/home/user/.ansible.cfg) = 5
DEFAULT_ROLES_PATH(/home/user/.ansible.cfg) = ['/home/user/roles']
DEFAULT_STDOUT_CALLBACK(/home/user/.ansible.cfg) = community.general.yaml
DEFAULT_VAULT_IDENTITY_LIST(/home/user/.ansible.cfg) = ['devlab@/home/user/.toallab.vault', 'rhdemo@/home/user/.rhdemo.vault']
DEPRECATION_WARNINGS(/home/user/.ansible.cfg) = False
GALAXY_IGNORE_CERTS(/home/user/.ansible.cfg) = True
GALAXY_SERVER_LIST(/home/user/.ansible.cfg) = ['lab_published', 'rh-certified_repo', 'community_repo']
HOST_KEY_CHECKING(/home/user/.ansible.cfg) = False
INVENTORY_ENABLED(/home/user/.ansible.cfg) = ['host_list', 'community.vmware.vmware_vm_inventory', 'netbox', 'yaml', 'ini', 'auto']
PERSISTENT_CONNECT_TIMEOUT(/home/user/.ansible.cfg) = 60
RETRY_FILES_ENABLED(/home/user/.ansible.cfg) = False
TRANSFORM_INVALID_GROUP_CHARS(/home/user/.ansible.cfg) = always
OS / ENVIRONMENT

Cisco C3560E / IOS: Version 15.0(2)SE11

STEPS TO REPRODUCE

The playbook snippet below triggers the timeout every time. Setting ansible_buffer_read_timeout: 2 makes the module wait long enough to receive the broadcast message. Without this setting, the command succeeds or fails somewhat randomly, depending on whether the broadcast message reaches the module in time.

---
- name: Reload IOS Switch
  hosts: ios
  gather_facts: false
  vars:
    ansible_buffer_read_timeout: 2
    ansible_persistent_log_messages: true
    ansible_log_path: /tmp/ansible.log
  tasks:
    - name: Reload IOS Switch in 5 minutes
      ansible.netcommon.cli_command:
        command: reload in 5
        prompt:
          - Save?
          - "[confirm]"
        answer:
          - y
          - y

    - name: Cancel IOS Switch Reload
      ansible.netcommon.cli_command:
        command: reload cancel
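
A side note on the prompt values in this playbook: as far as I know, cli_command treats each prompt entry as a regular expression, so an unescaped "[confirm]" is a character class that matches any single one of the letters c, o, n, f, i, r, m rather than the literal text. That would be consistent with the "matched command prompt: b'r'" line in the log under ACTUAL RESULTS, where the answer is sent after the first echoed character. The minimal Python sketch below only illustrates the regex behaviour; the escaped variant is a hypothetical alternative, and how the module applies the pattern internally is an assumption.

```python
import re

# Prompt exactly as written in the playbook: a character class, not literal text.
unescaped = re.compile(rb"[confirm]")
# Hypothetical escaped variant that matches the literal text "[confirm]".
escaped = re.compile(rb"\[confirm\]")

first_chunk = b"r"  # response-1 in the log: first echoed character of "reload in 5"
real_prompt = b"Proceed with reload? [confirm]"

premature = unescaped.search(first_chunk) is not None        # matches the echoed "r"
escaped_on_echo = escaped.search(first_chunk) is not None    # does not match the echo
escaped_on_prompt = escaped.search(real_prompt) is not None  # matches the real prompt

print(premature, escaped_on_echo, escaped_on_prompt)  # True False True
```

This looks unrelated to the timeout itself, but it explains why an answer is sent before the device has actually asked for confirmation.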
EXPECTED RESULTS

Broadcast messages from the system should be ignored while commands and their output are processed, so that execution is reliable.

ACTUAL RESULTS

When executed, the playbook fails with a timeout. I believe this happens because the broadcast message confuses the module.

When the module fails, the output from the ansible_log looks like this:

send command: b'reload in 5\r'
command: b'reload in 5'
response-1: b'r'
matched command prompt: b'r'
matched command prompt answer: b'y\r'
response-2: b'eload'
response-3: b' i'
response-4: b'n 5\r\n'
response-5: b'Reload scheduled in 5 minutes by ptoal on vty1 (192.168.1.121)\r\nProceed with reload? [confirm]'
response-6: b'y\r\nlabsw1#\r\n'
response-7: b'labsw1#'
matched cli prompt 'b'\nlabsw1#'' with regex 'b'[\\r\\n]?[\\w\\+\\-\\.:\\/\\[\\]]+(?:\\([^\\)]+\\)){0,3}(?:[>#]) ?$'' from response 'b'reload in 5\r\nReload scheduled in 5 minutes by ptoal on vty1 (192.168.1.121)\r\nProceed with reload? [confirm]y\r\nlabsw1#\r\nlabsw1#''
response-8: b'\r\n\r\n\r\n\x07***\r\n*** --- SHUTDOWN in 0:05:00 ---\r\n***\r\n'

A successful command (no timeout) does not have response-8 in the output.
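
The effect of response-8 can be reproduced outside Ansible. The sketch below takes the prompt regex and the byte strings from the log above (with one level of escaping removed) and checks the buffer before and after the broadcast is appended. It assumes the module keeps searching the accumulated buffer with this regex, which is my reading of the log rather than a confirmed description of the implementation.

```python
import re

# Prompt regex copied from the "matched cli prompt" log line.
PROMPT_RE = re.compile(rb"[\r\n]?[\w\+\-\.:\/\[\]]+(?:\([^\)]+\)){0,3}(?:[>#]) ?$")

# Buffer as logged at the moment the prompt matched.
before = (
    b"reload in 5\r\n"
    b"Reload scheduled in 5 minutes by ptoal on vty1 (192.168.1.121)\r\n"
    b"Proceed with reload? [confirm]y\r\nlabsw1#\r\nlabsw1#"
)
# response-8 from the log: the broadcast banner, preceded by BEL (\x07).
broadcast = b"\r\n\r\n\r\n\x07***\r\n*** --- SHUTDOWN in 0:05:00 ---\r\n***\r\n"

match_before = PROMPT_RE.search(before)
match_after = PROMPT_RE.search(before + broadcast)

print(match_before.group(0))  # b'\nlabsw1#'
print(match_after)            # None
```

Because the regex is anchored with `$`, it can only match when the prompt is the last thing in the buffer. Once the banner follows the prompt, nothing matches any more, and no new prompt arrives to end the wait.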

ptoal commented 9 months ago

I noticed that this also impacts the cisco.ios.ios_command module.

KB-perByte commented 9 months ago

Hey @ptoal, that is expected, since the module is unable to identify the required prompt. We can definitely update terminal_stdout_re to accept the broadcast pattern, but we first need to be sure that all broadcast messages from the IOS-XE CLI follow the same pattern; otherwise it would give us false positives. Regards.
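
To make the suggestion concrete, here is a rough Python sketch of an additional stdout pattern that accepts a prompt followed by the one banner shown in this issue. PROMPT_WITH_BROADCAST_RE and prompt_seen are hypothetical names used only for illustration, not part of ansible.netcommon or cisco.ios. The pattern covers only the "SHUTDOWN in H:MM:SS" banner from the log; other broadcast formats would need their own patterns, which is exactly the false-positive concern raised above.

```python
import re

# Prompt regex copied from the log in this issue.
PROMPT_RE = re.compile(rb"[\r\n]?[\w\+\-\.:\/\[\]]+(?:\([^\)]+\)){0,3}(?:[>#]) ?$")

# Hypothetical extra pattern: a prompt, then blank lines, an optional BEL (\x07),
# and the three-line SHUTDOWN banner at the very end of the buffer.
PROMPT_WITH_BROADCAST_RE = re.compile(
    rb"[\w\+\-\.:\/\[\]]+(?:\([^\)]+\)){0,3}[>#] ?[\r\n]+\x07?"
    rb"\*\*\*\r?\n\*\*\* --- SHUTDOWN in \d+:\d{2}:\d{2} ---\r?\n\*\*\*[\r\n]*$"
)

def prompt_seen(buffer, patterns):
    """Return True if any pattern finds a prompt at the end of the buffer."""
    return any(p.search(buffer) for p in patterns)

# Tail of the logged buffer with response-8 appended.
buffer = (
    b"Proceed with reload? [confirm]y\r\nlabsw1#\r\nlabsw1#"
    b"\r\n\r\n\r\n\x07***\r\n*** --- SHUTDOWN in 0:05:00 ---\r\n***\r\n"
)
banner_only = b"\r\n\x07***\r\n*** --- SHUTDOWN in 0:05:00 ---\r\n***\r\n"

plain = prompt_seen(buffer, [PROMPT_RE])
extended = prompt_seen(buffer, [PROMPT_RE, PROMPT_WITH_BROADCAST_RE])
no_prompt = prompt_seen(banner_only, [PROMPT_RE, PROMPT_WITH_BROADCAST_RE])

print(plain, extended, no_prompt)  # False True False
```

Requiring a prompt immediately before the banner keeps the extra pattern from matching a banner that appears in the middle of command output.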