After upgrading to Ansible 2.16.2, existing playbooks that run against multiple (4-12) vCenters in the inventory at the same time intermittently hang on various appliance-related tasks, such as timezone, NTP, or proxy/noproxy.
Reverting to Ansible 2.14 works as expected with no playbook modification. Alternatively, 2.16 works with serial: 1 defined, although that is slower and less efficient.
- name: VMware vCenter Appliance Configuration
  hosts: vCenters
  gather_facts: false
  tasks:
    - name: DNS Servers
      # https://github.com/ansible-collections/vmware.vmware_rest/blob/main/docs/vmware.vmware_rest.appliance_networking_dns_servers_module.rst
      vmware.vmware_rest.appliance_networking_dns_servers:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        servers: "{{ dns_servers }}"
        mode: is_static
      delegate_to: localhost
      register: result
      until: result is not failed
      retries: 2
      delay: 60
      tags:
        - dns
    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost
      register: result
      until: result is not failed
      retries: 2
      delay: 60
      tags:
        - timezone
    - name: NTP configuration
      # https://github.com/ansible-collections/vmware.vmware_rest/blob/main/docs/vmware.vmware_rest.appliance_ntp_module.rst
      vmware.vmware_rest.appliance_ntp:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        servers: "{{ ntp_servers }}"
      delegate_to: localhost
      register: result
      until: result is not failed
      retries: 2
      delay: 60
      tags:
        - ntp
      notify: Restart the ntpd service
    - name: Enable NTP time sync
      # https://github.com/ansible-collections/vmware.vmware_rest/blob/main/docs/vmware.vmware_rest.appliance_timesync_module.rst
      vmware.vmware_rest.appliance_timesync:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        mode: NTP
      delegate_to: localhost
      register: result
      until: result is not failed
      retries: 2
      delay: 60
      tags:
        - ntp
    - name: Network Proxy - HTTP
      # https://github.com/ansible-collections/vmware.vmware_rest/blob/main/docs/vmware.vmware_rest.appliance_networking_proxy_module.rst
      vmware.vmware_rest.appliance_networking_proxy:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        enabled: true
        server: "{{ http_proxy }}"
        port: "{{ http_proxy_port }}"
        protocol: http
      delegate_to: localhost
      when: http_proxy is defined
      register: result
      until: result is not failed
      # retries: 2
      # delay: 60
      tags:
        - proxy
        - http
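For reference, the serial: 1 workaround mentioned in the summary is a play-level keyword. A minimal sketch is shown here with a single task from the playbook above; the variables are the same ones used in the report, and nothing else about the play is assumed to change.

```yaml
- name: VMware vCenter Appliance Configuration
  hosts: vCenters
  gather_facts: false
  # Workaround: run the whole play against one vCenter at a time.
  # Reported to avoid the hang on 2.16, at the cost of a longer total runtime.
  serial: 1
  tasks:
    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost
```

With serial: 1, each vCenter runs through every task before the next vCenter starts, which is why the run is noticeably slower on a 12-host inventory.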
EXPECTED RESULTS
When run against an inventory of 4-12 vCenters, all on vCenter 7.0 U3 or 8.0 U2, all tasks should complete successfully and in a timely manner.
ACTUAL RESULTS
Tasks that run successfully are a bit slower than under Ansible 2.14, sometimes pausing for 5+ seconds before output appears on screen, where earlier versions took less than 1 second.
At some point in the playbook, a task hangs after showing OK status for around 4 vCenters, with no further output for the remaining vCenters. Sometimes it happens on the second task, sometimes on the 4th or 5th.
When run in AWX, it hung for 20 hours before the task self-terminated. In the CLI, I've waited more than 15 minutes with no further output once it hangs. No errors are shown, just no progress.
In this case, it ran successfully on 7 out of 12 entries before hanging on the 6th task.
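While debugging, one possible way to keep a hung task from blocking for hours is Ansible's task-level timeout keyword. This is only a sketch and was not tried in this report: the 120-second value is an arbitrary assumption, and it is not confirmed that the timeout actually interrupts this particular hang.

```yaml
- name: VMware vCenter Appliance Configuration
  hosts: vCenters
  gather_facts: false
  tasks:
    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost
      # Assumed value: fail the task if the module has not returned in 120 seconds,
      # instead of waiting indefinitely with no output.
      timeout: 120
```

If the timeout does fire, the failure at least records which host and task stalled, which may help narrow down where the hang occurs.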
ISSUE TYPE
COMPONENT NAME
appliance_networking_proxy, appliance_networking_noproxy, appliance_system_time_timezone, possibly others
ANSIBLE VERSION
COLLECTION VERSION
CONFIGURATION
OS / ENVIRONMENT
Execution environment based on CentOS Stream 9
STEPS TO REPRODUCE