Open dsgnr opened 3 years ago
I have the same problem.
Ansible: 2.10.16, 2.11.7, 2.12.1
Ansible Host OS: CentOS 7.6 / macOS 11.6.2
Remote Host OS: CentOS 7.6 / Oracle Linux 8.4
Mitogen: 0.3.2
Test playbook:
- hosts: all
  name: test
  become: false
  gather_facts: false
  tasks:
    - name: Wait for system to become reachable over ssh
      ansible.builtin.wait_for_connection:
        connect_timeout: 3
        delay: 0
        sleep: 1
        timeout: 12
Errors:
fatal: [my_remote_host]: FAILED! => {"changed": false, "elapsed": 12, "msg": "timed out waiting for ping module test: An attempt was made to enqueue a message with a Broker that has already exitted. It is likely your program called Broker.shutdown() too early."}
Errors with -vvvvv:
[task 97384] 19:46:04.899147 D ansible_mitogen.planner: <class 'ansible_mitogen.planner.BinaryPlanner'> rejected 'ansible.legacy.ping'
[task 97384] 19:46:04.899681 D ansible_mitogen.planner: <class 'ansible_mitogen.planner.NewStylePlanner'> accepted 'ansible.legacy.ping' (filename '/path_to_ansible/ansible212/lib/python3.9/site-packages/ansible/modules/ping.py')
[task 97384] 19:46:04.900522 D mitogen.io: Router(Broker(2e50)).add_handler(<bound method Receiver._on_receive of Receiver(Router(Broker(2e50)), 110)>, 110, True)
[task 97384] 19:46:04.901273 D mitogen.io: Latch(0x10fcfe790, size=0, t='mitogen.Pool.1d00.0').get(timeout=None, block=True)
[task 97384] 19:46:04.902573 D mitogen.io: Latch(0x10fcfe790, size=0, t='mitogen.Pool.1d00.1').get(timeout=None, block=True)
[task 97384] 19:46:04.902942 D mitogen.service: Pool(1d00, size=2, th='MainThread'): initialized
[task 97384] 19:46:04.903925 D mitogen.io: Latch(0x10fcfe790, size=0, t='mitogen.Pool.1d00.0')._get_sleep(timeout=None, block=True, fd=44/45)
[task 97384] 19:46:04.908435 D mitogen.io: PollPoller.poll(None)
[task 97384] 19:46:04.909139 D mitogen.io: Latch(0x10fcfe790, size=0, t='mitogen.Pool.1d00.1')._get_sleep(timeout=None, block=True, fd=47/48)
[task 97384] 19:46:04.909829 D mitogen.io: PollPoller.poll(None)
[task 97384] 19:46:05.025414 D mitogen.service: caching small file /path_to_ansible/ansible212/lib/python3.9/site-packages/ansible/modules/ping.py
wait_for_connection: attempting ping module test
[task 97384] 19:46:06.026735 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
wait_for_connection: attempting ping module test
[task 97384] 19:46:07.028123 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
wait_for_connection: attempting ping module test
[task 97384] 19:46:08.029647 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
wait_for_connection: attempting ping module test
[task 97384] 19:46:09.031235 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
wait_for_connection: attempting ping module test
[task 97384] 19:46:10.032965 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
wait_for_connection: attempting ping module test
[task 97384] 19:46:11.035038 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
wait_for_connection: attempting ping module test
[task 97384] 19:46:12.037328 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
[task 97384] 19:46:13.039553 D ansible_mitogen.mixins: _remove_tmp_path(None)
[task 97384] 19:46:13.040407 D ansible_mitogen.mixins: _remove_tmp_path(None)
[task 97384] 19:46:13.041674 D mitogen.parent: starting no-reply function call to 'ssh.xx.xx.xx.xx': mitogen.core.Dispatcher.forget_chain('my_ansible_host_id')
fatal: [my_remote_host]: FAILED! => {
    "changed": false,
    "elapsed": 12,
    "msg": "timed out waiting for ping module test: An attempt was made to enqueue a message with a Broker that has already exitted. It is likely your program called Broker.shutdown() too early."
}
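For readers following the log: the once-per-second "attempting ping module test" lines and the final "elapsed": 12 are consistent with a retry loop that keeps the last error and reports it when the deadline passes. Below is a minimal sketch of such a loop, not Ansible's actual code. The names `retry_until_timeout`, `TimedOut`, `FakeClock` and `always_fails` are hypothetical, and the clock is faked so the example runs instantly.

```python
import time


class TimedOut(Exception):
    """Raised when no attempt succeeds before the deadline (hypothetical name)."""


def retry_until_timeout(attempt, timeout, sleep, what_desc,
                        clock=time.monotonic, pause=time.sleep):
    """Call attempt() until it succeeds or `timeout` seconds have passed.

    Illustrative only: the last exception is remembered and appended to the
    timeout message, which matches the shape of the error in this thread.
    """
    deadline = clock() + timeout
    last_error = None
    while clock() < deadline:
        try:
            return attempt()
        except Exception as exc:
            last_error = exc      # keep the most recent failure for the report
            pause(sleep)          # wait before the next attempt
    raise TimedOut("timed out waiting for %s: %s" % (what_desc, last_error))


class FakeClock:
    """Deterministic clock so the sketch does not really sleep."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def always_fails():
    # Stand-in for a ping test that fails on every attempt.
    raise RuntimeError("Broker has already exited")


fake = FakeClock()
try:
    # sleep and timeout mirror the values in the test playbook.
    retry_until_timeout(always_fails, timeout=12, sleep=1,
                        what_desc="ping module test",
                        clock=fake.time, pause=fake.sleep)
    message = None
except TimedOut as exc:
    message = str(exc)

print(message)
```

If this reading is right, the Broker text in "msg" is simply the last exception raised by the ping attempt, so every attempt was probably failing the same way rather than the host being unreachable.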
Oh no, I just realized that the problem occurs only with the FQCN ansible.builtin.wait_for_connection. Mitogen works well with the 'old' wait_for_connection.
FWIW, this simple change to mixins.py resolves this:
--- mitogen-0.3.7.a/ansible_mitogen/mixins.py 2024-04-09 08:49:41.000000000 +1000
+++ mitogen-0.3.7/ansible_mitogen/mixins.py 2024-07-19 10:36:14.696669850 +1000
@@ -379,7 +379,10 @@
         # wait_for_connection, the `ping` test from Ansible won't pass because we lost connection
         # clearing out context forces a reconnect
         # see https://github.com/dw/mitogen/issues/655 and Ansible's `wait_for_connection` module for more info
-        if module_name == 'ansible.legacy.ping' and type(self).__name__ == 'wait_for_connection':
+        if module_name == 'ansible.legacy.ping' and type(self).__name__ in [
+            'wait_for_connection',
+            'ansible.legacy.wait_for_connection',
+            'ansible.builtin.wait_for_connection']:
             self._connection.context = None
 
         self._connection._connect()
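To illustrate why the original equality check would miss the FQCN form, here is a small sketch. It assumes, as the patch implies, that the action class ends up named after the action exactly as it is written in the playbook. `ActionModule`, `wrap`, `is_wfc_strict` and `is_wfc_by_suffix` are hypothetical stand-ins, not Mitogen or Ansible APIs.

```python
class ActionModule:
    """Stand-in for an action plugin base class (hypothetical)."""


def wrap(name):
    # Assumption: a wrapper class is created dynamically and named after the
    # action as written in the playbook, short name or FQCN.
    return type(str(name), (ActionModule,), {})


def is_wfc_strict(obj):
    # Same shape as the original check: exact match on the short name only.
    return type(obj).__name__ == 'wait_for_connection'


def is_wfc_by_suffix(obj):
    # Alternative to listing every namespace: compare only the last dotted
    # component of the class name.
    return type(obj).__name__.rpartition('.')[2] == 'wait_for_connection'


short = wrap('wait_for_connection')()
fqcn = wrap('ansible.builtin.wait_for_connection')()

print(is_wfc_strict(short), is_wfc_strict(fqcn))
print(is_wfc_by_suffix(short), is_wfc_by_suffix(fqcn))
```

The suffix comparison is only one possible design. The explicit list in the patch is stricter, since it would not match a same-named action from an unrelated collection.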
It appears wait_for_connection does not work on RC 3.0. I have also tested this with the master branch, and the same issue occurs. It responds with the following after a short time (around 5 seconds):