ansible / ansible

Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy and maintain. Automate everything from code deployment to network configuration to cloud management, in a language that approaches plain English, using SSH, with no agents to install on remote systems. https://docs.ansible.com.
https://www.ansible.com/
GNU General Public License v3.0

SSH mux does not distinguish different inventory files #84148

Open yuxiaolejs opened 1 month ago

yuxiaolejs commented 1 month ago

Summary

I have to run an Ansible playbook on two sets of machines; each of them has an inventory.ini file that identifies them using their SSH config (the rest is the same).

When I actually ran it, I found that the playbook was effectively run twice against the same set of machines (the ones from whichever inventory file was used first). After digging with Wireshark, I saw that it only established a connection to the bastion of the first set of machines, and I could see only one ssh mux process in the background.

Issue Type

Bug Report

Component Name

ssh

Ansible Version

```console
$ ansible --version
ansible [core 2.16.3]
  config file = None
  configured module search path = ['/home/unics/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  ansible collection location = /home/unics/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True
```

Configuration

```console
# if using a version older than ansible-core 2.12 you should omit the '-t all'
$ ansible-config dump --only-changed -t all
CONFIG_FILE() = None
PAGER(env: PAGER) = less
```

OS / Environment

Ubuntu 24.04

Steps to Reproduce

Playbook:

```yaml
- name: Install openvpn
  hosts: bastion
  gather_facts: true
  become: true
  tasks:
    - name: debug host
      debug:
        var: ansible_ssh_common_args
    - name: Check if openvpn@bastion service exists
      systemd:
        name: openvpn@bastion
        state: started
      register: service_status
      ignore_errors: yes

    - debug:
        var: service_status
```

Inventory files (the two are identical except for the ssh_config path):

```ini
[all:vars]
ansible_ssh_common_args="-F ./vpc-unics-office/ssh_config -o ControlMaster=no"
global_comm_password="xxxx"
global_comm_ip="34.221.xx.xx"
vpc_cidr="10.1.0.0/16"
bastion_ip="10.1.100.10"

[static]
router ansible_host=router ansible_user=ubuntu
bastion ansible_host=bastion ansible_user=ubuntu
logger ansible_host=logger ansible_user=ubuntu
```

SSH config (the two are identical except for the public IP and identity file):

```
Host *
   StrictHostKeyChecking no
   UserKnownHostsFile=/dev/null
   User ubuntu

Host bastion 52.27.xx.xx
    HostName 52.27.xx.xx
    IdentityFile ./vpc-unics-office/id_rsa
```

Shell script for running it

```console
ANSIBLE_FACT_CACHING=none ansible-playbook --inventory vpc-unics-office/inventory.ini inf/ansible/vpn_openvpn.yml --extra-vars vpc=unics-office --ssh-common-args='-o ControlMaster=no'
ANSIBLE_FACT_CACHING=none ansible-playbook --inventory vpc-unics-cloud/inventory.ini inf/ansible/vpn_openvpn.yml --extra-vars vpc=unics-cloud --flush-cache --ssh-common-args='-o ControlMaster=no'
```
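One way to confirm a leftover master between the two runs is to look for mux sockets on disk. This is a small helper sketch, not part of the reproduction; it assumes the default control-path directory `~/.ansible/cp`:

```python
import os
import stat

def list_mux_sockets(cp_dir=os.path.expanduser("~/.ansible/cp")):
    """Return the names of UNIX sockets (live or stale mux masters) in cp_dir."""
    if not os.path.isdir(cp_dir):
        return []
    return [name for name in os.listdir(cp_dir)
            if stat.S_ISSOCK(os.stat(os.path.join(cp_dir, name)).st_mode)]

# Run this after the first playbook finishes: if a socket is still listed,
# the second playbook may silently reuse it.
print(list_mux_sockets())
```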

I have OpenVPN up and running on one of the hosts, but not even installed on the other. This shell script always gives the same result for both: either both running or both nonexistent, depending on which inventory is used first.

Expected Results

It should connect to the correct machine and give the true result: for the machine that has OpenVPN running, it should report OK, and for the machine without OpenVPN installed, it should report the error details but not fail (since I set it to ignore errors). Since I have configured OpenVPN on only one machine, the results should not be the same.

Actual Results

They are the same: either both are shown as running or both are shown as nonexistent, which is not the truth. Out of concern for private information in the -vvvv output, I'd rather not post it here.

Code of Conduct

ansibot commented 1 month ago

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the component bot command.

yuxiaolejs commented 1 month ago

Some insights

After some investigation, I noticed that my particular setup triggers this problem because of two Ansible behaviors: SSH connections are multiplexed and kept alive between runs (ControlPersist), and the mux control path is derived from the host alias, port, and user, not from the inventory or SSH config file in use.

Since both inventories use the same alias (bastion), port, and user, the second command of that shell script reuses the SSH connection of the first one (as designed), because they resolve to the same control path.
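A rough sketch of why the two runs collide, based on the `%C` token semantics documented in ssh_config(5) rather than on Ansible's actual source: `%C` hashes the local hostname, the remote host exactly as typed (here the alias), the port, and the remote user. The `-F` config path is not part of the hash:

```python
import hashlib

def control_token(local_host, remote_host, port, user):
    # OpenSSH's %C token: a hash over "%l%h%p%r" (local hostname, remote
    # host as given on the command line, port, remote user). Note that the
    # -F ssh_config path never enters the hash.
    return hashlib.sha1(f"{local_host}{remote_host}{port}{user}".encode()).hexdigest()

office = control_token("workstation", "bastion", "22", "ubuntu")  # run 1
cloud = control_token("workstation", "bastion", "22", "ubuntu")   # run 2, different -F file
assert office == cloud  # identical token -> the same mux socket is reused
```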

The solution

For my specific case, the solution ended up being very straightforward: I just set the control path directory manually to make sure the two runs used different ones. The modified script looks like this:

```console
ANSIBLE_SSH_CONTROL_PATH_DIR="ansiblecp1" ansible-playbook --inventory vpc-unics-office/inventory.ini inf/ansible/vpn_openvpn.yml --extra-vars vpc=unics-office
ANSIBLE_SSH_CONTROL_PATH_DIR="ansiblecp2" ansible-playbook --inventory vpc-unics-cloud/inventory.ini inf/ansible/vpn_openvpn.yml --extra-vars vpc=unics-cloud
```

Before this edit, I was setting the control path itself instead of the control path directory, which led to a crash when a playbook was meant to run on multiple hosts.

Suggestions

Could we find a better way of calculating the control path? (For example, should we include ssh_common_args in the hash, or resolve the real IP of the server instead of using only the alias?)
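The first suggestion could look something like this hypothetical variant of the hash, which folds the ssh_common_args string into the key. The function name and scheme are illustrative only, not Ansible's actual implementation:

```python
import hashlib

def control_token(host, port, user, ssh_common_args=""):
    # Illustrative only: mixing ssh_common_args (which carries the -F
    # config path) into the hash makes the two runs get distinct sockets.
    key = f"{host}{port}{user}{ssh_common_args}"
    return hashlib.sha1(key.encode()).hexdigest()

office = control_token("bastion", "22", "ubuntu", "-F ./vpc-unics-office/ssh_config")
cloud = control_token("bastion", "22", "ubuntu", "-F ./vpc-unics-cloud/ssh_config")
assert office != cloud  # different SSH configs now map to different mux sockets
```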

sivel commented 1 month ago

Related: https://github.com/ansible/ansible/issues/76956