OpenNebula / one-deploy

Leader detection may try incorrect SSH user in HA mode #56

Closed sk4zuzu closed 3 months ago

sk4zuzu commented 3 months ago

Description

The following inventory:

---
all:
  vars:
    ansible_user: ubuntu
    ensure_keys_for: [root]
    ensure_hosts: true
    one_pass: asd
    one_version: '6.8'
    ds: { mode: ssh }
    one_vip: 10.2.50.86
    one_vip_cidr: 24
    one_vip_if: eth0

infra:
  vars:
    os_image_url: https://d24fmfybwxpuhu.cloudfront.net/ubuntu2204-6.10.0-1-20240514.qcow2
    os_image_size: 20G
    infra_bridge: br0
  hosts:
    n1a1: { ansible_host: 10.2.50.10 }

frontend:
  vars:
    context:
      ETH0_DNS: 10.2.50.1
      ETH0_GATEWAY: 10.2.50.1
      ETH0_MASK: 255.255.255.0
      ETH0_NETWORK: 10.2.50.0
      ETH0_IP: "{{ ansible_host }}"
      PASSWORD: opennebula
      SSH_PUBLIC_KEY:  |
        ssh-rsa ...
  hosts:
    f1: { ansible_user: root, ansible_host: 10.2.50.100, infra_hostname: n1a1 }
    f2: { ansible_user: root, ansible_host: 10.2.50.101, infra_hostname: n1a1 }

node:
  hosts:
    n1a1: { ansible_host: 10.2.50.10 }

causes this error:

TASK [opennebula.deploy.opennebula/leader : Get Zone] *********************************************************************************************
task path: /stor/asd/_git/one-deploy/ansible_collections/opennebula/deploy/roles/opennebula/leader/tasks/main.yml:27
Friday 14 June 2024  09:35:42 +0200 (0:00:00.315)       0:00:17.374 ***********
Using module file /home/asd/.cache/pypoetry/virtualenvs/one-deploy-zyWWq5iB-py3.11/lib/python3.11/site-packages/ansible/modules/command.py
Pipelining is enabled.
<10.2.50.86> ESTABLISH SSH CONNECTION FOR USER: ubuntu
<10.2.50.86> SSH: EXEC ssh -q -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=30 -o 'ControlPath="/home/asd/.ansible/cp/cfeaabda9e"' 10.2.50.86 '/bin/sh -c '"'"'sudo -H -S -n  -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-olhlneagxydwxjhonwzahjtejgjcdsec ; /usr/bin/python3'"'"'"'"'"'"'"'"' && sleep 0'"'"''
Using module file /home/asd/.cache/pypoetry/virtualenvs/one-deploy-zyWWq5iB-py3.11/lib/python3.11/site-packages/ansible/modules/command.py
Pipelining is enabled.
<10.2.50.86> ESTABLISH SSH CONNECTION FOR USER: ubuntu
<10.2.50.86> SSH: EXEC ssh -q -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=30 -o 'ControlPath="/home/asd/.ansible/cp/cfeaabda9e"' 10.2.50.86 '/bin/sh -c '"'"'sudo -H -S -n  -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-zpqlsguihcfsiaaybptojlmjtexsdwhq ; /usr/bin/python3'"'"'"'"'"'"'"'"' && sleep 0'"'"''
fatal: [f1 -> 10.2.50.86]: UNREACHABLE! => changed=false
  msg: 'Data could not be sent to remote host "10.2.50.86". Make sure this host can be reached over ssh: '
  unreachable: true
fatal: [f2 -> 10.2.50.86]: UNREACHABLE! => changed=false
  msg: 'Data could not be sent to remote host "10.2.50.86". Make sure this host can be reached over ssh: '
  unreachable: true

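The problem is that when one_vip is used to contact the Leader, delegate_to does not know which ansible_user to use for the SSH connection, so it takes whatever it finds in hostvars. Here that is the global ansible_user (ubuntu) rather than the root user set on the Front-ends, as the "ESTABLISH SSH CONNECTION FOR USER: ubuntu" lines in the log show.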
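
One possible workaround, sketched here, is to pin the connection user on the delegated task itself, since task-level vars take precedence over inventory variables. This is an illustration only and not necessarily the fix adopted in one-deploy: the command is a placeholder for whatever the real Get Zone task runs, zone_result is a hypothetical name, and the hard-coded root user is an assumption taken from the Front-end entries in the inventory above.

```yaml
# Sketch only: force the SSH user for a task delegated to the VIP.
- name: Get Zone (sketch)
  ansible.builtin.command:
    cmd: onezone show 0 --xml      # placeholder command, assumed for illustration
  delegate_to: "{{ one_vip }}"     # the VIP is not an inventory host
  become: true
  vars:
    ansible_user: root             # assumption: matches ansible_user of f1/f2
  changed_when: false              # read-only query, never reports a change
  register: zone_result            # hypothetical variable name
```

Hard-coding the user keeps the sketch short. A real fix would more likely derive the value from the Front-end host that runs the task, so that inventories using a different user keep working.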

To Reproduce

  1. Define "global" ansible_user in the inventory.
  2. Override ansible_user (different value) for all Front-ends (HA).
  3. Run ansible-playbook.

Expected behavior

No error, delegate_to uses the correct SSH user.
