ansible-middleware / amq

A collection to manage AMQ brokers
Apache License 2.0

Offline installation fails when a remote user is defined #123

Closed rmarting closed 1 month ago

rmarting commented 1 month ago
SUMMARY

The offline installation fails when the playbook checks the local download archive path, with an error related to an incorrect password. The local user running the playbook is different from the remote user used to connect to the remote servers.

Full stack trace:

on 🎩 ❯ ansible-playbook -i inventories/my-local-vm playbooks/amq_broker_ha.yml
BECOME password: 

PLAY [Ansible Playbook to install a high-availability shared storage AMQ Broker (live and backup)] ****************

TASK [Gathering Facts] ********************************************************************************************
ok: [f38mw02]
ok: [f38mw01]

TASK [middleware_automation.amq.activemq : Validating arguments against arg spec 'main'] **************************
ok: [f38mw01]
ok: [f38mw02]

TASK [middleware_automation.amq.activemq : Check prerequisites] ***************************************************
included: /home/rmarting/.ansible/collections/ansible_collections/middleware_automation/amq/roles/activemq/tasks/prereqs.yml for f38mw01, f38mw02

TASK [middleware_automation.amq.activemq : Clear internal templating variables] ***********************************
ok: [f38mw01]
ok: [f38mw02]

TASK [middleware_automation.amq.activemq : Validate credentials] **************************************************
ok: [f38mw01]
ok: [f38mw02]

TASK [middleware_automation.amq.activemq : Validate TLS config] ***************************************************
skipping: [f38mw01]
skipping: [f38mw02]

TASK [middleware_automation.amq.activemq : Validate TLS mutual auth config] ***************************************
skipping: [f38mw01]
skipping: [f38mw02]

TASK [middleware_automation.amq.activemq : Check local download archive path] *************************************
fatal: [f38mw01 -> localhost]: FAILED! => {"changed": false, "module_stderr": "\nSorry, try again.\n[sudo via ansible, key=wmpdbqmdxqqzqbpajbwutgxmveljwwli] password:\nsudo: timed out reading password\nsudo: 1 incorrect password attempt\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

NO MORE HOSTS LEFT ************************************************************************************************

PLAY RECAP ********************************************************************************************************
f38mw01                    : ok=5    changed=0    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0   
f38mw02                    : ok=5    changed=0    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0   
ISSUE TYPE
ANSIBLE VERSION
ansible [core 2.16.6]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/rmarting/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.12/site-packages/ansible
  ansible collection location = /home/rmarting/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.12.3 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)] (/usr/bin/python3)
  jinja version = 3.1.4
  libyaml = True
COLLECTION VERSION
on 🎩 ❯ ansible-galaxy collection list

# /home/rmarting/.ansible/collections/ansible_collections
Collection                               Version
---------------------------------------- -------
ansible.posix                            1.5.4  
community.hashi_vault                    6.0.0  
fedora.linux_system_roles                1.78.2 
kubernetes.core                          3.0.0  
middleware_automation.amq                2.0.1  
middleware_automation.amq_streams        0.0.6  
middleware_automation.common             1.2.1  
redhat.amq_streams                       1.0.0  
redhat.runtimes_common                   1.1.3  
STEPS TO REPRODUCE

The playbook is executed from a laptop to install the AMQ Broker binaries on two virtual machines. Before running the playbook, the binaries were copied into the /tmp folder of the local machine and also into the /tmp folder of each VM. In all cases the issue was the same:

The user that connects to the remote VMs is different from the user logged in to the local machine where the playbook is executed.

---
- name: "Ansible Playbook to install a high-availability shared storage AMQ Broker (live and backup)"
  hosts: all
  remote_user: rhmw
  gather_facts: yes
  vars:
    # Offline Installation
    activemq_offline_install: true
    activemq_local_archive_repository: "/tmp"
  collections:
    - middleware_automation.amq
  roles:
    - activemq
EXPECTED RESULTS

Offline installation uses the remote_user to connect to the remote servers and copies the local binary from the bastion without any issue.

ACTUAL RESULTS

Installation fails with an error in the "Check local download archive path" task:

TASK [middleware_automation.amq.activemq : Check local download archive path] *************************************
fatal: [f38mw01 -> localhost]: FAILED! => {"changed": false, "module_stderr": "\nSorry, try again.\n[sudo via ansible, key=wmpdbqmdxqqzqbpajbwutgxmveljwwli] password:\nsudo: timed out reading password\nsudo: 1 incorrect password attempt\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
rmarting commented 1 month ago

The local Ansible configuration (ansible.cfg file) is:

[defaults]
host_key_checking = False
retry_files_enabled = False
nocows = 1

[inventory]
# fail more helpfully when the inventory file does not parse (Ansible 2.4+)
unparsed_is_failed = true

[galaxy]
server_list = automation_hub,galaxy

[galaxy_server.galaxy]
url = https://galaxy.ansible.com/

[galaxy_server.automation_hub]
url = https://cloud.redhat.com/api/automation-hub/
auth_url = https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token
token = ***************************************************

[privilege_escalation]
become = true
become_method = sudo
become_ask_pass = true
guidograzioli commented 1 month ago

Your problem is on the controller (localhost), not the target host (i.e. it's trying to escalate to read from /tmp/). One workaround is to run on the controller as a passwordless sudoer. It's also weird that it times out waiting for the sudo password; did you get the prompt at all?
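
To illustrate the point about escalation on the controller, here is a minimal sketch of a delegated check that opts out of privilege escalation. It is not the collection's actual task: the module choice, the task body, and the `local_path` register name are assumptions for illustration. Only the task name and the `activemq_local_archive_repository` variable come from this issue. A task-level `become: false` overrides the `become = true` default set in ansible.cfg, so no sudo password is needed on localhost for this step.

```yaml
# Hypothetical sketch, not the role's real implementation.
- name: Check local download archive path
  ansible.builtin.stat:
    path: "{{ activemq_local_archive_repository }}"  # "/tmp" in the playbook above
  delegate_to: localhost   # runs on the controller, not on f38mw01/f38mw02
  become: false            # assumption: reading the archive path needs no root
  register: local_path     # hypothetical variable name
```

With `become_ask_pass = true`, the single password entered at the BECOME prompt is otherwise reused for the delegated task on localhost, where the local user's sudo password may be different.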

rmarting commented 1 month ago

I am using my own laptop (with my own user), as I usually do with other collections (e.g. amq_streams), since I have not had any issues with them. It seems that the offline installation differs between the amq_streams and activemq collections.

I am not getting any prompt; after some time I get the error.