rundeck-plugins / ansible-plugin

Ansible Integration for Rundeck

can't reconnect after reboot #361

Closed olwins closed 3 months ago

olwins commented 6 months ago

Hi

I have a playbook that patches a remote server. It works without issue when started manually using ansible-playbook.

But when run from Rundeck on the same server, the playbook hangs in the reboot task every time.

Ansible playbook (this task is enough to reproduce the problem)

- name: Reboot the server 
  ansible.builtin.reboot:
    msg: "Reboot initiated by Ansible for linux patching"
    connect_timeout: 20 
    reboot_timeout: 900
    pre_reboot_delay: 10
    post_reboot_delay: 30
    test_command: uptime

It looks like it is not able to reconnect properly after the reboot:

ansible.builtin.reboot: attempting to get system boot time
sending connection check: [b'ssh', b'-C', b'-o', b'ControlMaster=auto', b'-o', b'ControlPersist=60s', b'-o', b'IdentityFile="/tmp/rundeck/ansible-runner1205336585210018520id_rsa"', b'-o', b'KbdInteractiveAuthentication=no', b'-o', b'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', b'-o', b'PasswordAuthentication=no', b'-o', b'User="itansible"', b'-o', b'ConnectTimeout=10', b'-o', b'StrictHostKeyChecking=accept-new', b'-o', b'ServerAliveInterval=30', b'-o', b'ControlPath="/var/lib/rundeck/.ansible/cp/f4829f47ff"', b'-O', b'check', b'testserver']
No connection to reset: Control socket connect(/var/lib/rundeck/.ansible/cp/f4829f47ff): No such file or directory
<testserver> ESTABLISH SSH CONNECTION FOR USER: itansible
<testserver> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/tmp/rundeck/ansible-runner1205336585210018520id_rsa"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="itansible"' -o ConnectTimeout=10 -o StrictHostKeyChecking=accept-new -o ServerAliveInterval=30 -o 'ControlPath="/var/lib/rundeck/.ansible/cp/f4829f47ff"' -tt testserver '/bin/sh -c '"'"'sudo -H -S -p "[sudo via ansible, key=ylstsxfvxidltvukzexjnjqasejqauun] password:" -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-ylstsxfvxidltvukzexjnjqasejqauun ; cat /proc/sys/kernel/random/boot_id'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<testserver> (255, b'', b'itansible@testserver: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).\r\n')

It retries every 30 to 40 seconds, always with the same error. After a while, there is only one additional line; the control socket also seems to have been removed:
No connection to reset: Control socket connect(/var/lib/rundeck/.ansible/cp/f4829f47ff): No such file or directory

In Rundeck, Ansible is configured to use an SSH key with a passphrase (stored in the vault) and a root password, also stored in the vault:

project.ansible-become-method=sudo
project.ansible-become-password-storage-path=keys/project/TEST_PATCHING_LINUX/password-itansible-root
project.ansible-become=true
project.ansible-binaries-dir-path=/opt/ansible/.venv/bin
project.ansible-config-file-path=/opt/ansible/ansible.cfg
project.ansible-executable=/bin/bash
project.ansible-generate-inventory=true
project.ansible-ssh-auth-type=privateKey
project.ansible-ssh-keypath=/var/lib/rundeck/.ssh/id_ed25519
project.ansible-ssh-passphrase-option=option.password
project.ansible-ssh-passphrase-storage-path=keys/project/TEST_PATCHING_LINUX/Pass_itmasteransible
project.ansible-ssh-use-agent=true
project.ansible-ssh-user=itansible

I tried modifying a few SSH settings, but it didn't change anything.

olwins commented 6 months ago

Edit: It works if I redefine all the project variables at the job level.

Maybe something is lost during the retry?

Job that hung:

"configuration" : {
    "ansible-base-dir-path" : "/opt/ansible",
    "ansible-become" : "true",
    "ansible-become-method" : "sudo",
    "ansible-become-password-storage-path" : "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
    "ansible-playbook" : "run_patching.yml",
    "ansible-ssh-passphrase-option" : "option.password",
    "ansible-ssh-use-agent" : "false"
},

Job that succeeded (basically, I manually set the same values as those defined at the project level):

"configuration" : {
    "ansible-base-dir-path" : "/opt/ansible",
    "ansible-become" : "true",
    "ansible-become-method" : "sudo",
    "ansible-become-password-storage-path" : "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
    "ansible-playbook" : "run_patching.yml",
    "ansible-ssh-auth-type" : "privateKey",
    "ansible-ssh-keypath" : "/var/lib/rundeck/.ssh/id_ed25519",
    "ansible-ssh-passphrase-option" : "option.password",
    "ansible-ssh-passphrase-storage-path" : "keys/project/TEST_PATCHING_LINUX/Pass_itmasteransible",
    "ansible-ssh-use-agent" : "true",
    "ansible-ssh-user" : "itansible"
},
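
For anyone comparing the two job definitions quoted above, here is a minimal Python sketch that lists the keys whose values differ. The values are copied from the two configurations in this comment; the helper name `diff_config` is made up for illustration and is not part of Rundeck or the plugin.

```python
import json

# Job configuration of the run that hung (copied from the comment above).
hung = json.loads("""
{
  "ansible-base-dir-path": "/opt/ansible",
  "ansible-become": "true",
  "ansible-become-method": "sudo",
  "ansible-become-password-storage-path": "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
  "ansible-playbook": "run_patching.yml",
  "ansible-ssh-passphrase-option": "option.password",
  "ansible-ssh-use-agent": "false"
}
""")

# Job configuration of the run that succeeded (copied from the comment above).
succeeded = json.loads("""
{
  "ansible-base-dir-path": "/opt/ansible",
  "ansible-become": "true",
  "ansible-become-method": "sudo",
  "ansible-become-password-storage-path": "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
  "ansible-playbook": "run_patching.yml",
  "ansible-ssh-auth-type": "privateKey",
  "ansible-ssh-keypath": "/var/lib/rundeck/.ssh/id_ed25519",
  "ansible-ssh-passphrase-option": "option.password",
  "ansible-ssh-passphrase-storage-path": "keys/project/TEST_PATCHING_LINUX/Pass_itmasteransible",
  "ansible-ssh-use-agent": "true",
  "ansible-ssh-user": "itansible"
}
""")


def diff_config(a, b):
    """Map each differing key to (value_in_a, value_in_b); a missing key is None."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


for key, (before, after) in diff_config(hung, succeeded).items():
    print(f"{key}: {before!r} -> {after!r}")
```

Four of the five differing keys are simply absent from the job that hung; `ansible-ssh-use-agent` is the only one present in both with a different value ("false" versus "true").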

olwins commented 6 months ago

Root cause found.

I thought that, by default, the ansible-ssh-use-agent value would be inherited from the project level (true in my case). But when I create a new job, it is automatically set to false:

    "ansible-ssh-use-agent" : "false"

Setting these values on the job is enough 👍

"ansible-base-dir-path" : "/opt/ansible",
"ansible-become" : "true",
"ansible-become-password-storage-path" : "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
"ansible-playbook" : "test_patching.yaml",
"ansible-ssh-passphrase-option" : "option.password",
"ansible-ssh-use-agent" : "true"
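
In case it helps others hitting the same thing, below is a rough Python sketch for spotting job-level values that shadow a different project-level value. It assumes that an explicit job-level value wins over the `project.`-prefixed property of the same name, which is what the behavior described in this issue suggests but is not confirmed against the plugin source. The function names `parse_properties` and `shadowed_settings` are hypothetical, and the sample data is a subset of the values quoted in this thread.

```python
PROJECT_PREFIX = "project."  # project-level keys in this thread carry this prefix


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def shadowed_settings(job_config, project_props):
    """Return {key: (project_value, job_value)} where the job sets a different value.

    Assumption: a job-level value overrides the project-level one.
    """
    shadowed = {}
    for key, job_value in job_config.items():
        project_value = project_props.get(PROJECT_PREFIX + key)
        if project_value is not None and project_value != job_value:
            shadowed[key] = (project_value, job_value)
    return shadowed


# Subset of the project properties quoted earlier in this issue.
project_props = parse_properties("""
project.ansible-become=true
project.ansible-ssh-use-agent=true
project.ansible-ssh-user=itansible
""")

# A freshly created job, where the UI pre-filled ansible-ssh-use-agent with "false".
new_job = {"ansible-become": "true", "ansible-ssh-use-agent": "false"}

print(shadowed_settings(new_job, project_props))
# prints {'ansible-ssh-use-agent': ('true', 'false')}
```

Running a check like this against an exported job definition would have flagged the `ansible-ssh-use-agent` mismatch before the reboot task hung.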

olwins commented 3 months ago

This seems to be fixed in the latest version.