ansible / ansible-navigator

A text-based user interface (TUI) for Ansible.
https://ansible.readthedocs.io/projects/navigator/
Apache License 2.0

Mac - ssh-agent forwarding - ee container asks ssh key passphrase #1623

Open ekartsonakis opened 11 months ago

ekartsonakis commented 11 months ago
ISSUE TYPE
SUMMARY

ssh-agent forwarding into the ee container does not seem to work: ssh keeps asking for my passphrase on remote connections. To troubleshoot, I ran an Ansible task that sleeps for 1000 seconds and then exec'd into the ee container to run commands such as ssh-add -l and env. I also started the ee Docker image directly with the same options that ansible-navigator uses.

The SSH_AUTH_SOCK environment variable is correctly passed into the ee container and the directory containing the socket is mounted:

In the ee container:

env | grep SOCK
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.DpffgxjdnV/Listeners

mount | grep DpffgxjdnV
/host_mark/private on /private/tmp/com.apple.launchd.DpffgxjdnV type fakeowner (rw,nosuid,nodev,relatime,fakeowner)

but my key is not there:

ssh-add -l
Error connecting to agent: Operation not supported
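
When repeating this check, it can help to separate "the path is wrong" from "the path is there but the agent cannot be reached". The following is a minimal sketch, not part of ansible-navigator; the helper names classify_agent_socket and describe_ssh_add_status are hypothetical. It relies on the exit statuses documented for ssh-add (0 on success, 1 if the command fails, 2 if the agent cannot be contacted).

```shell
#!/bin/sh
# Classify the path held in SSH_AUTH_SOCK (hypothetical helper).
# Prints one of: unset, missing, not-a-socket, socket
classify_agent_socket() {
    sock_path="${1-}"
    if [ -z "$sock_path" ]; then
        echo "unset"
    elif [ ! -e "$sock_path" ]; then
        echo "missing"
    elif [ ! -S "$sock_path" ]; then
        echo "not-a-socket"
    else
        echo "socket"
    fi
}

# Translate the exit status of `ssh-add -l` into a short description.
describe_ssh_add_status() {
    case "$1" in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, no keys or command failed" ;;
        2) echo "agent unreachable" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Example run inside the container:
#   classify_agent_socket "$SSH_AUTH_SOCK"
#   ssh-add -l >/dev/null 2>&1; describe_ssh_add_status "$?"
```

A result of "socket" combined with "agent unreachable" would suggest that the path was mounted but the socket itself does not work across the mount, which would be consistent with the "Operation not supported" error above.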

I tried @timway's suggestion (https://github.com/ansible/ansible-runner/pull/1293) of adding the Docker option --user root, but it didn't help.

ANSIBLE-NAVIGATOR VERSION
ansible-navigator --version
ansible-navigator 3.5.0

Running with Docker Desktop 4.23.0 on a Mac M1 + macOS 14 Sonoma

CONFIGURATION
---

ansible-navigator:
  time-zone: local
  execution-environment:
    container-engine: docker
    enabled: True
    image: dockerhub.mycompany.com/it-docker-local/thick-ee:1.1.1
    pull:
      policy: missing
    container-options:
      - --user=root
    environment-variables:
      set:
        HOSTNAME: "myuser-ansible-navigator"
        ANSIBLE_HOME: "/runner/.ansible"
        ANSIBLE_CALLBACK_PLUGINS: "/usr/local/lib/python3.9/site-packages/ara/plugins/callback"
        ANSIBLE_REMOTE_USER: "myuser"
        ANSIBLE_SSH_PIPELINING: "true"
        ARA_API_CLIENT: "http"
        ARA_API_SERVER: "https://ara.removed.tld"
    # We want to override image wide ssh config with our local one
    volume-mounts:
      - src: "/Users/myuser/ansible_ee_home/"
        dest: "/etc/ssh/ssh_config.d/"
  logging:
    level: debug
    append: False
    file: /tmp/navigator.log
  playbook-artifact:
    enable: True
    replay: /tmp/artifact-{playbook_name}.json
    save-as: /tmp/artifact-{playbook_name}.json

LOG FILE
2023-10-06T11:07:32.827236+03:00 DEBUG 'ansible-runner.wrap_args_for_containerization' container engine invocation: docker run --rm --tty --interactive -v /Users/myuser/repos/mycompany-start/:/Users/myuser/repos/mycompany-start/ --workdir /Users/myuser/repos/mycompany-start -v /Users/myuser/repos/mycompany-start/testing/:/Users/myuser/repos/mycompany-start/testing/ -v /private/tmp/com.apple.launchd.DpffgxjdnV/:/private/tmp/com.apple.launchd.DpffgxjdnV/ -e SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.DpffgxjdnV/Listeners -v /Users/myuser/.ssh/:/home/runner/.ssh/ -v /Users/myuser/.ssh/:/root/.ssh/ -v /var/folders/rb/wtqs3ncd3pnbtkdyvg5y__v00000gn/T/ansible-navigator_bed0tf71/artifacts/:/runner/artifacts/:Z -v /var/folders/rb/wtqs3ncd3pnbtkdyvg5y__v00000gn/T/ansible-navigator_bed0tf71/:/runner/:Z -v /Users/myuser/ansible_ee_home/:/etc/ssh/ssh_config.d/ --env-file /var/folders/rb/wtqs3ncd3pnbtkdyvg5y__v00000gn/T/ansible-navigator_bed0tf71/artifacts/1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9/env.list --user=501 --name ansible_runner_1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9 --user=root dockerhub.company.com/mycompany-it-docker-local/mycompany-thick-ee:1.1.1 ansible-playbook /Users/myuser/repos/mycompany-start/minitest.yml -i /Users/myuser/repos/mycompany-start/testing/inventory.yml

2023-10-06T11:07:32.827303+03:00 DEBUG 'ansible-runner._handle_command_wrap' command: docker run --rm --tty --interactive -v /Users/myuser/repos/mycompany-start/:/Users/myuser/repos/mycompany-start/ --workdir /Users/myuser/repos/mycompany-start -v /Users/myuser/repos/mycompany-start/testing/:/Users/myuser/repos/mycompany-start/testing/ -v /private/tmp/com.apple.launchd.DpffgxjdnV/:/private/tmp/com.apple.launchd.DpffgxjdnV/ -e SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.DpffgxjdnV/Listeners -v /Users/myuser/.ssh/:/home/runner/.ssh/ -v /Users/myuser/.ssh/:/root/.ssh/ -v /var/folders/rb/wtqs3ncd3pnbtkdyvg5y__v00000gn/T/ansible-navigator_bed0tf71/artifacts/:/runner/artifacts/:Z -v /var/folders/rb/wtqs3ncd3pnbtkdyvg5y__v00000gn/T/ansible-navigator_bed0tf71/:/runner/:Z -v /Users/myuser/ansible_ee_home/:/etc/ssh/ssh_config.d/ --env-file /var/folders/rb/wtqs3ncd3pnbtkdyvg5y__v00000gn/T/ansible-navigator_bed0tf71/artifacts/1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9/env.list --user=501 --name ansible_runner_1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9 --user=root dockerhub.company.com/mycompany-it-docker-local/mycompany-thick-ee:1.1.1 ansible-playbook /Users/myuser/repos/mycompany-start/minitest.yml -i /Users/myuser/repos/mycompany-start/testing/inventory.yml

2023-10-06T11:08:23.054016+03:00 DEBUG 'ansible_navigator.runner.base._event_handler' ansible-runner event handle: {'event': 'verbose', 'uuid': '304a8080-8c0e-40ea-8bac-fb7c40a61cb4', 'counter': 288, 'stdout': "Enter passphrase for key '/root/.ssh/id_rsa': ", 'start_line': 374, 'end_line': 375, 'runner_ident': '1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9', 'created': '2023-10-06T08:08:23.054001'}

2023-10-06T11:08:23.086195+03:00 DEBUG 'ansible_navigator.runner.base._event_handler' ansible-runner event handle: {'uuid': 'ff9ac28d-2aab-4b58-b6aa-36361af715e3', 'counter': 289, 'stdout': '\r\nTASK [Gathering Facts] *********************************************************\r\n\x1b[1;30mtask path: /Users/myuser/repos/mycompany-start/minitest.yml:36\x1b[0m\r\n\x1b[1;31mfatal: [lk6t-monagent-001]: UNREACHABLE! => {\x1b[0m\r\n\x1b[1;31m    "changed": false,\x1b[0m\r\n\x1b[1;31m    "msg": "Data could not be sent to remote host \\"10.166.50.21\\". Make sure this host can be reached over ssh: OpenSSH_8.7p1, OpenSSL 3.0.7 1 Nov 2022\\r\\ndebug1: Reading configuration data /home/runner/.ssh/config\\r\\ndebug1: /home/runner/.ssh/config line 17: Ignored unknown option \\"usekeychain\\"\\r\\ndebug1: /home/runner/.ssh/config line 191: Applying options for 10.166.*\\r\\ndebug2: resolve_canonicalize: hostname 10.166.50.21 is address\\r\\ndebug3: expanded UserKnownHostsFile \'~/.ssh/known_hosts\' -> \'/root/.ssh/known_hosts\'\\r\\ndebug3: expanded UserKnownHostsFile \'~/.ssh/known_hosts2\' -> \'/root/.ssh/known_hosts2\'\\r\\ndebug1: Executing proxy command: exec ssh myuser@lk6-offband-jump.mycompany.com nc 10.166.50.21 22\\r\\ndebug3: timeout: 30000 ms remain after connect\\r\\ndebug1: identity file /root/.ssh/id_rsa type 0\\r\\ndebug1: identity file /root/.ssh/id_rsa-cert type -1\\r\\ndebug1: identity file /root/.ssh/id_dsa type -1\\r\\ndebug1: identity file /root/.ssh/id_dsa-cert type -1\\r\\ndebug1: identity file /root/.ssh/id_ecdsa type -1\\r\\ndebug1: identity file /root/.ssh/id_ecdsa-cert type -1\\r\\ndebug1: identity file /root/.ssh/id_ecdsa_sk type -1\\r\\ndebug1: identity file /root/.ssh/id_ecdsa_sk-cert type -1\\r\\ndebug1: identity file /root/.ssh/id_ed25519 type -1\\r\\ndebug1: identity file /root/.ssh/id_ed25519-cert type -1\\r\\ndebug1: identity file /root/.ssh/id_ed25519_sk type -1\\r\\ndebug1: identity file /root/.ssh/id_ed25519_sk-cert type -1\\r\\ndebug1: 
identity file /root/.ssh/id_xmss type -1\\r\\ndebug1: identity file /root/.ssh/id_xmss-cert type -1\\r\\ndebug1: Local version string SSH-2.0-OpenSSH_8.7\\r\\nUnauthorised access is prohibited. Usage may be monitored.\\nConnection timed out during banner exchange\\r\\nConnection to UNKNOWN port 65535 timed out\\r\\n",\x1b[0m\r\n\x1b[1;31m    "unreachable": true\x1b[0m\r\n\x1b[1;31m}\x1b[0m', 'start_line': 375, 'end_line': 383, 'runner_ident': '1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9', 'event': 'runner_on_unreachable', 'pid': 31, 'created': '2023-10-06T08:08:23.079854', 'parent_uuid': '0242ac11-0002-eed9-8f62-000000000034', 'event_data': {'playbook': '/Users/myuser/repos/mycompany-start/minitest.yml', 'playbook_uuid': '1e17ea16-0577-4930-9e16-61337a24d5ad', 'play': 'Testing navigator remotely', 'play_uuid': '0242ac11-0002-eed9-8f62-000000000014', 'play_pattern': 'lk6t-monagent-001', 'task': 'Gathering Facts', 'task_uuid': '0242ac11-0002-eed9-8f62-000000000034', 'task_action': 'gather_facts', 'resolved_action': 'ansible.builtin.gather_facts', 'task_args': '', 'task_path': '/Users/myuser/repos/mycompany-start/minitest.yml:36', 'host': 'lk6t-monagent-001', 'remote_addr': 'lk6t-monagent-001', 'start': '2023-10-06T08:07:52.878041', 'end': '2023-10-06T08:08:23.078803', 'duration': 30.200762, 'res': {'unreachable': True, 'msg': 'Data could not be sent to remote host "10.166.50.21". 
Make sure this host can be reached over ssh: OpenSSH_8.7p1, OpenSSL 3.0.7 1 Nov 2022\r\ndebug1: Reading configuration data /home/runner/.ssh/config\r\ndebug1: /home/runner/.ssh/config line 17: Ignored unknown option "usekeychain"\r\ndebug1: /home/runner/.ssh/config line 191: Applying options for 10.166.*\r\ndebug2: resolve_canonicalize: hostname 10.166.50.21 is address\r\ndebug3: expanded UserKnownHostsFile \'~/.ssh/known_hosts\' -> \'/root/.ssh/known_hosts\'\r\ndebug3: expanded UserKnownHostsFile \'~/.ssh/known_hosts2\' -> \'/root/.ssh/known_hosts2\'\r\ndebug1: Executing proxy command: exec ssh myuser@lk6-offband-jump.mycompany.com nc 10.166.50.21 22\r\ndebug3: timeout: 30000 ms remain after connect\r\ndebug1: identity file /root/.ssh/id_rsa type 0\r\ndebug1: identity file /root/.ssh/id_rsa-cert type -1\r\ndebug1: identity file /root/.ssh/id_dsa type -1\r\ndebug1: identity file /root/.ssh/id_dsa-cert type -1\r\ndebug1: identity file /root/.ssh/id_ecdsa type -1\r\ndebug1: identity file /root/.ssh/id_ecdsa-cert type -1\r\ndebug1: identity file /root/.ssh/id_ecdsa_sk type -1\r\ndebug1: identity file /root/.ssh/id_ecdsa_sk-cert type -1\r\ndebug1: identity file /root/.ssh/id_ed25519 type -1\r\ndebug1: identity file /root/.ssh/id_ed25519-cert type -1\r\ndebug1: identity file /root/.ssh/id_ed25519_sk type -1\r\ndebug1: identity file /root/.ssh/id_ed25519_sk-cert type -1\r\ndebug1: identity file /root/.ssh/id_xmss type -1\r\ndebug1: identity file /root/.ssh/id_xmss-cert type -1\r\ndebug1: Local version string SSH-2.0-OpenSSH_8.7\r\nUnauthorised access is prohibited. Usage may be monitored.\nConnection timed out during banner exchange\r\nConnection to UNKNOWN port 65535 timed out\r\n', 'changed': False}, 'uuid': 'ff9ac28d-2aab-4b58-b6aa-36361af715e3'}}

2023-10-06T11:08:23.142062+03:00 CRITICAL 'ansible_navigator.actions.run_ccde._handle_message' Unhandled message from runner queue, discarded: {'event': 'verbose', 'uuid': '304a8080-8c0e-40ea-8bac-fb7c40a61cb4', 'counter': 288, 'stdout': "Enter passphrase for key '/root/.ssh/id_rsa': ", 'start_line': 374, 'end_line': 375, 'runner_ident': '1c7429d5-cc8c-4b65-8bee-33c9f38a9bc9', 'created': '2023-10-06T08:08:23.054001'}

STEPS TO REPRODUCE

Run a playbook with an execution environment image enabled, using an SSH key that is protected by a passphrase.

EXPECTED RESULTS

As described in docs: https://github.com/ansible/ansible-navigator/blob/main/docs/faq.md#ssh-keys

The use of ssh-agent results in the simplest configuration and eliminates issues with SSH key passphrases when using ansible-navigator with execution environments.

ACTUAL RESULTS

ssh to remote hosts fails: the container prompts for the key passphrase and the host is reported as UNREACHABLE.

ADDITIONAL INFORMATION

my ansible.cfg in the project dir:

[defaults]
user = myuser
verbosity = 4
collections_paths = ./galaxy
roles_path = ./galaxy:~/repos/galaxy:~/Repositories/galaxy
forks          = 10
transport      = smart
remote_port    = 22
gathering = smart
host_key_checking = False
{... trimmed unrelated options... }
[ssh_connection]
pipelining = True
[persistent_connection]
ssh_type = libssh
connect_timeout = 120
command_timeout = 60
[diff]
always = yes

part of my .ssh/config

ForwardAgent yes
IgnoreUnknown UseKeychain
ServerAliveInterval 10
TCPKeepalive yes
UseKeychain yes
AddKeysToAgent yes
StrictHostKeyChecking no
User myuser

ekartsonakis commented 11 months ago

After a long detour through the web, this appears to be a macOS + Docker issue (e.g. https://github.com/docker/for-mac/issues/410). By manually running a Docker container with --user=root combined with -v /run/host-services/ssh-auth.sock:/run/host-services/ssh-auth.sock -e SSH_AUTH_SOCK="/run/host-services/ssh-auth.sock", and without the root .ssh directory mapping -v /Users/myuser/.ssh/:/root/.ssh/, I managed to ssh to remote hosts using my local ssh-agent and my unlocked ssh key. Here is the simplified command I used:

docker run -it --user=root -v /run/host-services/ssh-auth.sock:/run/host-services/ssh-auth.sock -e SSH_AUTH_SOCK="/run/host-services/ssh-auth.sock" --rm dockerhub.cisco.com/placetel-it-docker-local/placetel-thick-ee:1.1.1 /bin/bash

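ansible-navigator adds --user=current_user_id by default, and a --user passed through container-options is appended after it instead of replacing it (the log above shows --user=501 followed by --user=root). Lastly, I'm not sure whether the SSH_AUTH_SOCK value on the Mac host points to a socket that can usefully be mounted into the container.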
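
The difference between the two setups can be sketched as a small helper that prints the agent-related docker run arguments depending on the host OS. This is only an illustration of the workaround described above, not how ansible-navigator or ansible-runner build their command line. The function name agent_forward_args is hypothetical, and the non-Mac branch (mounting the host socket path as-is) is an assumption.

```shell
#!/bin/sh
# Print the docker run arguments for agent forwarding, one per line.
#   $1: kernel name as reported by `uname -s`
#   $2: value of SSH_AUTH_SOCK on the host
agent_forward_args() {
    os_name="$1"
    host_sock="$2"
    if [ "$os_name" = "Darwin" ]; then
        # Path used in the manual docker run above; Docker Desktop for Mac
        # provides the host agent at this location.
        sock="/run/host-services/ssh-auth.sock"
    else
        # Assumption: on other hosts the agent socket can be bind-mounted directly.
        sock="$host_sock"
    fi
    printf '%s\n' "-v" "$sock:$sock" "-e" "SSH_AUTH_SOCK=$sock"
}

# Example:
#   agent_forward_args "$(uname -s)" "$SSH_AUTH_SOCK"
```

On a Mac this yields the same -v and -e values as the manual command, regardless of the launchd socket path held in SSH_AUTH_SOCK on the host.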

Adding the "mac" keyword to the title of this issue.

ssbarnea commented 11 months ago

@ajinkyau @cidrblock Can you please investigate this and determine whether it is indeed a bug in navigator or just an environment-specific issue related to Docker on macOS? We would probably have a similar issue with podman, because it works the same way (the container host is a VM).

mjnks commented 4 months ago

+1

SilentRhetoric commented 2 months ago

I'm also running into this issue and would like to be able to use the EE approach with macOS + Docker.

TheSimonxy commented 1 month ago

We are also affected. We would like to use ansible-navigator, but because some people on the team use Macs, this issue makes that impossible.

jambino commented 1 month ago

Probably we would also have a similar issue with podman because it also works the same way (container host being VM).

I did notice an improvement when using podman instead of docker.

If podman is an option for you, try switching from docker to podman and setting container-engine: podman.

brew install podman
podman machine init
podman machine start
podman login internal.oci.registry
podman pull internal.oci.registry/my-ee:0.0.1
ansible-navigator run my-playbook -i inventory/hosts --mode stdout