ansible-collections / community.general

Ansible Community General Collection
https://galaxy.ansible.com/ui/repo/published/community/general/
GNU General Public License v3.0

etcd3 lookup provider cannot connect to HTTPS etcd3 endpoint #1664

Open eramnes opened 3 years ago

eramnes commented 3 years ago
SUMMARY

When using an etcd3 cluster that is configured to use HTTPS, the etcd3 lookup provider appears to be unable to connect to it. Specifying an endpoint of "https://<etcd3_host>:<port>" seems to strip the "https://" from the connection string, and using a host of "https://<etcd3_host>" seems to try to perform a DNS lookup that includes the "https://".

ISSUE TYPE
COMPONENT NAME

etcd3 lookup provider

ANSIBLE VERSION
ansible 2.9.16
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/var/go/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.6/site-packages/ansible
  executable location = /bin/ansible
  python version = 3.6.8 (default, Aug 18 2020, 08:33:21) [GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]
CONFIGURATION
No output was returned from ansible-config dump --only-changed.
OS / ENVIRONMENT

Operating System: Red Hat Enterprise Linux 8.3 (Ootpa)
CPE OS Name: cpe:/o:redhat:enterprise_linux:8.3:GA
Kernel: Linux 4.18.0-240.10.1.el8_3.x86_64
Architecture: x86-64

$ pip3 show etcd3
Name: etcd3
Version: 0.12.0

$ pip3 show grpcio
Name: grpcio
Version: 1.35.0

I have verified that the machine Ansible runs on is able to successfully connect to the etcd3 cluster:

$ curl https://<etcd3_host>:2379/v3
<a href="/v3/">Moved Permanently</a>.

The etcd3 cluster appears to be healthy and listening on HTTPS:

# etcdctl --user root member list --endpoints=https://<etcd3_host>:2379
Password: 
8e9e05c52164694d, started, 1717328e83e54c57b25e9fcaf348cc9a, https://0.0.0.0:2380, https://0.0.0.0:2379, false
STEPS TO REPRODUCE
#etcd_endpoints: https://<etcd3_host>:2379
etcd_host: https://<etcd3_host>
etcd_port: 2379
etcd_user: ansibleetcd
etcd_password: !vault |
          $ANSIBLE_VAULT;1.1;AES256
           123456789...

ad_fqdn: "ad.example.com"
vm_name: "example"

# is_ad_joined: "{{ lookup('community.general.etcd3', '/'+ad_fqdn+'/'+vm_name, endpoints=etcd_endpoints, user=etcd_user, password=etcd_password) }}"
is_ad_joined: "{{ lookup('community.general.etcd3', '/'+ad_fqdn+'/'+vm_name, host=etcd_host, port=etcd_port, user=etcd_user, password=etcd_password) }}"
EXPECTED RESULTS

The etcd3 lookup completes successfully and returns the value assigned to the requested key.

ACTUAL RESULTS

When run with the "etcd_endpoints" variable set (passed to the lookup as the endpoints option):

ansible-playbook 2.9.16
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/var/go/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.6/site-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.6.8 (default, Aug 18 2020, 08:33:21) [GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]
Using /etc/ansible/ansible.cfg as config file
host_list declined parsing /var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/inventory.py as it did not pass its verify_file() method
Parsed /var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/inventory.py inventory source with script plugin
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Skipping callback 'actionable', as we already have a stdout callback.
Skipping callback 'counter_enabled', as we already have a stdout callback.
Skipping callback 'debug', as we already have a stdout callback.
Skipping callback 'dense', as we already have a stdout callback.
Skipping callback 'dense', as we already have a stdout callback.
Skipping callback 'full_skip', as we already have a stdout callback.
Skipping callback 'json', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'null', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
Skipping callback 'selective', as we already have a stdout callback.
Skipping callback 'skippy', as we already have a stdout callback.
Skipping callback 'stderr', as we already have a stdout callback.
Skipping callback 'unixy', as we already have a stdout callback.
Skipping callback 'yaml', as we already have a stdout callback.

PLAYBOOK: win-template-ad.yml *********************************************
1 plays in win-template-ad.yml
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'

PLAY [Join template to Active Directory] **********************************
META: ran handlers
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'

TASK [win-template-ad : Wait for system to become reachable] **************
task path: /var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/roles/win-template-ad/tasks/main.yml:2
etcd3 connection parameters: {'host': '<etcd3_host>', 'port': '2379', 'timeout': 60, 'user': 'ansibleetcd', 'password': '<redacted>'}
fatal: [<vm_name>]: FAILED! => {
    "msg": "The conditional check 'is_ad_joined != \"yes\"' failed. The error was: An unhandled exception occurred while templating '{{ lookup('community.general.etcd3', '/'+ad_fqdn+'/'+vm_name, endpoints=etcd_endpoints, user=etcd_user, password=etcd_password) }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: An unhandled exception occurred while running the lookup plugin 'community.general.etcd3'. Error was a <class 'ansible.errors.AnsibleLookupError'>, original message: Cannot connect to etcd cluster: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1611413404.759922782\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":5390,\"referenced_errors\":[{\"created\":\"@1611413404.759916232\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc\",\"file_line\":397,\"grpc_status\":14}]}\"\n>\n\nThe error appears to be in '/var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/roles/win-template-ad/tasks/main.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Wait for system to become reachable\n  ^ here\n"

On the etcd3 server, you can see that the lookup tried to connect, but appears to have used HTTP instead of HTTPS:

Jan 23 08:50:04 <etcd3_host> etcd[18814]: rejected connection from "<ansible_host_ip>:50626" (error "tls: first record does not look like a TLS handshake", ServerName "")
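
The connection parameters logged above (host '<etcd3_host>', port '2379', no trace of the scheme) are consistent with the endpoint string being split into host and port while the "https://" part is discarded. The following is a minimal sketch of that kind of parsing, for illustration only: the function name split_endpoint, the regular expression, and the hostname etcd.example.com are assumptions and are not taken from the plugin source.

```python
import re

# Hypothetical pattern: optional scheme, then host, then port.
# This is NOT copied from the plugin; it only reproduces the observed result.
ENDPOINT_RE = re.compile(
    r"^(?:(?P<scheme>https?)://)?(?P<host>[^:/]+):(?P<port>\d+)/?$"
)


def split_endpoint(endpoint):
    """Split 'scheme://host:port' into (scheme, host, port); scheme may be None."""
    match = ENDPOINT_RE.match(endpoint)
    if match is None:
        raise ValueError("unrecognised endpoint: %r" % endpoint)
    return match.group("scheme"), match.group("host"), match.group("port")


scheme, host, port = split_endpoint("https://etcd.example.com:2379")

# If only host and port are handed to the client, the information that the
# endpoint was HTTPS is lost, and the port stays a string as in the log above.
client_params = {"host": host, "port": port}
print(scheme, client_params)
```

With parameters like these and no certificate options, a client has no way of knowing that TLS was requested, which would match the "first record does not look like a TLS handshake" message on the server.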

When run with the "host" variable set:

ansible-playbook 2.9.16
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/var/go/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.6/site-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.6.8 (default, Aug 18 2020, 08:33:21) [GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]
Using /etc/ansible/ansible.cfg as config file
host_list declined parsing /var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/inventory.py as it did not pass its verify_file() method
Parsed /var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/inventory.py inventory source with script plugin
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Skipping callback 'actionable', as we already have a stdout callback.
Skipping callback 'counter_enabled', as we already have a stdout callback.
Skipping callback 'debug', as we already have a stdout callback.
Skipping callback 'dense', as we already have a stdout callback.
Skipping callback 'dense', as we already have a stdout callback.
Skipping callback 'full_skip', as we already have a stdout callback.
Skipping callback 'json', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'null', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
Skipping callback 'selective', as we already have a stdout callback.
Skipping callback 'skippy', as we already have a stdout callback.
Skipping callback 'stderr', as we already have a stdout callback.
Skipping callback 'unixy', as we already have a stdout callback.
Skipping callback 'yaml', as we already have a stdout callback.

PLAYBOOK: win-template-ad.yml *********************************************
1 plays in win-template-ad.yml
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'

PLAY [Join template to Active Directory] **********************************
META: ran handlers
Read vars_file '../group_vars/template-creds.yml'
Read vars_file '../group_vars/etcd-creds.yml'

TASK [win-template-ad : Wait for system to become reachable] **************
task path: /var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/roles/win-template-ad/tasks/main.yml:2
etcd3 connection parameters: {'host': 'https://<etcd3_host>', 'port': 2379, 'timeout': 60, 'user': 'ansibleetcd', 'password': '<redacted>'}
fatal: [<vm_name>]: FAILED! => {
    "msg": "The conditional check 'is_ad_joined != \"yes\"' failed. The error was: An unhandled exception occurred while templating '{{ lookup('community.general.etcd3', '/'+ad_fqdn+'/'+vm_name, host=etcd_host, port=etcd_port, user=etcd_user, password=etcd_password) }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: An unhandled exception occurred while running the lookup plugin 'community.general.etcd3'. Error was a <class 'ansible.errors.AnsibleLookupError'>, original message: Cannot connect to etcd cluster: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"DNS resolution failed for service: https://<etcd3_host>:2379\"\n\tdebug_error_string = \"{\"created\":\"@1611413677.532513096\",\"description\":\"Resolver transient failure\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":2140,\"referenced_errors\":[{\"created\":\"@1611413677.532511012\",\"description\":\"DNS resolution failed for service: https://<etcd3_host>:2379\",\"file\":\"src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc\",\"file_line\":370,\"grpc_status\":14,\"referenced_errors\":[{\"created\":\"@1611413677.532485854\",\"description\":\"C-ares status is not ARES_SUCCESS qtype=A name=https://<etcd3_host>:2379 is_balancer=0: Domain name not found\",\"file\":\"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc\",\"file_line\":728}]}]}\"\n>\n\nThe error appears to be in '/var/lib/go-agent/pipelines/win-template-bootstrap/playbooks/roles/win-template-ad/tasks/main.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Wait for system to become reachable\n  ^ here\n"

It appears to me from the output that it's attempting the DNS lookup for the host, but including the "https://" instead of just looking at the actual FQDN.
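
The DNS error is what one would expect if the host value is joined with the port verbatim to form the connection target. The sketch below assumes the target has the form "host:port" (an assumption based on the name "https://<etcd3_host>:2379" in the error message) and uses a hypothetical helper, strip_scheme, to remove the scheme before the value is used. The hostname etcd.example.com is a placeholder.

```python
from urllib.parse import urlsplit


def strip_scheme(host):
    """Return the bare hostname from 'https://name' or 'name' (hypothetical helper)."""
    if "://" in host:
        return urlsplit(host).hostname
    return host


def build_target(host, port):
    """Join host and port the way the error message suggests (assumed format)."""
    return "%s:%s" % (host, port)


# Scheme left in place: the whole string is treated as a DNS name.
print(build_target("https://etcd.example.com", 2379))

# Scheme removed first: only the FQDN and port remain.
print(build_target(strip_scheme("https://etcd.example.com"), 2379))
```

Stripping the scheme only addresses the name resolution failure. It does not by itself make the connection use TLS.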

I had to try to sanitize some of the playbook names. I think I got them all, but if anything looks inconsistent in the workflow it's probably my fault.

Thanks!

ansibullbot commented 3 years ago

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.


felixfontein commented 3 years ago

!component =plugins/lookup/etcd3.py

ansibullbot commented 3 years ago

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.


Ajpantuso commented 3 years ago

This is the expected behavior of the etcd3 library this plugin uses. (See here) For the client to use an HTTPS connection, ca_cert must be provided, with cert_cert and cert_key optionally included. Maybe just a doc update is appropriate for this issue.
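
To illustrate the comment above: whether the connection is plaintext or TLS depends on whether ca_cert is supplied, not on a scheme in the host string. The following is a minimal sketch, assuming the etcd3.client() keyword arguments host, port, user, password, ca_cert, cert_cert, and cert_key. The helpers build_client_kwargs and uses_tls, the hostname, the password, and the CA bundle path are hypothetical, and the extra validation is a design choice of this sketch rather than the library's behavior.

```python
def build_client_kwargs(host, port, user=None, password=None,
                        ca_cert=None, cert_cert=None, cert_key=None):
    """Collect keyword arguments for etcd3.client() (hypothetical helper)."""
    # A client certificate and its key only make sense as a pair.
    if (cert_cert is None) != (cert_key is None):
        raise ValueError("cert_cert and cert_key must be given together")
    # This sketch refuses client certificates without a CA certificate,
    # so that a plaintext connection is never chosen by accident.
    if cert_cert is not None and ca_cert is None:
        raise ValueError("ca_cert is required when a client certificate is used")

    kwargs = {"host": host, "port": int(port)}
    optional = {
        "user": user,
        "password": password,
        "ca_cert": ca_cert,
        "cert_cert": cert_cert,
        "cert_key": cert_key,
    }
    # Only pass options that were actually set.
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs


def uses_tls(kwargs):
    """TLS is requested only when a CA certificate is present."""
    return kwargs.get("ca_cert") is not None


kwargs = build_client_kwargs(
    "etcd.example.com", "2379",
    user="ansibleetcd", password="example-password",
    ca_cert="/path/to/ca.pem",  # placeholder path
)
print(uses_tls(kwargs))

# With the third-party etcd3 package installed and a reachable cluster,
# the client would then be created like this:
#   import etcd3
#   client = etcd3.client(**kwargs)
```

For the lookup in this issue, the equivalent would presumably be passing the bare hostname as host and a CA certificate path as ca_cert, if the lookup exposes that option as the comment implies.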

ansibullbot commented 3 years ago

cc @eric-belhomme