ansible-collections / community.kubernetes

Kubernetes Collection for Ansible
https://galaxy.ansible.com/community/kubernetes
GNU General Public License v3.0

Difficulty specifying correct kubeconfig #314

Closed kwoodson closed 3 years ago

kwoodson commented 3 years ago
SUMMARY

I'm attempting to run a few k8s commands. For example,

  - name: fetch etcd credentials
    k8s:
      kind: secret
      api_version: v1
      namespace: openshift-etcd
      name: etcd-all-peer
    register: peer_secret

I'm receiving the following error:

The full traceback is:
WARNING: The below traceback may *not* be related to the actual failure.
  File "/tmp/ansible_k8s_payload_HPHDKv/ansible_k8s_payload.zip/ansible_collections/community/kubernetes/plugins/module_utils/common.py", line 265, in get_api_client
    return DynamicClient(kubernetes.client.ApiClient(configuration))
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/client.py", line 71, in __init__
    self.__discoverer = discoverer(self, cache_file)
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 259, in __init__
    Discoverer.__init__(self, client, cache_file)
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 31, in __init__
    self.__init_cache()
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 78, in __init_cache
    self._load_server_info()
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 165, in _load_server_info
    self.client.configuration.host)
fatal: [52.176.93.70]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "append_hash": false,
            "apply": false,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "force": false,
            "host": null,
            "kind": "secret",
            "kubeconfig": null,
            "merge_type": null,
            "name": "etcd-all-peer",
            "namespace": "openshift-etcd",
            "password": null,
            "persist_config": null,
            "proxy": null,
            "resource_definition": null,
            "src": null,
            "state": "present",
            "template": null,
            "username": null,
            "validate": null,
            "validate_certs": null,
            "wait": false,
            "wait_condition": null,
            "wait_sleep": 5,
            "wait_timeout": 120
        }
    },
    "msg": "Failed to get client due to Host value http://localhost should start with https:// when talking to HTTPS endpoint"
}

I have tried both setting the K8S_AUTH_KUBECONFIG environment variable and passing the kubeconfig argument. Setting K8S_AUTH_KUBECONFIG changes nothing, so it appears to be ignored. Passing kubeconfig fails with an error that the file cannot be found. This is puzzling because both my local and remote hosts have a valid kubeconfig file (tested with kubectl --kubeconfig /home/kwoodson/.kube/config get pods).
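One way to narrow this down is to check, on the host where the module actually runs, which kubeconfig path would be tried and whether it exists for that user. The sketch below is only a diagnostic aid. The helper name resolve_kubeconfig is hypothetical, and the precedence it uses (explicit argument, then K8S_AUTH_KUBECONFIG, then ~/.kube/config) is an assumption for illustration, not the module's actual lookup code.

```python
import os


def resolve_kubeconfig(explicit_path=None, environ=None):
    """Return (path, source, exists) for the kubeconfig that would be tried.

    Hypothetical debugging helper. The precedence below is an assumption,
    not the k8s module's real lookup logic.
    """
    environ = os.environ if environ is None else environ
    if explicit_path:
        path, source = explicit_path, "kubeconfig argument"
    elif environ.get("K8S_AUTH_KUBECONFIG"):
        path, source = environ["K8S_AUTH_KUBECONFIG"], "K8S_AUTH_KUBECONFIG"
    else:
        path, source = os.path.join("~", ".kube", "config"), "default location"
    # "~" expands for the user running this code, who may not be the user
    # whose kubeconfig was tested with kubectl.
    path = os.path.expanduser(path)
    return path, source, os.path.isfile(path)


if __name__ == "__main__":
    print(resolve_kubeconfig())
```

Running this on the remote host as the user Ansible connects as (cloud-user in the playbook below) shows whether the path points at a file that user can see. A path under /home/kwoodson, for example, may only exist on the local machine.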

I am able to execute the following commands locally and remotely:

[cloud-user@bastion ~]$ python -c 'import kubernetes;from openshift.dynamic import DynamicClient; print(DynamicClient(kubernetes.config.new_client_from_config(config_file="/home/cloud-user/.kube/config")).version)'
{'kubernetes': {u'major': u'1', u'gitTreeState': u'clean', u'buildDate': u'2020-11-29T08:57:48Z', u'platform': u'linux/amd64', u'minor': u'19', u'gitCommit': u'1348ff864868a6202addf58c3ee5e6ebf8fae77e', u'compiler': u'gc', u'gitVersion': u'v1.19.0+1348ff8', u'goVersion': u'go1.15.2'}}

I'm not sure whether the module executes on the remote host or locally. The error messages lead me to believe it executes remotely. If so, the kubeconfig exists there, is valid, and can be found by a stat task. I'm not sure why I would be seeing an empty client.
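The traceback itself gives a hint about where the code ran: its file paths name /usr/lib/python2.7/site-packages, while the local machine runs Python 3.8.6. A minimal sketch of that check, assuming library paths follow the usual lib/pythonX.Y layout (the function name interpreter_versions is made up for this example):

```python
import re

# Matches ".../lib/pythonX.Y/..." or ".../lib64/pythonX.Y/..." in a traceback line.
_PY_PATH = re.compile(r"/lib(?:64)?/python(\d+\.\d+)/")


def interpreter_versions(traceback_text):
    """Return the sorted, de-duplicated Python versions named in file paths."""
    return sorted(set(_PY_PATH.findall(traceback_text)))


# Two lines copied from the traceback above.
sample = '''
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/client.py", line 71, in __init__
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 259, in __init__
'''

if __name__ == "__main__":
    print(interpreter_versions(sample))
```

If the versions found differ from the controller's interpreter, that is consistent with the module executing on the remote host, although it does not prove it.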

ISSUE TYPE
COMPONENT NAME

ansible k8s module

ANSIBLE VERSION
ansible 2.9.14
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/kwoodson/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.8/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.8.6 (default, Sep 25 2020, 00:00:00) [GCC 10.2.1 20200723 (Red Hat 10.2.1-1)]
CONFIGURATION
ansible-config dump --only-changed
ANSIBLE_PIPELINING(/etc/ansible/ansible.cfg) = False
OS / ENVIRONMENT

Fedora 32 (local)

head -n 2 /usr/bin/ansible-playbook 
#!/usr/bin/python3 
/usr/bin/python3 --version
Python 3.8.6
pip list | grep openshift
openshift                      0.11.2    
pip list | grep kubernetes
kubernetes                     11.0.0    

RHEL 7.9 (remote host), Python 2.7.5

STEPS TO REPRODUCE

Run the following playbook

---
- hosts: bastion
  user: cloud-user
  gather_facts: yes
  collections:
  - community.kubernetes.k8s
  tasks:
  - name: test
    k8s:
      api_version: v1
      kind: secret
      name: etcd-all-peer
      namespace: openshift-etcd
      verify_ssl: no
EXPECTED RESULTS

I would expect that the k8s query would be successful.

ACTUAL RESULTS
<52.176.93.70> (1, b'\r\n{"msg": "Failed to get client due to Host value http://localhost should start with https:// when talking to HTTPS endpoint", "failed": true, "exception": "WARNING: The below traceback may *not* be related to the actual failure.\\n  File \\"/tmp/ansible_k8s_payload_wRarGY/ansible_k8s_payload.zip/ansible_collections/community/kubernetes/plugins/module_utils/common.py\\", line 265, in get_api_client\\n    return DynamicClient(kubernetes.client.ApiClient(configuration))\\n  File \\"/usr/lib/python2.7/site-packages/openshift/dynamic/client.py\\", line 71, in __init__\\n    self.__discoverer = discoverer(self, cache_file)\\n  File \\"/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py\\", line 259, in __init__\\n    Discoverer.__init__(self, client, cache_file)\\n  File \\"/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py\\", line 31, in __init__\\n    self.__init_cache()\\n  File \\"/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py\\", line 78, in __init_cache\\n    self._load_server_info()\\n  File \\"/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py\\", line 165, in _load_server_info\\n    self.client.configuration.host)\\n", "invocation": {"module_args": {"force": false, "wait_sleep": 5, "verify_ssl": false, "apply": false, "client_key": null, "password": null, "namespace": "openshift-etcd", "resource_definition": null, "state": "present", "template": null, "api_key": null, "client_cert": null, "api_version": "v1", "username": null, "ca_cert": null, "merge_type": null, "wait_condition": null, "host": null, "wait_timeout": 120, "proxy": null, "validate": null, "persist_config": null, "wait": false, "append_hash": false, "src": null, "kind": "secret", "name": "etcd-all-peer", "kubeconfig": null, "context": null, "validate_certs": false}}}\r\n', b'Shared connection to 52.176.93.70 closed.\r\n')
<52.176.93.70> Failed to connect to the host via ssh: Shared connection to 52.176.93.70 closed.
<52.176.93.70> ESTABLISH SSH CONNECTION FOR USER: cloud-user
<52.176.93.70> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="cloud-user"' -o ConnectTimeout=10 -o ControlPath=/home/kwoodson/.ansible/cp/8014d22535 52.176.93.70 '/bin/sh -c '"'"'rm -f -r /home/cloud-user/.ansible/tmp/ansible-tmp-1606847726.3368785-70578-269377171109996/ > /dev/null 2>&1 && sleep 0'"'"''
<52.176.93.70> (0, b'', b'')
The full traceback is:
WARNING: The below traceback may *not* be related to the actual failure.
  File "/tmp/ansible_k8s_payload_wRarGY/ansible_k8s_payload.zip/ansible_collections/community/kubernetes/plugins/module_utils/common.py", line 265, in get_api_client
    return DynamicClient(kubernetes.client.ApiClient(configuration))
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/client.py", line 71, in __init__
    self.__discoverer = discoverer(self, cache_file)
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 259, in __init__
    Discoverer.__init__(self, client, cache_file)
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 31, in __init__
    self.__init_cache()
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 78, in __init_cache
    self._load_server_info()
  File "/usr/lib/python2.7/site-packages/openshift/dynamic/discovery.py", line 165, in _load_server_info
    self.client.configuration.host)
fatal: [52.176.93.70]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "append_hash": false,
            "apply": false,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "force": false,
            "host": null,
            "kind": "secret",
            "kubeconfig": null,
            "merge_type": null,
            "name": "etcd-all-peer",
            "namespace": "openshift-etcd",
            "password": null,
            "persist_config": null,
            "proxy": null,
            "resource_definition": null,
            "src": null,
            "state": "present",
            "template": null,
            "username": null,
            "validate": null,
            "validate_certs": false,
            "verify_ssl": false,
            "wait": false,
            "wait_condition": null,
            "wait_sleep": 5,
            "wait_timeout": 120
        }
    },
    "msg": "Failed to get client due to Host value http://localhost should start with https:// when talking to HTTPS endpoint"
}
tima commented 3 years ago

@kwoodson Have you upgraded community.kubernetes beyond what shipped with those Ansible distros, or maybe installed the kubernetes.core collection? I'm just verifying which version you are running here.

stanislaw55 commented 3 years ago

I think I know what the problem is: the currently unreleased version of community.kubernetes has a fix to work with version 12 and up of the Python kubernetes library (see #276). The last released version of community.kubernetes (1.1.1) does not contain this fix, so it breaks when used with version 12 and up of that library. The error looks exactly like the one I encountered with this buggy combination. The only workaround I came up with was to stick to kubernetes 11.0.0 and use the currently released community.kubernetes.

@resmo I've seen that the version of the Python kubernetes library on your local machine is 11.0.0. I suspect that the version of this library on the remote machine is 12.0.0.

The actual solution would be to release a new version of community.kubernetes ASAP, because it contains the relevant bugfix.
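The hypothesis above can be written down as a small check. This is a sketch only: the thresholds (kubernetes 12 and up, community.kubernetes 1.1.1 and earlier) come from this comment, and the function names are made up for illustration. It assumes any collection release after 1.1.1 carries the fix, which this thread does not confirm.

```python
def parse_version(text):
    """Turn '12.0.1' into (12, 0, 1); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def likely_affected(kubernetes_version, collection_version):
    """True if the pair matches the combination suspected in this comment."""
    new_client = parse_version(kubernetes_version) >= (12,)
    old_collection = parse_version(collection_version) <= (1, 1, 1)
    return new_client and old_collection


if __name__ == "__main__":
    print(likely_affected("12.0.1", "1.1.1"))
    print(likely_affected("11.0.0", "1.1.1"))
```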

kwoodson commented 3 years ago

@stanislaw55 Your assumption might be correct. Here are the versions:

[root@bastion ~]# rpm -qa | grep python2-kubernetes
python2-kubernetes-12.0.1-1.el7.noarch
[root@bastion ~]# rpm -qa | grep openshift
python2-openshift-0.11.2-1.el7.noarch
kwoodson commented 3 years ago

@stanislaw55 I was able to find a copy of the older rpm:

[root@bastion cloud-user]# rpm -qa | grep python2-kubernetes
python2-kubernetes-11.0.0-1.el7.noarch

This solved my problem.

goneri commented 3 years ago

@kwoodson can you confirm if the current git version of community.kubernetes works for you with 12.0.0?

ClementGautier commented 3 years ago

I confirm that python2-kubernetes-12.0.1-1.el7.noarch is causing the problem (I just tested on a server that had python2-kubernetes-11.0.0-2.el7.noarch, it worked, then I updated that precise package and reproduced this error).

kwoodson commented 3 years ago

@goneri if there is an easy way to test that the git version works, I'll be happy to test. I installed the collection from git, but it still fails; the underlying python2-kubernetes-12.0.1 seems to be causing the issue, not the collection itself. If I remove the rpm, the library is missing, and the collection doesn't provide it AFAIK.

goneri commented 3 years ago

@kwoodson, You can install the collection from git this way:

rm -r ~/.ansible/collections/ansible_collections/community/kubernetes/
git clone https://github.com/ansible-collections/community.kubernetes ~/.ansible/collections/ansible_collections/community/kubernetes/

You will still need python2-kubernetes-12.0.1.

kwoodson commented 3 years ago

@goneri

[root@bastion community.kubernetes]# rpm -qa | grep openshift
python2-openshift-0.11.2-1.el7.noarch
[root@bastion community.kubernetes]# rpm -qa | grep kubernetes
python2-kubernetes-12.0.1-1.el7.noarch
[root@bastion community.kubernetes]# ll ~/.ansible/collections/ansible_collections/community/kubernetes/
total 92
-rw-r--r--.  1 root root    36 Dec  2 18:09 bindep.txt
-rw-r--r--.  1 root root 12518 Dec  2 18:09 CHANGELOG.rst
drwxr-xr-x.  3 root root    64 Dec  2 18:09 changelogs
-rw-r--r--.  1 root root   107 Dec  2 18:09 codecov.yml
-rw-r--r--.  1 root root  3278 Dec  2 18:09 CONTRIBUTING.md
-rw-r--r--.  1 root root   857 Dec  2 18:09 galaxy.yml
...

Still receiving this error with the latest collection:

TASK [fetch etcd credentials] *******************************************************************************************************************
fatal: [52.165.158.15]: FAILED! => {"changed": false, "msg": "Failed to get client due to Host value http://localhost should start with https:// when talking to HTTPS endpoint"}
stanislaw55 commented 3 years ago

@kwoodson So just to sum up:

Akasurde commented 3 years ago

https://github.com/openshift/openshift-restclient-python/issues/389

fabianvf commented 3 years ago

I think the modules will need to be updated to be compatible with these changes in kubernetes 12.x; by the time the dynamic client is created, the configuration has already been loaded.
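To illustrate the kind of incompatibility being described, here is a toy model. It assumes the 12.x change is that constructing a configuration object directly no longer picks up values loaded earlier from the kubeconfig, which would explain the http://localhost host in the error. The two Toy classes are hypothetical stand-ins, not the kubernetes client. get_default_copy is modeled on a classmethod of the same name that I believe exists on the 12.x Configuration class; treat that as an assumption.

```python
import copy


class ToyConfigurationV11(object):
    """Stand-in for the older behavior: the bare constructor copies the loaded default."""
    _default = None

    def __init__(self):
        if self._default is not None:
            self.__dict__.update(self._default.__dict__)
        else:
            self.host = "http://localhost"

    @classmethod
    def set_default(cls, config):
        cls._default = copy.copy(config)


class ToyConfigurationV12(object):
    """Stand-in for the newer behavior: the bare constructor ignores the loaded default."""
    _default = None

    def __init__(self):
        self.host = "http://localhost"

    @classmethod
    def set_default(cls, config):
        cls._default = copy.copy(config)

    @classmethod
    def get_default_copy(cls):
        return copy.copy(cls._default) if cls._default is not None else cls()


def default_configuration(configuration_cls):
    """Fetch the loaded configuration in a way that works with both styles."""
    getter = getattr(configuration_cls, "get_default_copy", None)
    return getter() if getter is not None else configuration_cls()


if __name__ == "__main__":
    for cls in (ToyConfigurationV11, ToyConfigurationV12):
        loaded = cls()
        loaded.host = "https://api.example.test:6443"  # placeholder host
        cls.set_default(loaded)
        print(cls.__name__, cls().host, default_configuration(cls).host)
```

With the second toy class, the bare constructor keeps returning http://localhost even after a default has been set, while default_configuration returns the loaded host in both cases.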

StevenBarre commented 3 years ago

https://github.com/kubernetes-client/python/issues/1333 appears to be related to this

Akasurde commented 3 years ago

Update on 10 Feb 2021 -

Ansible   Kubernetes-python   Openshift   Works
2.9.17    11.0.0              0.11.2      Yes
2.9.17    12.0.1              0.11.2      Yes
2.10.5    11.0.0              0.11.2      Yes
2.10.5    12.0.1              0.11.2      Yes