ansible-collections / community.kubernetes

Kubernetes Collection for Ansible
https://galaxy.ansible.com/community/kubernetes
GNU General Public License v3.0

Ansible fails to connect to Kubernetes cluster #365

Closed foroozf001 closed 3 years ago

foroozf001 commented 3 years ago
SUMMARY

Running an Ansible playbook against my cluster fails with a connection error.

ISSUE TYPE
COMPONENT NAME
CONFIGURATION
ffo@ffo-ThinkPad-T490:~/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible$ ansible-config dump --only-changed
COLOR_VERBOSE(/home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/ansible.cfg) = cyan
OS / ENVIRONMENT
STEPS TO REPRODUCE

I cannot establish any connection from Ansible to my cluster, so all use cases fail.

---
- hosts: local
  vars:
    #working_dir_path: "/home/vsts/work/r1/a/_Nieuwsapp-CI/drop/Terraform" # "/home/vsts/work/1/s/"
    working_dir_path: "~/GitRepos/NC%20Infrastructure/T8Cluster" # Ubuntu18 build-agent mounts to: /home/vsts/work/1/s/
    working_dir_env: "prod"
    dns_label_name: "team8-prod-label"
    load_balancer_resource_group: "nc-team8-prod-rg"
    managed_identity_name: "nc-team8-prod-identity"
    loadbalancer_ip: 
    environment_1: "prod"
    # environment_2: "acc"
  collections:
    - community.kubernetes
  roles:
    - k8s
  environment:
    KUBECONFIG: /home/ffo/.kube/config
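
The play above sets KUBECONFIG explicitly, and the log further down shows the tasks running as root, so it can be worth confirming that the path resolves to a file the executing user can read. The following is a minimal sketch, not the collection's actual lookup code: the helper names `resolve_kubeconfig` and `check_kubeconfig` are hypothetical, and the lookup order (explicit value, then the KUBECONFIG variable, then ~/.kube/config) is a simplified assumption.

```python
import os


def resolve_kubeconfig(explicit=None, environ=None, home=None):
    """Pick a kubeconfig path: explicit value, then KUBECONFIG, then ~/.kube/config.

    This order is a simplified assumption for illustration only.
    """
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    if explicit:
        return explicit
    from_env = environ.get("KUBECONFIG")
    if from_env:
        # KUBECONFIG may list several paths separated by os.pathsep; use the first.
        return from_env.split(os.pathsep)[0]
    return os.path.join(home, ".kube", "config")


def check_kubeconfig(path):
    """Return 'missing', 'unreadable' or 'ok' for the given path."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    return "ok"


path = resolve_kubeconfig()
print(path, check_kubeconfig(path))
```

Running this once as the normal user and once under sudo shows whether the two invocations resolve to different files. Without KUBECONFIG set, the fallback depends on the invoking user's home directory.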

Hosts-file

[local]
127.0.0.1

[local:vars]
ansible_connection=local
ansible_python_interpreter=/usr/bin/python3
EXPECTED RESULTS

I have used Ansible with community.kubernetes before without fail. I expect to connect to my cluster.

ACTUAL RESULTS
ffo@ffo-ThinkPad-T490:~/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible$ sudo ansible-playbook playbook.main.yaml -i hosts -vvv
ansible-playbook 2.10.5
  config file = /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.8/dist-packages/ansible
  executable location = /usr/local/bin/ansible-playbook
  python version = 3.8.5 (default, Jul 28 2020, 12:59:40) [GCC 9.3.0]
Using /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/ansible.cfg as config file
host_list declined parsing /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/hosts as it did not pass its verify_file() method
script declined parsing /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/hosts as it did not pass its verify_file() method
auto declined parsing /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/hosts as it did not pass its verify_file() method
Parsed /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/hosts inventory source with ini plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.

PLAYBOOK: playbook.main.yaml **********************************************************************************************************
1 plays in playbook.main.yaml

PLAY [local] **************************************************************************************************************************

TASK [Gathering Facts] ****************************************************************************************************************
task path: /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/playbook.main.yaml:2
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287 `" && echo ansible-tmp-1612900940.2483833-15613-50505908905287="` echo /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287 `" ) && sleep 0'
Using module file /usr/local/lib/python3.8/dist-packages/ansible/modules/setup.py
<127.0.0.1> PUT /root/.ansible/tmp/ansible-local-1560990_r7hk1/tmp_ere8giq TO /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287/AnsiballZ_setup.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287/ /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287/AnsiballZ_setup.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'KUBECONFIG=/home/ffo/.kube/config /usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287/AnsiballZ_setup.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1612900940.2483833-15613-50505908905287/ > /dev/null 2>&1 && sleep 0'
ok: [127.0.0.1]
META: ran handlers

TASK [k8s : Create a k8s namespace] ***************************************************************************************************
task path: /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/roles/k8s/tasks/main.yaml:9
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885 `" && echo ansible-tmp-1612900944.1054797-16010-184447342807885="` echo /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885 `" ) && sleep 0'
Using module file /home/ffo/Repositories/NC%20Infrastructure/T8Cluster/prod/ansible/collections/ansible_collections/community/kubernetes/plugins/modules/k8s.py
<127.0.0.1> PUT /root/.ansible/tmp/ansible-local-1560990_r7hk1/tmp9boao4_b TO /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885/AnsiballZ_k8s.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885/ /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885/AnsiballZ_k8s.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'KUBECONFIG=/home/ffo/.kube/config /usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885/AnsiballZ_k8s.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1612900944.1054797-16010-184447342807885/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
  File "/tmp/ansible_community.kubernetes.k8s_payload_8sss4pq0/ansible_community.kubernetes.k8s_payload.zip/ansible_collections/community/kubernetes/plugins/module_utils/common.py", line 265, in get_api_client
    return DynamicClient(kubernetes.client.ApiClient(configuration))
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/client.py", line 71, in __init__
    self.__discoverer = discoverer(self, cache_file)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 259, in __init__
    Discoverer.__init__(self, client, cache_file)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 31, in __init__
    self.__init_cache()
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 78, in __init_cache
    self._load_server_info()
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 158, in _load_server_info
    'kubernetes': self.client.request('get', '/version', serializer=just_json)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/client.py", line 42, in inner
    resp = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/client.py", line 235, in request
    return self.client.call_api(
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 373, in request
    return self.rest_client.GET(url,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py", line 239, in GET
    return self.request("GET", url,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py", line 212, in request
    r = self.pool_manager.request(method, url,
  File "/usr/lib/python3/dist-packages/urllib3/request.py", line 75, in request
    return self.request_encode_url(
  File "/usr/lib/python3/dist-packages/urllib3/request.py", line 97, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/usr/lib/python3/dist-packages/urllib3/poolmanager.py", line 330, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 747, in urlopen
    return self.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 747, in urlopen
    return self.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 747, in urlopen
    return self.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "append_hash": false,
            "apply": false,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "force": false,
            "host": null,
            "kind": "Namespace",
            "kubeconfig": null,
            "merge_type": null,
            "name": "testing",
            "namespace": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "resource_definition": null,
            "src": null,
            "state": "present",
            "template": null,
            "username": null,
            "validate": null,
            "validate_certs": null,
            "wait": false,
            "wait_condition": null,
            "wait_sleep": 5,
            "wait_timeout": 120
        }
    },
    "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9df4257b20>: Failed to establish a new connection: [Errno 111] Connection refused'))"
}
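
The message above targets host='localhost', port=80 rather than the cluster's API server. As far as I know, http://localhost is the default host of the kubernetes Python client's Configuration object, which suggests that no kubeconfig was applied to the client. The sketch below extracts the target from such a message and flags that case. The helper names are hypothetical and the default endpoint is stated as an assumption.

```python
import re

# Assumed default endpoint of the kubernetes Python client when no
# configuration has been loaded.
DEFAULT_HOST, DEFAULT_PORT = "localhost", 80

# Matches the urllib3 pool description embedded in the error message.
POOL_RE = re.compile(r"HTTPS?ConnectionPool\(host='([^']+)', port=(\d+)\)")


def parse_pool_target(msg):
    """Return (host, port) from a urllib3 connection-pool error, or None."""
    match = POOL_RE.search(msg)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def looks_like_unloaded_kubeconfig(msg):
    """True when the failed request went to the assumed default endpoint."""
    return parse_pool_target(msg) == (DEFAULT_HOST, DEFAULT_PORT)


msg = (
    "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): "
    "Max retries exceeded with url: /version"
)
print(parse_pool_target(msg))
print(looks_like_unloaded_kubeconfig(msg))
```

A match here only hints at a configuration-loading problem. It does not rule out a cluster that really is served on localhost port 80.
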
tima commented 3 years ago

@foroozf001 Can you verify which version of the kubernetes Python library you are now using there? This sounds like #314, where kubernetes 12.0 causes previously working code to break.

foroozf001 commented 3 years ago

@tima I have tried both 11.0.0 and 12.0.1.

The problem I had with 11.0.0 was my RBAC. I had to create a clusterrole binding like so:

kubectl create clusterrolebinding serviceaccounts-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts

This is the stack trace for 12.0.1:

The full traceback is:
  File "/tmp/ansible_community.kubernetes.k8s_payload__r_yk6ni/ansible_community.kubernetes.k8s_payload.zip/ansible_collections/community/kubernetes/plugins/module_utils/common.py", line 265, in get_api_client
    return DynamicClient(kubernetes.client.ApiClient(configuration))
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/client.py", line 71, in __init__
    self.__discoverer = discoverer(self, cache_file)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 259, in __init__
    Discoverer.__init__(self, client, cache_file)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 31, in __init__
    self.__init_cache()
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 78, in __init_cache
    self._load_server_info()
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/discovery.py", line 158, in _load_server_info
    'kubernetes': self.client.request('get', '/version', serializer=just_json)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/client.py", line 42, in inner
    resp = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/openshift/dynamic/client.py", line 235, in request
    return self.client.call_api(
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 373, in request
    return self.rest_client.GET(url,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py", line 239, in GET
    return self.request("GET", url,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py", line 212, in request
    r = self.pool_manager.request(method, url,
  File "/usr/lib/python3/dist-packages/urllib3/request.py", line 75, in request
    return self.request_encode_url(
  File "/usr/lib/python3/dist-packages/urllib3/request.py", line 97, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/usr/lib/python3/dist-packages/urllib3/poolmanager.py", line 330, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 747, in urlopen
    return self.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 747, in urlopen
    return self.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 747, in urlopen
    return self.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "append_hash": false,
            "apply": false,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "force": false,
            "host": null,
            "kind": "Namespace",
            "kubeconfig": null,
            "merge_type": null,
            "name": "testing",
            "namespace": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "resource_definition": null,
            "src": null,
            "state": "present",
            "template": null,
            "username": null,
            "validate": null,
            "validate_certs": null,
            "wait": false,
            "wait_condition": null,
            "wait_sleep": 5,
            "wait_timeout": 120
        }
    },
    "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd75aca4b20>: Failed to establish a new connection: [Errno 111] Connection refused'))"
}
tima commented 3 years ago

@foroozf001: OK so maybe not #314. Thought we could take a shortcut figuring out the root of your problem.

We will need more info to try and reproduce this.

Akasurde commented 3 years ago

I am not able to reproduce this with the following versions:

pip list | egrep 'kubernetes|openshift|ansible'
ansible             2.10.7
ansible-base        2.10.5
kubernetes          12.0.1
openshift           0.11.2
Akasurde commented 3 years ago

You are using community.kubernetes 1.1.0, which reproduces this behavior:

# ansible-galaxy collection list

# /home/akasurde/collections/ansible_collections
Collection           Version
-------------------- -------
community.kubernetes 1.1.0
TASK [k8s_info] *******************************************************************************
task path: /tmp/test_k8s/k8s_info.yml:4
redirecting (type: modules) ansible.builtin.k8s_info to community.kubernetes.k8s_info
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10ff0cd60>: Failed to establish a new connection: [Errno 61] Connection refused'))"}

Please remove the older community.kubernetes collection and upgrade to the latest version, i.e. 1.1.1.
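
Because a collection can be installed in more than one search path, an older copy may still be the one a playbook picks up. The traceback in the original report loads k8s.py from a collections directory next to the playbook. The sketch below scans `ansible-galaxy collection list` output for every install of community.kubernetes and reports those older than 1.1.1. The function names are hypothetical, and the parser assumes plain numeric versions and the listing layout shown above.

```python
def parse_version(text):
    """Turn '1.1.0' into (1, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_collection_versions(listing, name="community.kubernetes"):
    """Yield (search_path, version) for each install of `name` in the listing."""
    current_path = None
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith("# /"):
            # Header line naming the collection search path.
            current_path = line[2:]
        elif line.split()[:1] == [name]:
            yield current_path, line.split()[1]


def outdated_installs(listing, minimum="1.1.1"):
    """Return installs whose version is lower than `minimum`."""
    floor = parse_version(minimum)
    return [
        (path, version)
        for path, version in find_collection_versions(listing)
        if parse_version(version) < floor
    ]


sample = """\
# /home/akasurde/collections/ansible_collections
Collection           Version
-------------------- -------
community.kubernetes 1.1.0
"""
print(outdated_installs(sample))
```

Feeding it the unfiltered listing, captured from the playbook directory and as the user that runs the play, covers playbook-adjacent installs that a grep from another directory could miss.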

foroozf001 commented 3 years ago

@foroozf001: OK so maybe not #314. Thought we could take a shortcut figuring out the root of your problem.

We will need more info to try and reproduce this.

  • You said "I have used Ansible with community.kubernetes before without fail. I expect to connect to my cluster." When did it start failing? What changed? Did you upgrade the collection or ansible itself?

I upgraded from Ansible 2.9 to 2.10. I have upgraded the collection as well.

  • What version of kubernetes.core/community.kubernetes are you running? You can use ansible-galaxy collection list to check the version you are running.
ffo@ffo-ThinkPad-T490:~$ ansible-galaxy collection list | grep kubernetes
community.kubernetes      1.1.1  
  • Is the example playbook under "Steps To Reproduce" the same as playbook.main.yaml? The example output doesn't seem to line up.

Yes it is the same playbook.

  • What is in the k8s role? What is the complete task declaration that causes the error?

The k8s role is simply this:

- name: Create a k8s namespace
  community.kubernetes.k8s:
    name: testing
    api_version: v1
    kind: Namespace
    state: present 
Akasurde commented 3 years ago

@foroozf001 Are you still facing this issue after upgrading?