ansible-collections / community.kubernetes

Kubernetes Collection for Ansible
https://galaxy.ansible.com/community/kubernetes
GNU General Public License v3.0

Ansible based operator on openshift 4.3 fails with error "urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version" #283

Closed rahulagrawalpsl closed 3 years ago

rahulagrawalpsl commented 3 years ago
SUMMARY

We have suddenly started facing this error while creating an instance with our Ansible-based operator. The error comes from an Ansible role when it tries to check whether an application pod already exists on the cluster. It was working perfectly fine until about 10 days ago, and nothing has changed in the operator code for about a month.

ISSUE TYPE
COMPONENT NAME

k8s_info

ANSIBLE VERSION
quay.io/operator-framework/ansible-operator:v0.15.0
OS / ENVIRONMENT

Openshift container platform 4.3

STEPS TO REPRODUCE
- name: check if admin-dash pod already exists
  vars:
    appName: admin-dash
  k8s_info:
    kind: Pod
    namespace: "{{ meta.namespace }}"
    label_selectors: "app={{ appName }}"
  register: admindash_status

- name: set isReconcile=true
  set_fact:
    isReconcile: true
  when: admindash_status.resources[0].status.phase is defined and admindash_status.resources[0].status.phase == 'Running'

- name: set isReconcile=false
  set_fact:
    isReconcile: false
  when: admindash_status.resources[0].status.phase is not defined or admindash_status.resources[0].status.phase != 'Running'

- name: debug
  debug:
    msg: "isReconcile = {{ isReconcile }}"
EXPECTED RESULTS

Expected ansible to get the pod status correctly from the cluster.

ACTUAL RESULTS

The error was: urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3bfc02b0b8>: Failed to establish a new connection: [Errno 111] Connection refused',)) fatal: [localhost]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.6/site-packages/urllib3/connection.py\", line 160, in _new_conn\n (self._dns_host, self.port), self.timeout, **extra_kw\n File \"/usr/local/lib/python3.6/site-packages/urllib3/util/connection.py\", line 84, in create_connection\n raise err\n File \"/usr/local/lib/python3.6/site-packages/urllib3/util/connection.py\", line 74, in create_connection\n sock.connect(sa)\nConnectionRefusedError: [Errno 111] Connection refused\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py\", line 677, in urlopen\n

geerlingguy commented 3 years ago

If you haven't changed the operator code or built a new version of the operator in a while, I doubt the issue has anything to do with the collection.

It looks like something in the cluster may have changed, as the Kubernetes API seems to be continually giving back "Connection refused".

jaydesl commented 3 years ago

See #273 and the PR that resolves it #276

In the meantime, this can be resolved in the playbook by pinning the kubernetes-client version (kubernetes<12.0.0)
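
To make the pin concrete, here is a minimal Python sketch of the comparison it expresses: a client version is acceptable only if it sorts strictly below 12.0.0. The helper names `parse_version` and `satisfies_pin` are hypothetical, and the parsing is deliberately simplified (it is not a full PEP 440 implementation).

```python
def parse_version(text):
    """Turn a dotted version such as '11.0.0' into a tuple of ints.

    Simplification (assumption): anything after the leading digits of a
    component is ignored, so '12.0.0a1' is read as (12, 0, 0).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies_pin(version, upper_bound="12.0.0"):
    """Return True when `version` is strictly below `upper_bound`."""
    return parse_version(version) < parse_version(upper_bound)
```

With this sketch, `satisfies_pin("11.0.0")` is true and `satisfies_pin("12.0.0")` is false, which matches the intent of `kubernetes<12.0.0`.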

rahulagrawalpsl commented 3 years ago

@geerlingguy @jaydesl thanks for the response.

I have a small doubt here. When I check the kubectl version on OpenShift, I don't see any version details. I was expecting to see 12.0.0, as mentioned in the PR you shared. Did I get it wrong?

# kubectl version --client
Client Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"$Format:%H$", GitTreeState:"", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

# oc version
Client Version: 4.3.13
Server Version: 4.5.13
Kubernetes Version: v1.18.3+47c0e71

Also, could you please advise how I can configure the kubernetes-client version in an Ansible playbook/role?

Akasurde commented 3 years ago

@rahulagrawalpsl The version kubernetes<12.0.0 that @jaydesl mentioned is for the Kubernetes Python client library. The output above is from the kubectl and oc commands. Please run pip list | grep kubernetes.

Mine is -

# pip list | grep kuber
kubernetes                         11.0.0

rahulagrawalpsl commented 3 years ago

Thanks @Akasurde. I don't have pip on my machine though. Let me install it and check.

btw, could you please advise how to define kubernetes-client version in ansible playbook/role?

Akasurde commented 3 years ago

@rahulagrawalpsl The Kubernetes collection uses kubernetes-client directly, and you cannot define the version in the playbook. You may need to check pip on the machine where you are running Ansible (localhost in your case).
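
Where pip is not available, the installed client version can also be read from Python itself. The following is a minimal sketch, assuming it is run with the same interpreter that Ansible uses; the function name `installed_client_version` is hypothetical. It relies on the `kubernetes` package exposing a `__version__` attribute and returns None when the package cannot be imported.

```python
import importlib


def installed_client_version(module_name="kubernetes"):
    """Return the module's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed for this interpreter.
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = installed_client_version()
    print("kubernetes client:", version or "not installed")
```

Running this inside the operator container, rather than on a workstation, reports the version the operator's Ansible modules actually load.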

rahulagrawalpsl commented 3 years ago

@Akasurde Thanks. Since Ansible is being used by the operator here, I am wondering whether this issue is on the OpenShift cluster or in the operator. In other words, do we need to check the kubernetes client on the OpenShift cluster itself or within the operator container?

Our operator image is using this ansible operator image as base -

quay.io/operator-framework/ansible-operator:v0.15.0

Could it be the version issue in ansible operator 0.15.0 ?

rahulagrawalpsl commented 3 years ago

I don't actually see a kubernetes version when I run "pip list". I see 12.0.0 when I run the same in the operator container.

rahulagrawalpsl commented 3 years ago

1) Is it okay to use version 11.0.0? Is there any drawback to using a lower version?
2) How would we know when the issue is fixed in the latest version of kubernetes, so that we can use the latest in our image?
3) What's the best way to find out which version of kubernetes is used in a given version of ansible operator? (Here is the one we are using currently: quay.io/operator-framework/ansible-operator:v0.15.0)

Akasurde commented 3 years ago

1. Is it okay to use 11.0.0 version, is there any drawback of using lower version?

I think yes. There is no drawback of using a lower version as such. Not that I am aware of.

2. How would we know when the issue is fixed with latest version of kubernetes, so that we can use the latest in our image ?

I think this change https://github.com/kubernetes-client/python/commit/b4d11b02a3479e63957a729614a616002f13e9c4#diff-59aff6ce4d28aa662f8b411b9d0dfe4f3e949c32a5edaf8e08905b58e7a41ee3L69-R71 caused the regression. It is being tracked under https://github.com/kubernetes-client/python/issues/1284. So I think once this is fixed we will be good to use v12.0.0, but I would suggest staying with version 11 or another stable version.
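
One rough way to tell whether a newer client still shows the problem is to look for the signature from this report in the module's stderr: the client targeting plain HTTP on localhost port 80 instead of the cluster API. The sketch below only matches that text. The function name `hits_localhost_fallback` is hypothetical, and it assumes the regression always surfaces with this exact message, which this thread does not confirm.

```python
import re

# Signature copied from the error in this report; treating it as the
# marker of the regression is an assumption, not a documented guarantee.
_FALLBACK = re.compile(r"HTTPConnectionPool\(host='localhost', port=80\)")


def hits_localhost_fallback(stderr_text):
    """Return True if the captured stderr shows the localhost:80 symptom."""
    return bool(_FALLBACK.search(stderr_text))
```

A run whose stderr no longer matches is a hint, not proof, that the client picked up the real cluster configuration.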

3. What's the best way to find out what version of kubernetes is being used in any version of ansible operator ?
   ( here is the one we are using currently - quay.io/operator-framework/ansible-operator:v0.15.0)

I am not the correct guy to answer this.

tima commented 3 years ago
3. What's the best way to find out what version of kubernetes is being used in any version of ansible operator? (here is the one we are using currently - quay.io/operator-framework/ansible-operator:v0.15.0)

Like 2, this concerns an independent project that this one only depends on. You'll have to go over to the operator-sdk project, where it is developed and managed: https://github.com/operator-framework/operator-sdk