krkn-chaos / cerberus

Guardian of Kubernetes clusters. A tool to monitor cluster health and signal or alert on failures.
Apache License 2.0

Cerberus not working with private clusters #183

Closed paigerube14 closed 2 years ago

paigerube14 commented 2 years ago

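Cerberus needs a way to pass proxy settings to the Kubernetes Python client when a proxy is configured. Without this, Cerberus cannot reach the API server of a private cluster and fails at startup with the error below.

A fix is outlined in this issue: https://github.com/kubernetes-client/python/issues/333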
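
For illustration, here is a minimal sketch of how a proxy could be handed to the Kubernetes Python client. It is not the change that was merged into Cerberus. The helper names (`resolve_proxy`, `apply_proxy`, `build_core_v1_client`) are hypothetical, and reading the proxy from the `HTTPS_PROXY`/`HTTP_PROXY` environment variables is an assumption. The `proxy` attribute on `client.Configuration` is part of the client library.

```python
import os
from typing import Mapping, Optional


def resolve_proxy(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty proxy URL found in the given environment.

    Assumption: the proxy is supplied through the conventional
    HTTPS_PROXY / HTTP_PROXY variables (upper or lower case).
    """
    for name in ("HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"):
        value = env.get(name)
        if value:
            return value
    return None


def apply_proxy(configuration, env: Mapping[str, str] = os.environ):
    """Set `configuration.proxy` when a proxy is defined; otherwise leave it alone."""
    proxy = resolve_proxy(env)
    if proxy:
        configuration.proxy = proxy
    return configuration


def build_core_v1_client():
    """Hypothetical helper: build a CoreV1Api client that honours the proxy."""
    # Imported lazily so the helpers above work without the kubernetes package.
    from kubernetes import client, config

    config.load_kube_config()  # reads the kubeconfig and sets the default configuration
    configuration = client.Configuration.get_default_copy()
    apply_proxy(configuration)
    return client.CoreV1Api(client.ApiClient(configuration))
```

With `HTTPS_PROXY` exported in the shell that launches Cerberus, a client built this way would send its requests (such as `list_namespace`) through the proxy instead of trying to resolve the private API hostname directly.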

09-27 10:49:49.141  + python start_cerberus.py --config cerberus.yaml
09-27 10:49:49.396  2022-09-27 14:49:49,330 [INFO] Starting ceberus
09-27 10:49:49.396  2022-09-27 14:49:49,394 [INFO] Initializing client to talk to the Kubernetes cluster
09-27 10:49:49.687  2022-09-27 14:49:49,435 [WARNING] Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7ff577059e50>: Failed to establish a new connection: [Errno -2] Name or service not known')': /api/v1/namespaces?pretty=True&limit=100
09-27 10:49:49.687  2022-09-27 14:49:49,440 [WARNING] Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7ff577059f40>: Failed to establish a new connection: [Errno -2] Name or service not known')': /api/v1/namespaces?pretty=True&limit=100
09-27 10:49:49.687  2022-09-27 14:49:49,445 [WARNING] Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7ff5770d0250>: Failed to establish a new connection: [Errno -2] Name or service not known')': /api/v1/namespaces?pretty=True&limit=100
09-27 10:49:49.687                 _                         
09-27 10:49:49.687    ___ ___ _ __| |__   ___ _ __ _   _ ___ 
09-27 10:49:49.687   / __/ _ \ '__| '_ \ / _ \ '__| | | / __|
09-27 10:49:49.687  | (_|  __/ |  | |_) |  __/ |  | |_| \__ \
09-27 10:49:49.687   \___\___|_|  |_.__/ \___|_|   \__,_|___/
09-27 10:49:49.687                                           
09-27 10:49:49.687  
09-27 10:49:49.687  Traceback (most recent call last):
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
09-27 10:49:49.687      conn = connection.create_connection(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
09-27 10:49:49.687      for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
09-27 10:49:49.687    File "/usr/local/lib/python3.9/socket.py", line 953, in getaddrinfo
09-27 10:49:49.687      for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
09-27 10:49:49.687  socket.gaierror: [Errno -2] Name or service not known
09-27 10:49:49.687  
09-27 10:49:49.687  During handling of the above exception, another exception occurred:
09-27 10:49:49.687  
09-27 10:49:49.687  Traceback (most recent call last):
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
09-27 10:49:49.687      httplib_response = self._make_request(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
09-27 10:49:49.687      self._validate_conn(conn)
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
09-27 10:49:49.687      conn.connect()
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connection.py", line 358, in connect
09-27 10:49:49.687      self.sock = conn = self._new_conn()
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
09-27 10:49:49.687      raise NewConnectionError(
09-27 10:49:49.687  urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7ff5770d0370>: Failed to establish a new connection: [Errno -2] Name or service not known
09-27 10:49:49.687  
09-27 10:49:49.687  During handling of the above exception, another exception occurred:
09-27 10:49:49.687  
09-27 10:49:49.687  Traceback (most recent call last):
09-27 10:49:49.687    File "start_cerberus.py", line 563, in <module>
09-27 10:49:49.687      main(options.cfg)
09-27 10:49:49.687    File "start_cerberus.py", line 118, in main
09-27 10:49:49.687      sdn_namespace = kubecli.check_sdn_namespace()
09-27 10:49:49.687    File "cerberus/kubernetes/client.py", line 167, in check_sdn_namespace
09-27 10:49:49.687      namespaces = list_namespaces()
09-27 10:49:49.687    File "cerberus/kubernetes/client.py", line 74, in list_namespaces
09-27 10:49:49.687      ret_overall = list_continue_helper(cli.list_namespace, pretty=True, limit=request_chunk_size)
09-27 10:49:49.687    File "cerberus/kubernetes/client.py", line 36, in list_continue_helper
09-27 10:49:49.687      ret = func(*args, **keyword_args)
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 14721, in list_namespace
09-27 10:49:49.687      return self.list_namespace_with_http_info(**kwargs)  # noqa: E501
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 14828, in list_namespace_with_http_info
09-27 10:49:49.687      return self.api_client.call_api(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 348, in call_api
09-27 10:49:49.687      return self.__call_api(resource_path, method,
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
09-27 10:49:49.687      response_data = self.request(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 373, in request
09-27 10:49:49.687      return self.rest_client.GET(url,
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/rest.py", line 240, in GET
09-27 10:49:49.687      return self.request("GET", url,
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/kubernetes/client/rest.py", line 213, in request
09-27 10:49:49.687      r = self.pool_manager.request(method, url,
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/request.py", line 74, in request
09-27 10:49:49.687      return self.request_encode_url(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url
09-27 10:49:49.687      return self.urlopen(method, url, **extra_kw)
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/poolmanager.py", line 376, in urlopen
09-27 10:49:49.687      response = conn.urlopen(method, u.request_uri, **kw)
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen
09-27 10:49:49.687      return self.urlopen(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen
09-27 10:49:49.687      return self.urlopen(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 815, in urlopen
09-27 10:49:49.687      return self.urlopen(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
09-27 10:49:49.687      retries = retries.increment(
09-27 10:49:49.687    File "cerberus_jenkins/venv3/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
09-27 10:49:49.688      raise MaxRetryError(_pool, url, error or ResponseError(cause))
09-27 10:49:49.688  urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.**.qe.gcp.devcluster.openshift.com', port=6443): Max retries exceeded with url: /api/v1/namespaces?pretty=True&limit=100 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7ff5770d0370>: Failed to establish a new connection: [Errno -2] Name or service not known'))
chaitanyaenr commented 2 years ago

Fixed by #184. Closing the issue. Thanks.