openrca / orca

Root Cause Analysis for Kubernetes
https://openrca.io
Apache License 2.0

Failed to perform API request to Prometheus #38

Closed aleksandra-galara closed 4 years ago

aleksandra-galara commented 4 years ago

The following error appears in the logs of the orca pod:

11:09:32 probe.1 | During handling of the above exception, another exception occurred:
11:09:32 probe.1 |
11:09:32 probe.1 | Traceback (most recent call last):
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/cotyledon/_utils.py", line 95, in exit_on_exception
11:09:32 probe.1 |     yield
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/cotyledon/_service.py", line 139, in _run
11:09:32 probe.1 |     self.run()
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/topology/probe.py", line 50, in run
11:09:32 probe.1 |     probe.run()
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/topology/probe.py", line 103, in run
11:09:32 probe.1 |     self._synchronize()
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/topology/probe.py", line 110, in _synchronize
11:09:32 probe.1 |     upstream_nodes = self._get_upstream_nodes()
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/topology/probe.py", line 118, in _get_upstream_nodes
11:09:32 probe.1 |     entities = self._upstream_proxy.get_all()
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/topology/alerts/prometheus/upstream.py", line 23, in get_all
11:09:32 probe.1 |     return self._client.get_alerts()['data']['alerts']
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/common/clients/prometheus/client.py", line 23, in get_alerts
11:09:32 probe.1 |     return self._connector.get("alerts")
11:09:32 probe.1 |   File "/usr/local/lib/python3.7/site-packages/orca/common/clients/rest_client.py", line 27, in get
11:09:32 probe.1 |     raise exceptions.APIClientError(reason=str(ex))
11:09:32 probe.1 | orca.common.clients.exceptions.APIClientError: Failed to perform API request: HTTPConnectionPool(host='prometheus-prometheus-oper-prometheus.monitoring', port=9090): Max retries exceeded with url: /api/v1/alerts (Caused by NewConnectionError($
<urllib3.connection.HTTPConnection object at 0x7f231e2ae890>: Failed to establish a new connection: [Errno -3] Try again')).

bzurkowski commented 4 years ago

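It seems that the Prometheus service is either not installed correctly or the configured host address is invalid. The "[Errno -3] Try again" at the end of the traceback is a name resolution failure, so the orca pod could not resolve the host name prometheus-prometheus-oper-prometheus.monitoring.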
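
The resolution failure can be checked outside orca. Below is a minimal Python sketch, not part of orca, that tries to resolve a host name and reports whether the lookup failed temporarily (EAI_AGAIN, which is -3 on Linux) or the name is unknown. The function name diagnose_resolution and its injectable resolver parameter are assumptions made for illustration.

```python
import socket


def diagnose_resolution(host, port, resolver=socket.getaddrinfo):
    """Classify whether `host` can be resolved (hypothetical helper)."""
    try:
        resolver(host, port)
    except socket.gaierror as ex:
        # EAI_AGAIN is a temporary name resolution failure ("Try again").
        if ex.errno == socket.EAI_AGAIN:
            return "temporary DNS failure"
        # EAI_NONAME means the name or service is not known.
        if ex.errno == socket.EAI_NONAME:
            return "unknown host"
        return "resolution error: {}".format(ex)
    return "resolves"
```

Calling diagnose_resolution("prometheus-prometheus-oper-prometheus.monitoring", 9090) from a pod in the same cluster should indicate whether the name resolves there. The result depends on the cluster's DNS setup.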

Please check the address of the Prometheus service:

$ kubectl -n monitoring get svc
NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   33d
prometheus-grafana                        ClusterIP   10.233.63.92    <none>        80/TCP                       33d
prometheus-kube-state-metrics             ClusterIP   10.233.16.9     <none>        8080/TCP                     33d
prometheus-operated                       ClusterIP   None            <none>        9090/TCP                     33d
prometheus-prometheus-node-exporter       ClusterIP   10.233.32.79    <none>        9100/TCP                     33d
prometheus-prometheus-oper-alertmanager   ClusterIP   10.233.0.134    <none>        9093/TCP                     33d
prometheus-prometheus-oper-operator       ClusterIP   10.233.57.250   <none>        8080/TCP,443/TCP             33d
prometheus-prometheus-oper-prometheus     ClusterIP   10.233.2.151    <none>        9090/TCP                     33d
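
Inside the cluster, a service is addressed as <service>.<namespace>:<port>. As a rough sketch, the Python helper below scans the plain-text output of kubectl get svc and lists URLs for every service that exposes a given port. The function name candidate_urls and its parameters are hypothetical, and parsing the text table is a simplification.

```python
def candidate_urls(table_text, namespace, port=9090, scheme="http"):
    """List in-cluster URLs for services exposing `port` (hypothetical helper).

    `table_text` is the plain-text output of `kubectl -n <namespace> get svc`.
    """
    urls = []
    for line in table_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        name, ports = fields[0], fields[4]
        # PORT(S) entries look like "9090/TCP" or, for NodePort, "80:30080/TCP".
        exposed = {spec.split("/")[0].split(":")[0] for spec in ports.split(",")}
        if str(port) in exposed:
            urls.append("{}://{}.{}:{}".format(scheme, name, namespace, port))
    return urls
```

For the listing above with namespace "monitoring", two services match port 9090: prometheus-operated, whose cluster IP is None, and prometheus-prometheus-oper-prometheus, which has a cluster IP assigned.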

Then set it in the orca config map:

$ kubectl -n rca edit cm orca
  orca.yaml: |
  ...
    prometheus:
      url: http://prometheus-prometheus-oper-prometheus.monitoring:9090
      resync_period: 60

Then restart the orca pod.
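
To confirm that the configured URL works, one option is to query the endpoint that appears in the traceback, /api/v1/alerts, from a pod inside the cluster. The sketch below uses only the Python standard library and is not orca's own client. The function name count_alerts and the injectable urlopen parameter are assumptions for illustration.

```python
import json
import urllib.request


def count_alerts(base_url, timeout=5, urlopen=urllib.request.urlopen):
    """Return the number of alerts reported by Prometheus (hypothetical helper)."""
    url = base_url.rstrip("/") + "/api/v1/alerts"
    with urlopen(url, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # The Prometheus HTTP API wraps results as {"status": ..., "data": ...}.
    if payload.get("status") != "success":
        raise RuntimeError("unexpected response status: {!r}".format(payload.get("status")))
    return len(payload["data"]["alerts"])
```

Calling count_alerts("http://prometheus-prometheus-oper-prometheus.monitoring:9090") should return a number if the service is reachable. A urllib.error.URLError would point to the same connectivity problem reported in this issue.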