cytopia / metrics-server-prom

Prometheus adapter to scrape from Kubernetes metrics-server
MIT License

Errors: KeyError: '<pod_name>' and AttributeError: 'NoneType' object has no attribute 'group' #10

Closed kanewinter closed 4 years ago

kanewinter commented 5 years ago

I set this up in my Kubernetes cluster and the /metrics endpoint returns HTTP 500. Container log:

[init] Checking for kube config ...OK
[init] KUBE_CONTEXT not set, relying on context set in /etc/kube/config
[init] Checking current kube context ...OK
[init] Current context: aws
[init] Checking cluster info ...OK
2019-03-01 01:45:18,634 CRIT Set uid to user 0
2019-03-01 01:45:18,636 INFO supervisord started with pid 1
2019-03-01 01:45:19,638 INFO spawned: 'transformer' with pid 32
2019-03-01 01:45:19,640 INFO spawned: 'kubectl-proxy' with pid 33
[uWSGI] getting INI configuration from uwsgi.ini
Starting uWSGI 2.0.18 (64bit) on [Fri Mar 1 01:45:19 2019]
compiled with version: 6.3.0 20170516 on 01 March 2019 01:10:52
os: Linux-4.14.88-88.76.amzn2.x86_64 #1 SMP Mon Jan 7 18:43:26 UTC 2019
nodename: metrics-server-2-prom-755b964746-p6rhc
machine: x86_64
clock source: unix
detected number of CPU cores: 2
current working directory: /home/prometheus/transform
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 0.0.0.0:9100 fd 3
Starting to serve on 127.0.0.1:8080
Python version: 3.7.2 (default, Feb 6 2019, 03:40:10) [GCC 6.3.0 20170516]
Set PythonHome to /home/prometheus/transform/transformenv
Python threads support is disabled. You can enable it with --enable-threads
Python main interpreter initialized at 0x562b2e793850
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 218712 bytes (213 KB) for 2 cores
Operational MODE: preforking
WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x562b2e793850 pid: 32 (default app)
uWSGI is running in multiple interpreter mode
spawned uWSGI master process (pid: 32)
spawned uWSGI worker 1 (pid: 39, cores: 1)
spawned uWSGI worker 2 (pid: 40, cores: 1)
2019-03-01 01:45:21,536 INFO success: transformer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-03-01 01:45:21,536 INFO success: kubectl-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[pid: 40|app: 0|req: 1/1] 10.10.113.116 () {24 vars in 287 bytes} [Fri Mar 1 02:02:56 2019] GET / => generated 297 bytes in 2 msecs (HTTP/1.1 200) 2 headers in 80 bytes (1 switches on core 0)
[2019-03-01 02:17:36,085] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./transform.py", line 278, in metrics
    'pods': trans_pod_metrics(json['pods'])
  File "./transform.py", line 199, in trans_pod_metrics
    more[lbl['pod']]['node'],
KeyError: 'kube-proxy-9jqrt'
[pid: 40|app: 0|req: 2/2] 10.10.115.20 () {28 vars in 466 bytes} [Fri Mar 1 02:17:35 2019] GET /metrics => generated 291 bytes in 329 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)
[2019-03-01 02:17:40,894] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./transform.py", line 278, in metrics
    'pods': trans_pod_metrics(json['pods'])
  File "./transform.py", line 199, in trans_pod_metrics
    more[lbl['pod']]['node'],
KeyError: 'ingress-nginx-ingress-default-backend-6679dd498c-s7cxc'
[pid: 39|app: 0|req: 1/3] 10.10.115.20 () {28 vars in 466 bytes} [Fri Mar 1 02:17:40 2019] GET /metrics => generated 291 bytes in 138 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)
[pid: 40|app: 0|req: 3/4] 10.10.115.20 () {28 vars in 453 bytes} [Fri Mar 1 02:17:41 2019] GET / => generated 297 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 80 bytes (1 switches on core 0)
[pid: 40|app: 0|req: 4/5] 10.10.115.20 () {28 vars in 491 bytes} [Fri Mar 1 02:17:42 2019] GET /apis/k8s.metrics.io => generated 233 bytes in 2 msecs (HTTP/1.1 404) 2 headers in 72 bytes (1 switches on core 0)
[2019-03-01 02:17:45,886] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./transform.py", line 278, in metrics
    'pods': trans_pod_metrics(json['pods'])
  File "./transform.py", line 199, in trans_pod_metrics
    more[lbl['pod']]['node'],
KeyError: 'prometheus-prometheus-node-exporter-978th'
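The KeyError in the traceback above comes from indexing a dict with a pod name it does not contain (`more[lbl['pod']]['node']`). A minimal sketch of a tolerant lookup follows. It is not the project's actual code: `join_pod_nodes`, `metric_labels` and `pod_info` are hypothetical names, and it assumes the lookup table maps pod names to dicts with a `node` key.

```python
def join_pod_nodes(metric_labels, pod_info):
    """Attach a node name to each label set; collect pods missing from pod_info.

    metric_labels: list of dicts, each with a 'pod' key (hypothetical shape).
    pod_info: dict mapping pod name -> {'node': ...} (hypothetical shape).
    """
    joined, missing = [], []
    for lbl in metric_labels:
        info = pod_info.get(lbl["pod"])  # .get() returns None instead of raising KeyError
        if info is None:
            missing.append(lbl["pod"])
            continue
        joined.append({**lbl, "node": info["node"]})
    return joined, missing


if __name__ == "__main__":
    labels = [{"pod": "web-1"}, {"pod": "kube-proxy-9jqrt"}]
    info = {"web-1": {"node": "node-a"}}
    print(join_pod_nodes(labels, info))
```

Returning the missing pod names separately keeps the endpoint serving the pods it does know about, and leaves it to the caller to log or expose the gap.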

I also get this:

2019-03-01 03:07:15,396 INFO success: transformer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-03-01 03:07:15,396 INFO success: kubectl-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[pid: 72|app: 0|req: 1/1] 172.17.0.1 () {24 vars in 255 bytes} [Fri Mar 1 03:07:27 2019] GET / => generated 297 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 80 bytes (1 switches on core 0)
[2019-03-01 03:07:37,133] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./transform.py", line 278, in metrics
    'pods': trans_pod_metrics(json['pods'])
  File "./transform.py", line 174, in trans_pod_metrics
    more = get_pod_metrics_from_cli()
  File "./transform.py", line 238, in get_pod_metrics_from_cli
    'ns': line.group(1),
AttributeError: 'NoneType' object has no attribute 'group'
[pid: 72|app: 0|req: 2/2] 172.17.0.1 () {24 vars in 269 bytes} [Fri Mar 1 03:07:33 2019] GET /metrics => generated 291 bytes in 3899 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)
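The AttributeError above is what Python raises when `.group()` is called on the result of a regex match that failed, since `re.match` returns `None` in that case. The sketch below shows one way to guard against it. The pattern, the function name `parse_rows` and the three-column input are assumptions for illustration; the real `get_pod_metrics_from_cli` in transform.py may parse a different format.

```python
import re

# Hypothetical row format: namespace, pod name, node name, whitespace-separated.
ROW = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)")


def parse_rows(text):
    """Parse whitespace-separated rows; return (parsed rows, lines that did not match)."""
    rows, skipped = [], []
    for raw in text.splitlines():
        match = ROW.match(raw)
        if match is None:  # no match: record the line instead of calling .group() on None
            skipped.append(raw)
            continue
        rows.append({"ns": match.group(1), "pod": match.group(2), "node": match.group(3)})
    return rows, skipped


if __name__ == "__main__":
    sample = "kube-system kube-proxy-9jqrt ip-10-0-0-1\n\nerror: timeout"
    print(parse_rows(sample))
```

Keeping the skipped lines makes it easier to see which CLI output (an empty line, an error message, a changed column layout) tripped the parser.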

cytopia commented 5 years ago

Do you also have metrics-server deployed and configured correctly in Kubernetes?

kanewinter commented 5 years ago

Yes, I was able to pull down some stats with the API.

kanewinter commented 5 years ago

Using gcr.io/google_containers/metrics-server-amd64:v0.3.3 and metrics-server-prom 0.9.0.

[2019-07-18 17:25:19,457] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./transform.py", line 275, in metrics
    'pods': trans_pod_metrics(json['pods'])
  File "./transform.py", line 171, in trans_pod_metrics
    more = get_pod_metrics_from_cli()
  File "./transform.py", line 235, in get_pod_metrics_from_cli
    'ns': line.group(1),
AttributeError: 'NoneType' object has no attribute 'group'
[pid: 92|app: 0|req: 3/3] 172.17.0.1 () {36 vars in 680 bytes} [Thu Jul 18 17:25:12 2019] GET /metrics => generated 290 bytes in 7243 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)

cytopia commented 4 years ago

This will be dealt with in #12.

cytopia commented 4 years ago

Solved