Closed cradle77 closed 5 years ago
What chart version and can you provide the logs?
Hello,
Thanks a lot for getting back to me. I'm using the latest stable:
version: v0.2.3
appVersion: v0.4.1
I've also tried 0.2.1/0.4.0 - no changes.
In the pod logs I see just:
I1224 12:10:06.552546 1 authorization.go:73] Forbidden: "/healthz", Reason: ""
I1224 12:10:06.553374 1 wrap.go:42] GET /healthz: (12.814884ms) 403 [[kube-probe/1.11] 10.240.0.4:37092]
I1224 12:10:08.774693 1 authorization.go:73] Forbidden: "/healthz", Reason: ""
I1224 12:10:08.774962 1 wrap.go:42] GET /healthz: (365.199µs) 403 [[kube-probe/1.11] 10.240.0.4:37116]
Node logs don't say much more than that:
I1224 12:11:08.779285 3160 prober.go:111] Liveness probe for "torrid-mite-prometheus-adapter-7555cf57fd-8wtzr_default(580bc002-0774-11e9-ac59-8ef8c1dc0bcc):prometheus-adapter" failed (failure): HTTP probe failed with statuscode: 403
I1224 12:11:16.540541 3160 prober.go:111] Readiness probe for "torrid-mite-prometheus-adapter-7555cf57fd-8wtzr_default(580bc002-0774-11e9-ac59-8ef8c1dc0bcc):prometheus-adapter" failed (failure): HTTP probe failed with statuscode: 403
Are there any other logs that might be helpful in diagnosing this?
Thanks! m.
I don't think you provided enough of the logs. The health check most likely fails because the adapter can't connect to Prometheus. Verify that your Prometheus URL and port are correct, and that your Prometheus instance doesn't require some form of authentication. Also try increasing logLevel to a higher value.
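For reference, those settings can be overridden through the chart's values. A sketch of a values.yaml override, assuming the stable/prometheus-adapter chart's default key names (check your chart version's values.yaml for the exact keys):

```yaml
# Hypothetical values.yaml override for stable/prometheus-adapter.
# Key names assume the chart's documented defaults.
prometheus:
  url: http://prometheus.default.svc  # must point at your Prometheus service
  port: 9090                          # and its service port
logLevel: 6  # higher verbosity surfaces authorization/connection errors
```

Applied with something like helm upgrade --install <release> stable/prometheus-adapter -f values.yaml.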
Same issue, logs:
I1228 11:12:50.857652 1 round_trippers.go:383] POST https://XXXXX.hcp.westeurope.azmk8s.io:443/apis/authorization.k8s.io/v1beta1/subjectaccessreviews
I1228 11:12:50.857658 1 round_trippers.go:390] Request Headers:
I1228 11:12:50.857663 1 round_trippers.go:393] Content-Type: application/json
I1228 11:12:50.857666 1 round_trippers.go:393] User-Agent: adapter/v0.0.0 (linux/amd64) kubernetes/$Format
I1228 11:12:50.857670 1 round_trippers.go:393] Authorization: Bearer XXXXXX
I1228 11:12:50.857675 1 round_trippers.go:393] Accept: application/json, */*
I1228 11:12:50.909221 1 round_trippers.go:408] Response Status: 201 Created in 51 milliseconds
I1228 11:12:50.909239 1 round_trippers.go:411] Response Headers:
I1228 11:12:50.909243 1 round_trippers.go:414] Content-Type: application/json
I1228 11:12:50.909246 1 round_trippers.go:414] Content-Length: 267
I1228 11:12:50.909249 1 round_trippers.go:414] Date: Fri, 28 Dec 2018 11:12:50 GMT
I1228 11:12:50.909266 1 request.go:897] Response Body: {"kind":"SubjectAccessReview","apiVersion":"authorization.k8s.io/v1beta1","metadata":{"creationTimestamp":null},"spec":{"nonResourceAttributes":{"path":"/healthz","verb":"get"},"user":"system:anonymous","group":["system:unauthenticated"]},"status":{"allowed":false}}
I1228 11:12:50.909325 1 authorization.go:73] Forbidden: "/healthz", Reason: ""
I1228 11:12:50.909624 1 wrap.go:42] GET /healthz: (52.281729ms) 403 [[kube-probe/1.11] 10.200.20.126:58960]
I1228 11:12:52.917140 1 authorization.go:73] Forbidden: "/healthz", Reason: ""
I1228 11:12:52.917208 1 wrap.go:42] GET /healthz: (134.8µs) 403 [[kube-probe/1.11] 10.200.20.126:58984]
I1228 11:13:00.857800 1 authorization.go:73] Forbidden: "/healthz", Reason: ""
I1228 11:13:00.857910 1 wrap.go:42] GET /healthz: (181.899µs) 403 [[kube-probe/1.11] 10.200.20.126:59022]
What authorization-mode are you using for your kube-apiserver? Mine is --authorization-mode=Node,RBAC. I think having Node in there is what makes it possible for the kubelet to GET /healthz successfully.
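For anyone self-hosting, that flag sits on the apiserver command line. An illustrative excerpt of a typical static pod manifest (on AKS the control plane is managed, so this file isn't accessible; paths and other flags are elided):

```yaml
# Illustrative kube-apiserver static pod excerpt; not the actual AKS config.
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --authorization-mode=Node,RBAC  # Node authorizer + RBAC
    # ...other flags elided...
```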
I am seeing the same issue on my AKS:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m13s default-scheduler Successfully assigned default/prome-adapter-prometheus-adapter-559c98948d-n82vs to aks-agentpool-35064155-1
Normal Pulled 71s (x3 over 3m11s) kubelet, aks-agentpool-35064155-1 Container image "directxman12/k8s-prometheus-adapter-amd64:v0.4.1" already present on machine
Normal Created 71s (x3 over 3m11s) kubelet, aks-agentpool-35064155-1 Created container
Normal Killing 71s (x2 over 2m11s) kubelet, aks-agentpool-35064155-1 Killing container with id docker://prometheus-adapter:Container failed liveness probe.. Container will be killed and recreated.
Normal Started 70s (x3 over 3m11s) kubelet, aks-agentpool-35064155-1 Started container
Warning Unhealthy 32s (x7 over 2m32s) kubelet, aks-agentpool-35064155-1 Readiness probe failed: HTTP probe failed with statuscode: 403
Warning Unhealthy 32s (x7 over 2m32s) kubelet, aks-agentpool-35064155-1 Liveness probe failed: HTTP probe failed with statuscode: 403
It's not usable, but as it seems to be a common issue on AKS, I'm cc'ing some AKS people who could help here. Feel free to chime in or not, as your time is yours. @tariq1890 @jackfrancis @mboersma
It appears that in AKS the system:discovery role does not allow unauthenticated users (which is what a liveness probe is). You can get the adapter working by adding this:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: healthz
rules:
- nonResourceURLs: ["/healthz", "/healthz/*"]
  verbs: ["get", "post"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: healthz
subjects:
- kind: Group
  name: system:unauthenticated
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: healthz
  apiGroup: rbac.authorization.k8s.io
Disclaimer: this is definitely too open of a policy and I'd recommend figuring out what the minimum required is.
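For tightening it: the probes in the logs above only ever GET /healthz, so a narrower rule along these lines should be sufficient. An untested sketch:

```yaml
# Narrower variant of the ClusterRole above: only GET on /healthz,
# dropping the "post" verb and the /healthz/* wildcard. Untested sketch.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: healthz
rules:
- nonResourceURLs: ["/healthz"]
  verbs: ["get"]
```

The ClusterRoleBinding to system:unauthenticated would stay the same.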
Thanks @grampelberg, I'll give it a try and let you know!
@cradle77 after digging in a little bit more, this RBAC is actually part of the default. So, folks running into this aren't running the default bootstrap policy (likely for good reasons).
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
This issue is being automatically closed due to inactivity.
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG
Version of Helm and Kubernetes: Kubernetes 1.11.5 (in AKS, with RBAC enabled), Helm 2.10.0
Which chart: stable/prometheus-adapter
What happened: When installing the chart, the adapter pod's readiness and liveness probes fail with 403. This puts the pod into CrashLoopBackOff.
What you expected to happen: The Pod becoming ready and available
How to reproduce it (as minimally and precisely as possible): Create a cluster in AKS with RBAC enabled, install Helm, then run helm install stable/prometheus-adapter.
Anything else we need to know: This is what I see when describing the pod