Closed. mabushey closed this issue 4 years ago.
Metrics server may fail to authenticate if the kubelet is running with the `--anonymous-auth=false` flag. Passing the `--authentication-token-webhook=true` and `--authorization-mode=Webhook` flags to the kubelet can fix this.
kops config for the kubelet:

```yaml
kubelet:
  anonymousAuth: false
  authenticationTokenWebhook: true
  authorizationMode: Webhook
```
This might break authorization for the `kubelet-api` user if no ClusterRoleBinding to `system:kubelet-api-admin` exists, which can be fixed by creating the ClusterRoleBinding:
```yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubelet-api-admin
subjects:
- kind: User
  name: kubelet-api
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:kubelet-api-admin
  apiGroup: rbac.authorization.k8s.io
```
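Once the binding is applied, a quick sanity check is possible with `kubectl auth can-i` (a sketch; assumes cluster-admin access, and that `system:kubelet-api-admin` grants `nodes/stats` as in stock RBAC defaults):

```shell
# Confirm the binding exists.
kubectl get clusterrolebinding kubelet-api-admin -o wide

# Expect "yes": the kubelet-api user should now be authorized
# for the kubelet stats endpoints.
kubectl auth can-i get nodes/stats --as=kubelet-api
```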
@mabushey I believe using "args" is slightly better than "command", since it respects the entrypoint.
```yaml
- args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
```
@githubcdr Thanks for the comment. I agree that this seems better; however, `command` works and `args` does not.
I have created the metrics server with the deployment below and added the kubelet config in kops, but I still get a 401:
```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
```
Logs:

```
E0209 22:52:55.288570 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-172-20-64-197.compute.internal: unable to fetch metrics from Kubelet ip-172-20-64-197.compute.internal (172.20.64.197): request failed - "401 Unauthorized", response: "Unauthorized", unable to fully scrape metrics from source kubelet_summary:ip-172-20-100-28.compute.internal: unable to get CPU for container "sentinel" in pod default/redis-sentinel-744bj on node "172.20.100.28", discarding data: missing cpu usage metric]
E0209 22:53:55.273084 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-172-20-117-178.us-west-2.compute.internal: unable to get CPU for container "nginx-ingress" in pod default/nginx-ingress-rc-gj9tl on node "172.20.117.178", discarding data: missing cpu usage metric, unable to fully scrape metrics from source kubelet_summary:ip-172-20-64-197.compute.internal: unable to fetch metrics from Kubelet ip-172-20-64-197.compute.internal (172.20.64.197): request failed - "401 Unauthorized", response: "Unauthorized"]
E0209 22:54:55.286313 1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:ip-172-20-64-197.compute.internal: unable to fetch metrics from Kubelet ip-172-20-64-197.compute.internal (172.20.64.197): request failed - "401 Unauthorized", response: "Unauthorized"
E0209 22:56:55.264838 1 manager.go:102] unable to fully collect metrics: unable to extract connection information for node "ip-172-20-69-45.compute.internal": node ip-172-20-69-45.compute.internal is not ready
E0209 22:57:55.266908 1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:ip-172-20-69-45.compute.internal: unable to get CPU for container "kafka" in pod default/kafka-1 on node "172.20.69.45", discarding data: missing cpu usage metric
E0209 22:59:55.255294 1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:ip-172-20-117-178.compute.internal: unable to get CPU for container "nginx-ingress" in pod default/nginx-ingress-rc-gj9tl on node "172.20.117.178", discarding data: missing cpu usage metric
E0209 23:03:52.091447 1 reststorage.go:144] unable to fetch pod metrics for pod default/baker-xxx: no metrics known for pod
E0209 23:05:55.297098 1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:ip-172-20-117-178.compute.internal: unable to get CPU for container "nginx-ingress" in pod default/nginx-ingress-rc-gj9tl on node "172.20.117.178", discarding data: missing cpu usage metric
```
Does metrics-server service account have access to "nodes/stats" resource? Example: https://github.com/serathius/kubernetes/blob/53b13b66645923e231e8f7950932c61f02f1c276/cluster/addons/metrics-server/resource-reader.yaml#L14
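That can be answered directly on a live cluster by impersonating the service account (a sketch, assuming the default `kube-system/metrics-server` account name):

```shell
# Expect "yes" if the ClusterRole bound to metrics-server includes nodes/stats.
kubectl auth can-i get nodes/stats \
  --as=system:serviceaccount:kube-system:metrics-server
```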
It works now. I am not sure what caused it to work.
```
kubectl describe clusterrole system:metrics-server
Name:         system:metrics-server
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"name":"system:metrics-server","namespace":""},"rules":[...
PolicyRule:
  Resources               Non-Resource URLs  Resource Names  Verbs
  ---------               -----------------  --------------  -----
  namespaces              []                 []              [get list watch]
  nodes/stats             []                 []              [get list watch]
  nodes                   []                 []              [get list watch]
  pods                    []                 []              [get list watch]
  deployments.extensions  []                 []              [get list watch]
```
@zahid0 I'm still facing the issue. I tried to edit the cluster and update the metrics YAML file. Below are my changes:
```yaml
# kops edit cluster
kubeAPIServer:
  kubeletPreferredAddressTypes:
  - InternalIP
  - Hostname
  - InternalDNS
  - ExternalDNS
  - ExternalIP
  runtimeConfig:
    autoscaling/v2beta1: "true"
kubeControllerManager:
  horizontalPodAutoscalerUseRestClients: false
kubelet:
  anonymousAuth: false
  authenticationTokenWebhook: true
  authorizationMode: Webhook
```
```yaml
# metrics-server.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:aggregated-metrics-reader
  labels:
    rbac.authorization.k8s.io/aggregate-to-view: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/name: "Metrics-server"
spec:
  selector:
    k8s-app: metrics-server
  ports:
  - port: 443
    protocol: TCP
    targetPort: 443
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubelet-api-admin
subjects:
- kind: User
  name: kubelet-api
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:kubelet-api-admin
  apiGroup: rbac.authorization.k8s.io
```
@rajeshkodali I also ran your command, but I don't think it works:
```
kubectl describe clusterrole system:metrics-server
Name:         system:metrics-server
Labels:       <none>
Annotations:  <none>
PolicyRule:
  Resources    Non-Resource URLs  Resource Names  Verbs
  ---------    -----------------  --------------  -----
  nodes/stats  []                 []              [get list watch]
  nodes        []                 []              [get list watch]
  pods         []                 []              [get list watch]
```
I'm facing the same problems as @vinhnglx despite making all the fixes mentioned on this issue. :(
Guys, any idea? I spent a few hours today but still can't make it work.
Here is my working config: kops cluster spec for kubelet.
```yaml
kubelet:
  anonymousAuth: false
  authenticationTokenWebhook: true
  authorizationMode: Webhook
```
Metrics server yaml:
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
  - deployments
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:aggregated-metrics-reader
  labels:
    rbac.authorization.k8s.io/aggregate-to-view: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/name: "Metrics-server"
spec:
  selector:
    k8s-app: metrics-server
  ports:
  - port: 443
    protocol: TCP
    targetPort: 443
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
```
role.yaml from https://github.com/kubernetes/kops/issues/5706, and:
```yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kubelet-api-admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
subjects:
- kind: User
  name: kubelet-api
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:kubelet-api-admin
  apiGroup: rbac.authorization.k8s.io
```
Thanks @rajeshkodali. I still hit the "401 Unauthorized" error:
```
E0214 03:16:54.413600 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-10-10-2-189.ap-southeast-1.compute.internal: unable to fetch metrics from Kubelet ip-10-10-2-189.ap-southeast-1.compute.internal (10.10.2.189): request failed - "401 Unauthorized", response: "Unauthorized", unable to fully scrape metrics from source kubelet_summary:ip-10-10-1-140.ap-southeast-1.compute.internal: unable to fetch metrics from Kubelet ip-10-10-1-140.ap-southeast-1.compute.internal (10.10.1.140): request failed - "401 Unauthorized", response: "Unauthorized", unable to fully scrape metrics from source kubelet_summary:ip-10-10-1-124.ap-southeast-1.compute.internal: unable to fetch metrics from Kubelet ip-10-10-1-124.ap-southeast-1.compute.internal (10.10.1.124): request failed - "401 Unauthorized", response: "Unauthorized"]
E0214 03:17:03.349620 1 reststorage.go:144] unable to fetch pod metrics for pod default/backend-14-feb-2019-10-20-15-5bb5b77bcc-stb4t: no metrics known for pod
E0214 03:17:17.633193 1 reststorage.go:144] unable to fetch pod metrics for pod default/frontend-14-feb-2019-10-20-04-5d5c4678bc-k7vpv: no metrics known for pod
E0214 03:17:33.357307 1 reststorage.go:144] unable to fetch pod metrics for pod default/backend-14-feb-2019-10-20-15-5bb5b77bcc-stb4t: no metrics known for pod
```
What's the output of `kubectl --v=10 top nodes`?
The output when running that command ends with `error: metrics not available yet`:
```
I0214 11:32:45.819869 83755 loader.go:359] Config loaded from file /Users/developers/.kube/config
I0214 11:32:45.822017 83755 loader.go:359] Config loaded from file /Users/developers/.kube/config
I0214 11:32:45.822606 83755 round_trippers.go:419] curl -k -v -XGET -H "User-Agent: kubectl/v1.13.2 (darwin/amd64) kubernetes/cff46ab" -H "Accept: application/json, */*" -H "Authorization: Basic xxxxxxxxxx=" 'https://api.xxx.xxx.com/api?timeout=32s'
I0214 11:32:45.870240 83755 round_trippers.go:438] GET https://api.xxx.xxx.com/api?timeout=32s 200 OK in 47 milliseconds
I0214 11:32:45.870286 83755 round_trippers.go:444] Response Headers:
I0214 11:32:45.870299 83755 round_trippers.go:447] Content-Type: application/json
I0214 11:32:45.870313 83755 round_trippers.go:447] Content-Length: 133
I0214 11:32:45.870323 83755 round_trippers.go:447] Date: Thu, 14 Feb 2019 03:32:45 GMT
I0214 11:32:45.870412 83755 request.go:942] Response Body: {"kind":"APIVersions","versions":["v1"],"serverAddressByClientCIDRs":[{"clientCIDR":"0.0.0.0/0","serverAddress":"10.10.1.124:443"}]}
I0214 11:32:45.870842 83755 round_trippers.go:419] curl -k -v -XGET -H "Authorization: Basic xxxxxx=" -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.13.2 (darwin/amd64) kubernetes/cff46ab" 'https://api.xxx.xxx.com/apis?timeout=32s'
I0214 11:32:45.880593 83755 round_trippers.go:438] GET https://api.xxx.xxx.com/apis?timeout=32s 200 OK in 9 milliseconds
I0214 11:32:45.880617 83755 round_trippers.go:444] Response Headers:
I0214 11:32:45.880627 83755 round_trippers.go:447] Content-Type: application/json
I0214 11:32:45.880636 83755 round_trippers.go:447] Content-Length: 3609
I0214 11:32:45.880645 83755 round_trippers.go:447] Date: Thu, 14 Feb 2019 03:32:45 GMT
I0214 11:32:45.880709 83755 request.go:942] Response Body: {"kind":"APIGroupList","apiVersion":"v1","groups":[{"name":"apiregistration.k8s.io","versions":[{"groupVersion":"apiregistration.k8s.io/v1","version":"v1"},{"groupVersion":"apiregistration.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"apiregistration.k8s.io/v1","version":"v1"}},{"name":"extensions","versions":[{"groupVersion":"extensions/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"extensions/v1beta1","version":"v1beta1"}},{"name":"apps","versions":[{"groupVersion":"apps/v1","version":"v1"},{"groupVersion":"apps/v1beta2","version":"v1beta2"},{"groupVersion":"apps/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"apps/v1","version":"v1"}},{"name":"events.k8s.io","versions":[{"groupVersion":"events.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"events.k8s.io/v1beta1","version":"v1beta1"}},{"name":"authentication.k8s.io","versions":[{"groupVersion":"authentication.k8s.io/v1","version":"v1"},{"groupVersion":"authentication.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"authentication.k8s.io/v1","version":"v1"}},{"name":"authorization.k8s.io","versions":[{"groupVersion":"authorization.k8s.io/v1","version":"v1"},{"groupVersion":"authorization.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"authorization.k8s.io/v1","version":"v1"}},{"name":"autoscaling","versions":[{"groupVersion":"autoscaling/v1","version":"v1"},{"groupVersion":"autoscaling/v2beta1","version":"v2beta1"}],"preferredVersion":{"groupVersion":"autoscaling/v1","version":"v1"}},{"name":"batch","versions":[{"groupVersion":"batch/v1","version":"v1"},{"groupVersion":"batch/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"batch/v1","version":"v1"}},{"name":"certificates.k8s.io","versions":[{"groupVersion":"certificates.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"certificates.k8s.io/v1beta1","version":"v1beta1"}},{"name":"networking.k8s.io","versions":[{"groupVersion":"networking.k8s.io/v1","version":"v1"}],"preferredVersion":{"groupVersion":"networking.k8s.io/v1","version":"v1"}},{"name":"policy","versions":[{"groupVersion":"policy/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"policy/v1beta1","version":"v1beta1"}},{"name":"rbac.authorization.k8s.io","versions":[{"groupVersion":"rbac.authorization.k8s.io/v1","version":"v1"},{"groupVersion":"rbac.authorization.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"rbac.authorization.k8s.io/v1","version":"v1"}},{"name":"storage.k8s.io","versions":[{"groupVersion":"storage.k8s.io/v1","version":"v1"},{"groupVersion":"storage.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"storage.k8s.io/v1","version":"v1"}},{"name":"admissionregistration.k8s.io","versions":[{"groupVersion":"admissionregistration.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"admissionregistration.k8s.io/v1beta1","version":"v1beta1"}},{"name":"apiextensions.k8s.io","versions":[{"groupVersion":"apiextensions.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"apiextensions.k8s.io/v1beta1","version":"v1beta1"}},{"name":"scheduling.k8s.io","versions":[{"groupVersion":"scheduling.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"scheduling.k8s.io/v1beta1","version":"v1beta1"}},{"name":"metrics.k8s.io","versions":[{"groupVersion":"metrics.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"metrics.k8s.io/v1beta1","version":"v1beta1"}}]}
I0214 11:32:45.896066 83755 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.13.2 (darwin/amd64) kubernetes/cff46ab" -H "Authorization: Basic xxxxxxx=" 'https://api.xxx.xxx.com/apis/metrics.k8s.io/v1beta1/nodes'
I0214 11:32:45.906012 83755 round_trippers.go:438] GET https://api.xxx.xxx.com/apis/metrics.k8s.io/v1beta1/nodes 200 OK in 9 milliseconds
I0214 11:32:45.906056 83755 round_trippers.go:444] Response Headers:
I0214 11:32:45.906107 83755 round_trippers.go:447] Date: Thu, 14 Feb 2019 03:32:45 GMT
I0214 11:32:45.906134 83755 round_trippers.go:447] Content-Length: 137
I0214 11:32:45.906165 83755 round_trippers.go:447] Content-Type: application/json
I0214 11:32:45.906205 83755 request.go:942] Response Body: {"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[]}
F0214 11:32:45.907600 83755 helpers.go:116] error: metrics not available yet
```
@vinhnglx could you check the arguments passed to the kubelet on one of the nodes? If you run the kubelet using systemd, then ssh to the instance and run `sudo systemctl status kubelet`. Make sure the `--authentication-token-webhook=true` and `--authorization-mode=Webhook` flags are passed. Checking the kubelet logs may also help (run `journalctl -u kubelet` on the node).
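A quick way to do both checks in one go (a sketch; assumes a systemd-managed kubelet and ssh access to the node):

```shell
# On the node: confirm the running kubelet has the webhook authn/authz flags.
sudo systemctl status kubelet | grep -oE -- '--authentication-token-webhook=\S+|--authorization-mode=\S+'

# Recent kubelet logs, in case authentication is failing for another reason.
sudo journalctl -u kubelet --since "10 min ago" | tail -n 50
```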
@zahid0 I'm using kops to install Kubernetes with a VPC, private subnets, and the Calico CNI for networking. I'm not able to ssh to the instances to check the kubelet. But I already set `authentication-token-webhook=true` and `authorization-mode=Webhook` using the `kops edit cluster` command:
```yaml
# kops edit cluster
kind: Cluster
metadata:
  creationTimestamp: 2019-01-29T06:45:14Z
  name: xxx.xxx.com
spec:
  # ...
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
```
And it still shows 401 Unauthorized.
@vinhnglx do you mind showing the output of `kops update cluster` and `kops rolling-update cluster`?
@zahid0 I don't mind, but I'm using kops with Terraform output. My steps are:
@vinhnglx `kops rolling-update` is required after `terraform apply`. https://github.com/kubernetes/kops/blob/master/docs/terraform.md#caveats
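The full sequence then looks roughly like this (a sketch, assuming the cluster was created with `--target=terraform`; the cluster name is a placeholder):

```shell
# Regenerate the Terraform config from the (edited) kops cluster spec.
kops update cluster --name xxx.xxx.com --target=terraform --out=.

# Apply the infrastructure changes.
terraform apply

# Roll the nodes so they actually pick up the new kubelet flags.
kops rolling-update cluster --name xxx.xxx.com --yes
```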
@zahid0 Oh no, my script has a mistake: it should run `kops rolling-update`, but I ran `kops update`. Now it works.
I can get the metrics. Thanks a lot for your help :)
```
kubectl top nodes
NAME                                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-1-2-3-4.ap-southeast-1.compute.internal   160m         8%     1585Mi          41%
ip-1-2-3-4.ap-southeast-1.compute.internal   1151m        57%    2397Mi          30%
ip-1-2-3-4.ap-southeast-1.compute.internal   1005m        50%    2769Mi          35%
```
I changed from using "args" to "command" and I don't see the 401 Unauthorized now. However, `kubectl logs -f metrics-server... -n kube-system` still shows "no metrics known for pod":
```
$ k logs -f metrics-server-68df9fbc9f-dgr8v -n kube-system
I0312 03:55:41.841800 1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
W0312 03:55:42.433339 1 authentication.go:166] cluster doesn't provide client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication to extension api-server won't work.
W0312 03:55:42.439873 1 authentication.go:210] cluster doesn't provide client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication to extension api-server won't work.
[restful] 2019/03/12 03:55:42 log.go:33: [restful/swagger] listing is available at https://:443/swaggerapi
[restful] 2019/03/12 03:55:42 log.go:33: [restful/swagger] https://:443/swaggerui/ is mapped to folder /swagger-ui/
I0312 03:55:42.488139 1 serve.go:96] Serving securely on [::]:443
E0312 03:55:46.554516 1 reststorage.go:144] unable to fetch pod metrics for pod default/iconverse-nlp-0: no metrics known for pod
E0312 03:55:46.554540 1 reststorage.go:144] unable to fetch pod metrics for pod default/iconverse-nlp-1: no metrics known for pod
E0312 05:08:42.634201 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-192-168-84-18.ap-southeast-1.compute.internal: [unable to get CPU for container "iconverse-connector" in pod default/iconverse-connector-0 on node "192.168.84.18", discarding data: missing cpu usage metric, unable to get CPU for container "iconverse-fluentd" in pod default/iconverse-connector-0 on node "192.168.84.18", discarding data: missing cpu usage metric], unable to fully scrape metrics from source kubelet_summary:ip-192-168-22-244.ap-southeast-1.compute.internal: [unable to get CPU for container "iconverse-fluentd" in pod default/iconverse-converse-0 on node "192.168.22.244", discarding data: missing cpu usage metric, unable to get CPU for container "iconverse-converse" in pod default/iconverse-converse-0 on node "192.168.22.244", discarding data: missing cpu usage metric, unable to get CPU for container "iconverse-fluentd" in pod default/iconverse-admin-0 on node "192.168.22.244", discarding data: missing cpu usage metric, unable to get CPU for container "iconverse-admin" in pod default/iconverse-admin-0 on node "192.168.22.244", discarding data: missing cpu usage metric, unable to get CPU for container "iconverse-ui" in pod default/iconverse-ui-0 on node "192.168.22.244", discarding data: missing cpu usage metric]]
```
`kubectl top nodes` shows valid data with resource percentages. `kubectl top pod` does not show any percentage at all.
@zahid0, how do I add the kubelet config when I am using eksctl to create the cluster on AWS EKS?
@serathius Thanks for pointing out nodes/stats in https://github.com/serathius/kubernetes/blob/53b13b66645923e231e8f7950932c61f02f1c276/cluster/addons/metrics-server/resource-reader.yaml#L14. This should be PR'd back upstream.
I experience the same as @khteh above.
I have set the following kubelet arguments:

```
--authentication-token-webhook
--authorization-mode Webhook
```
`kubectl top nodes` shows valid data with resource percentages. `kubectl top pods` does not show any percentage, but does show values.
The metrics-server logs:

```
$ k logs -n kube-system metrics-server-5dddc64cb8-t5bqp
I0610 20:40:55.524499 1 serving.go:273] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
W0610 20:40:56.190762 1 authentication.go:166] cluster doesn't provide client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication to extension api-server won't work.
W0610 20:40:56.199965 1 authentication.go:210] cluster doesn't provide client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication to extension api-server won't work.
[restful] 2019/06/10 20:40:56 log.go:33: [restful/swagger] listing is available at https://:8443/swaggerapi
[restful] 2019/06/10 20:40:56 log.go:33: [restful/swagger] https://:8443/swaggerui/ is mapped to folder /swagger-ui/
I0610 20:40:56.251231 1 serve.go:96] Serving securely on [::]:8443
E0610 20:41:28.930599 1 reststorage.go:148] unable to fetch pod metrics for pod default/myref-res-search-55bb66d4c4-6689m: no metrics known for pod
E0610 20:41:43.942248 1 reststorage.go:148] unable to fetch pod metrics for pod default/myref-res-search-55bb66d4c4-6689m: no metrics known for pod
```
I am using AWS EKS with Kubernetes 1.12 and installed the metrics-server from helm (https://github.com/helm/charts/tree/master/stable/metrics-server):

```shell
$ helm install stable/metrics-server \
    --name metrics-server \
    --version 2.8.2 \
    --namespace kube-system \
    --set args={"--kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP"}
```
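After installing, it's worth confirming that the metrics API actually registered and became available (a sketch; the APIService name matches the chart defaults above):

```shell
# AVAILABLE should read "True" once metrics-server is serving.
kubectl get apiservice v1beta1.metrics.k8s.io

# Query the raw metrics API directly, bypassing kubectl top.
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
```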
The pod is deployed with a HorizontalPodAutoscaler:
```yaml
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "myref-resolution-search.fullname" . }}
  labels:
    app: {{ include "myref-resolution-search.fullname" . }}
    chart: {{ .Chart.Name }}-{{ .Chart.Version }}
    heritage: {{ .Release.Service }}
    release: {{ .Release.Name }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "myref-resolution-search.fullname" . }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
{{- end }}
```
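A useful check on whether the HPA is actually receiving metrics (a sketch; the HPA name is a placeholder matching the template above):

```shell
# TARGETS shows "<unknown>" when the HPA cannot get metrics for the pods.
kubectl get hpa

# The Conditions and Events sections usually explain why metrics are missing.
kubectl describe hpa <hpa-name>
```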
Still no percentages:

```
$ k top pods
NAME                                CPU(cores)   MEMORY(bytes)
myref-res-search-55bb66d4c4-49qpx   137m         1027Mi
```
Any idea why `kubectl top pods` shows no percentage at all?
```
E0625 13:13:04.145733 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:172.20.0.217: unable to get CPU for container "java-springboot-web" in pod java-springboot-web-b8cff79f5-w9bqq on node "172.20.0.217", discarding data: missing cpu usage metric
```

`kubectl top nodes` shows valid data with resource percentages; `kubectl top pods` does not show any percentage, but does show values.

```
[root@blv0155 metrics]# kubectl top pod -n smix3
NAME                                  CPU(cores)   MEMORY(bytes)
java-springboot-web-b8cff79f5-vknhh   3m           428Mi
[root@blv0155 metrics]# kubectl top node
NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
172.17.71.208   51m          2%     1933Mi          52%
172.17.71.209   114m         2%     1862Mi          24%
```
https://172.17.71.208:6443/apis/metrics.k8s.io/v1beta1/pods

```json
  "metadata": {
    "name": "java-springboot-web-b8cff79f5-vknhh",
    "namespace": "smix3",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/smix3/pods/java-springboot-web-b8cff79f5-vknhh",
    "creationTimestamp": "2019-06-25T13:38:35Z"
  },
  "timestamp": "2019-06-25T13:38:32Z",
  "window": "30s",
  "containers": [
    {
      "name": "tutor-java-springboot-web",
      "usage": {
        "cpu": "982317n",
        "memory": "470060Ki"
      }
    }
  ]
},
```
In the Deployment/ReplicaSet YAML you have to set resource requests (`spec.containers[].resources.requests.cpu`); once requests are set, the metric info changes from unknown to a percentage (e.g. 20%, 30%).
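A minimal sketch of what that looks like in a Deployment's pod spec (the container name, image, and values are illustrative, not from this thread):

```yaml
spec:
  template:
    spec:
      containers:
      - name: java-springboot-web
        image: my-registry/java-springboot-web:latest  # illustrative
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```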
Here is my working config: kops cluster spec for kubelet.
kubelet: anonymousAuth: false authenticationTokenWebhook: true authorizationMode: Webhook
Metrics server yaml:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
  - deployments
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:aggregated-metrics-reader
  labels:
    rbac.authorization.k8s.io/aggregate-to-view: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/name: "Metrics-server"
spec:
  selector:
    k8s-app: metrics-server
  ports:
  - port: 443
    protocol: TCP
    targetPort: 443
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
role.yaml from kubernetes/kops#5706 and
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kubelet-api-admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
subjects:
- kind: User
  name: kubelet-api
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:kubelet-api-admin
  apiGroup: rbac.authorization.k8s.io
This configuration worked for me too. It took a lot of time to completely roll out on my cluster.
I got it working. Thanks, Team.
should we aggregate all of these setup problems into one ticket? e.g. #278 Also, it might be an idea to have an option to make node stats requests via the api proxy. That would avoid the need for users to change kubelet auth setup.
@zoltan-fedor @zhanghan12
I had a similar issue (though kubectl top pods still doesn't show % for me), but HPA basically works now.
My setup is kops 1.13 with k8s 1.13.5 & istio
I have setup similar to other comments:
kubelet:
  anonymousAuth: false
  authenticationTokenWebhook: true
  authorizationMode: Webhook
And running metrics-server as:
/metrics-server
--v=10
--kubelet-insecure-tls
--kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
What really helped was being able to debug the issue by raising the log level.
And I found this article by Rancher:
By default, HPA will try to read metrics (resource and custom) with user system:anonymous
Following the guide and creating the additional ClusterRoleBinding for the system:anonymous
user seems to have fixed the issue for me.
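For reference, a hedged sketch of such a binding (the binding name is hypothetical, and whether system:aggregated-metrics-reader is the right ClusterRole depends on your setup; granting anything to system:anonymous is a security trade-off, so scope it as narrowly as possible):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: anonymous-metrics-reader  # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:aggregated-metrics-reader
subjects:
- kind: User
  name: system:anonymous
  apiGroup: rbac.authorization.k8s.io
```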
@zahid0 Thank you so much, you saved my day
this works thank you very much. saved some prod time :) https://github.com/kubernetes-sigs/metrics-server/issues/212#issuecomment-459321884
Hi, what is the solution for this at the moment? And why does the error appear at all?
An upgrade from Kubernetes 1.15.0 to 1.15.6 fixed the issue.
@serathius is this fixed?
Closing per Kubernetes issue triage policy
GitHub is not the right place for support requests. If you're looking for help, check Stack Overflow and the troubleshooting guide. You can also post your question on the Kubernetes Slack or the Discuss Kubernetes forum. If the matter is security related, please disclose it privately via https://kubernetes.io/security/.
@zahid0 response above worked for me.
Metrics server may fail to authenticate if kubelet is running with the --anonymous-auth=false flag.
Was a little confused about the ClusterRoleBinding, but eventually discovered it was already setup in my Kops deployment within the kube-system namespace.
k get ClusterRoleBinding kops:system:kubelet-api-admin -o yaml -n kube-system
Went back and forth a few times on the webhook setting to be sure; i.e., kops edit, update, rolling-update. On each evaluation, all the nodes, including the master, were replaced, likely because of the auth mode change.
Also applied @samuel-elliott's hostNetwork: true recommendation.
@mabushey I believe using "args" is slightly better than "command", it respects the entrypoint.
- args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
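To make the difference concrete: args keeps the image's ENTRYPOINT and only overrides the arguments, while command replaces the ENTRYPOINT entirely. A sketch of the container spec using args, assuming the image from earlier in this thread:

```yaml
containers:
- name: metrics-server
  image: gcr.io/google_containers/metrics-server-amd64:v0.3.1
  # args keeps the image ENTRYPOINT (/metrics-server) and overrides only
  # the arguments; command would replace the ENTRYPOINT entirely.
  args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
```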
This works nicely for me on v1.26.9+k0s. Thanks a lot.
Added "fixes" which reduce the errors:
git diff deploy/1.8+/metrics-server-deployment.yaml
kubectl -n kube-system logs -f metrics-server-68df9fbc9f-fsvgn
Is there a version that works (ie one of the 200 forks)? I've used k8s 1.10 and 1.11 on AWS via Kops.