halohsu opened this issue 1 year ago
Help me pls!!
Hi @bluemiaomiao, has the issue been solved?
metrics-server needs two scrape cycles to provide metrics. If `kubectl top node` has no metrics after two scrape cycles, please continue to provide the logs of metrics-server.
@yangjunmyfm192085 No metrics have been provided yet, and I haven't investigated what happened internally.
I have an almost identical problem! I used the Helm chart to install metrics-server on one master and one worker, and `kubectl top node` doesn't work, just like for @bluemiaomiao. Setting `hostNetwork: true` in values.yaml works, but why doesn't metrics-server work on the CNI network?
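(For reference, with the metrics-server Helm chart that toggle lives in values.yaml; a minimal sketch, assuming the chart values that another comment later in this thread also uses:)

```yaml
# values.yaml sketch: run metrics-server on the host network
hostNetwork:
  enabled: true
```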
/assign @yangjunmyfm192085
/triage accepted
Hi @bluemiaomiao @masazumi9527, could you help provide more metrics-server logs? From the previous log, the metrics-server is working normally. It just looks like the `APIService` is not accessible:
```
Message: failing or missing response from https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1: Get "https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
```
@masazumi9527 Does `hostNetwork: true` solve your issue?
It does not work for me either. The metrics-server pods are running, I have set the `--kubelet-insecure-tls` flag, and the error `error: Metrics API not available` still shows when I run `kubectl top node` or `kubectl top pod`.

However, when I query `kubectl get --raw /api/v1/nodes/ip-172-31-7-243/proxy/metrics/resource`, it does show something like this:
```
# HELP container_cpu_usage_seconds_total [ALPHA] Cumulative cpu time consumed by the container in core-seconds
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-6p4n2"} 11.147192766 1691263859507
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-kd62l"} 10.973078388 1691263849797
...
```
Can you provide the logs of the metrics-server?
I also had the same problem: the default `containerPort` is wrong in the latest release (https://github.com/kubernetes-sigs/metrics-server/issues/1236). Try overriding it to `10250`.
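If you deploy via the Helm chart, a minimal values.yaml sketch for that override could look like this (assuming the chart's `containerPort` value, which another comment below also sets):

```yaml
# values.yaml sketch: override the container port back to the kubelet's 10250
containerPort: 10250
```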
I encountered the same issue. On a fresh install of Kubernetes, applying https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml or deploying with Helm with default values still gives these errors:
```
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: app.kubernetes.io/instance=metrics-server
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=metrics-server
app.kubernetes.io/version=0.6.4
helm.sh/chart=metrics-server-3.11.0
Annotations: meta.helm.sh/release-name: metrics-server
meta.helm.sh/release-namespace: default
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2023-09-22T20:01:52Z
Managed Fields:
API Version: apiregistration.k8s.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:meta.helm.sh/release-name:
f:meta.helm.sh/release-namespace:
f:labels:
.:
f:app.kubernetes.io/instance:
f:app.kubernetes.io/managed-by:
f:app.kubernetes.io/name:
f:app.kubernetes.io/version:
f:helm.sh/chart:
f:spec:
f:group:
f:groupPriorityMinimum:
f:insecureSkipTLSVerify:
f:service:
.:
f:name:
f:namespace:
f:port:
f:version:
f:versionPriority:
Manager: helm
Operation: Update
Time: 2023-09-22T20:01:52Z
API Version: apiregistration.k8s.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
f:conditions:
.:
k:{"type":"Available"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
Manager: kube-apiserver
Operation: Update
Subresource: status
Time: 2023-09-23T18:42:54Z
Resource Version: 283501
UID: 2739dbe3-a6b0-4e50-a91a-dc7497af7658
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: default
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2023-09-22T20:01:53Z
Message: failing or missing response from https://10.110.138.40:443/apis/metrics.k8s.io/v1beta1: Get "https://10.110.138.40:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
```
There is no error in the container:
```
I0922 20:02:29.383074 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0922 20:02:31.376931 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0922 20:02:31.376974 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0922 20:02:31.377020 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0922 20:02:31.377033 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0922 20:02:31.377063 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0922 20:02:31.377069 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0922 20:02:31.377415 1 secure_serving.go:267] Serving securely on :10250
I0922 20:02:31.377452 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0922 20:02:31.377869 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0922 20:02:31.378482 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0922 20:02:31.477787 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0922 20:02:31.477820 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0922 20:02:31.477907 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
Stream closed EOF for default/metrics-server-76c55fc4fc-5hdpv (metrics-server)
```
Did you check this step and edit the metrics-server file? Link: https://www.youtube.com/watch?v=0UDG52REs68
Hi @henzbnzr,
Thank you.
But first, I don't want to use the `--kubelet-insecure-tls` option.
Second, this option should go in the `args` section, not in the `command` one.
Third, even when trying this option, I still get the `unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication"` error.
The `extension-apiserver-authentication` config map exists:
```
NAMESPACE↑ NAME
default kube-root-ca.crt
kube-node-lease kube-root-ca.crt
kube-public cluster-info
kube-public kube-root-ca.crt
kube-system cilium-config
kube-system coredns
kube-system extension-apiserver-authentication
kube-system kube-apiserver-legacy-service-account-token-tracking
kube-system kube-proxy
kube-system kube-root-ca.crt
kube-system kubeadm-config
kube-system kubelet-config
```
However, I don't know if it's a problem, but the error says `request-header-client-ca-file` while the config map contains a `requestheader-client-ca-file` cert (the config map key is missing a dash between "request" and "header").
I also have the same problem, which is still unresolved. My configuration snippet is below: I added `hostNetwork: true` and also `--kubelet-insecure-tls`. The pod runs normally and no errors are reported; the log is below. However, when executing `kubectl top pod`, the error `error: Metrics API not available` is still reported. Please help me, thank you.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true
      containers:
```
```
[root@master metrics-server]# kubectl get pods -n kube-system
NAME                              READY   STATUS    RESTARTS       AGE
coredns-66f779496c-cmndx          1/1     Running   0              19d
coredns-66f779496c-ztmmb          1/1     Running   0              19d
etcd-master                       1/1     Running   10 (36d ago)   48d
kube-apiserver-master             1/1     Running   10 (36d ago)   48d
kube-controller-manager-master    1/1     Running   13 (19d ago)   48d
kube-proxy-4rdfh                  1/1     Running   10 (36d ago)   48d
kube-proxy-lv2gq                  1/1     Running   3 (19d ago)    48d
kube-proxy-pzskd                  1/1     Running   2 (36d ago)    47d
kube-scheduler-master             1/1     Running   12 (19d ago)   48d
metrics-server-59dc595f65-spbh7   1/1     Running   0              12m
```
```
[root@master metrics-server]# kubectl logs metrics-server-59dc595f65-spbh7 -n kube-system
I1011 13:35:27.396842 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1011 13:35:28.011295 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1011 13:35:28.011318 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1011 13:35:28.011389 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1011 13:35:28.011406 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1011 13:35:28.011424 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1011 13:35:28.011432 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1011 13:35:28.011509 1 secure_serving.go:267] Serving securely on [::]:4443
I1011 13:35:28.011544 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I1011 13:35:28.011988 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1011 13:35:28.012148 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I1011 13:35:28.111522 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1011 13:35:28.111650 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1011 13:35:28.111672 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
```
```
[root@master metrics-server]# kubectl get apiservices | grep metrics
v1beta1.metrics.k8s.io   kube-system/metrics-server   False (FailedDiscoveryCheck)   48s
```
```
[root@master metrics-server]# kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apiregistration.k8s.io/v1","kind":"APIService","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"v1beta1.metrics.k8s.io"},"spec":{"group":"metrics.k8s.io","groupPriorityMinimum":100,"insecureSkipTLSVerify":true,"service":{"name":"metrics-server","namespace":"kube-system"},"version":"v1beta1","versionPriority":100}}
  creationTimestamp: "2023-10-16T03:48:25Z"
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
  resourceVersion: "9122129"
  uid: df0b8054-7456-4158-8b96-78b1dad148d0
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:
```
```
[root@master metrics-server]# kubectl top node
error: Metrics API not available
```
This worked for me, thanks to @NileshGule:
1. [Deploy metrics server](https://gist.github.com/NileshGule/8f772cf04ea6ae9c76d3f3e9186165c2#deploy-metrics-server):
```shell
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
2. Open the file in editor mode:
```shell
$ k -n kube-system edit deploy metrics-server
```
3. Under the `containers` section, add only the `command` part:
```yaml
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  command:
  - /metrics-server
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
```
4. Check if the `metrics-server` pod is running now:
```shell
$ k -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-9d57d8f49-d26pd   1/1     Running   3          25h
canal-5xf7z                               2/2     Running   0          11m
canal-mgtxd                               2/2     Running   0          11m
coredns-7cbb7cccb8-gpnp5                  1/1     Running   0          25h
coredns-7cbb7cccb8-qqcs6                  1/1     Running   0          25h
etcd-controlplane                         1/1     Running   0          25h
kube-apiserver-controlplane               1/1     Running   2          25h
kube-controller-manager-controlplane      1/1     Running   2          25h
kube-proxy-mk759                          1/1     Running   0          25h
kube-proxy-wmp2n                          1/1     Running   0          25h
kube-scheduler-controlplane               1/1     Running   2          25h
metrics-server-678d4b775-gqb65            1/1     Running   0          48s
```
5. Now try the `top` command:
```shell
$ k top node
NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
controlplane   85m          8%     1211Mi          64%
node01         34m          3%     957Mi           50%
```
I am seeing the same issue on Kubernetes 1.27 and metrics-server 0.6.3. I have opened https://github.com/kubernetes-sigs/metrics-server/issues/1352 for it. I am able to run the `top node` and `top pod` commands.
I had a similar issue: metrics-server was up and running, but the `top` command was not working as expected; the error says `error: Metrics API not available`. This is on Kubernetes 1.28 with Calico as the pod network. My container runtime is CRI-O, and the cluster was installed by kubeadm (v1.28.2) on Ubuntu machines (5-node cluster):

Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.3
Calico: v3.26.1
metrics-server: v0.6.4

Since Calico is my CNI plugin, I just added the two lines below to my metrics-server deployment, with reference to https://datacenterdope.wordpress.com/2020/01/20/installing-kubernetes-metrics-server-with-kubeadm/:

- `--kubelet-insecure-tls` (in the `spec.containers.args` section)
- `hostNetwork: true` (at the pod `spec` level, next to `containers`)

After adding these two lines to the metrics-server deployment, the `top` command started working, because the metrics-server pod started communicating with the API server; otherwise you may end up seeing a failed readiness probe for the metrics-server deployment. A placement sketch follows below.
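For readers wanting to see exactly where those two lines land, here is a minimal placement sketch against the stock Deployment manifest (surrounding fields elided; an illustration, not the full manifest):

```yaml
# Sketch: the two additions described above, in context
spec:
  template:
    spec:
      hostNetwork: true            # pod-level field, a sibling of `containers`
      containers:
      - name: metrics-server
        args:
        - --kubelet-insecure-tls   # appended to the existing args
        # ...existing args unchanged...
```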
Thanks so much for this! It works for me!
IMHO, if you ended up adding `--kubelet-insecure-tls` to unblock your problem, that means you haven't resolved the root cause.
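For what it's worth, on kubeadm clusters the commonly cited root-cause fix is to have kubelets serve certificates signed by the cluster CA instead of their self-signed ones; a sketch, assuming a kubeadm-managed `kubelet-config` ConfigMap:

```yaml
# Sketch of the root-cause fix (kubeadm assumed): enable kubelet serving
# certificate bootstrapping so metrics-server can verify the kubelet's TLS
# certificate without --kubelet-insecure-tls.
# Edit with: kubectl -n kube-system edit configmap kubelet-config
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
# After restarting each kubelet, approve the pending kubelet-serving CSRs:
#   kubectl get csr
#   kubectl certificate approve <csr-name>
```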
Hi, I added a new worker to rke2 v1.26.11, but the metrics server is not working just for the new worker-03, as shown by the command below:
```
kubectl top nodes
NAME        CPU         CPU%        MEMORY      MEMORY%
worker-02   400m        25%         800Mi       37%
worker-03   <UNKNOWN>   <UNKNOWN>   <UNKNOWN>   <UNKNOWN>
```
Also, the result of the command below does not contain worker-03:
```shell
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
```
I also configured the metrics-server deployment as below:
```yaml
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=10250
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  command:
  - /metrics-server
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
  - --v=9
```
There is also this recurring log entry in the metrics-server pod:
```
I0108 14:09:01.842852 1 decode.go:86] "Failed getting complete node metric" node="worker-03" metric=&{StartTime:0001-01-01 00:00:00 +0000 UTC Timestamp:2024-01-08 14:08:59.764 +0000 UTC CumulativeCpuUsed:0 MemoryUsage:0}
```
So, please give me any advice.
I have the same error and got `Readiness probe failed: HTTP probe failed with statuscode: 500`.
`k version -o yaml`
This worked for me, `kustomization.yaml`:
```yaml
resources:
- https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/components.yaml
patches:
- target:
    kind: Deployment
    labelSelector: "k8s-app=metrics-server"
  patch: |-
    - op: replace
      path: /spec/template/spec/containers/0/args
      value:
      - --cert-dir=/tmp
      - --secure-port=4443
      - --kubelet-preferred-address-types=InternalIP
      - --kubelet-use-node-status-port
      - --metric-resolution=15s
      - --kubelet-insecure-tls
    - op: replace
      path: /spec/template/spec/containers/0/ports
      value:
      - containerPort: 4443
        name: https
        protocol: TCP
```
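(For anyone reusing this: a kustomization like the above is applied with `kubectl apply -k <dir>`, pointing at the directory containing `kustomization.yaml`.)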
I get the same problem, how to fix it?
I have solved that. I found that metrics-server was not on the master node. When I added the master node name in the YAML and restarted metrics-server, it worked:

```yaml
spec:
  nodeName: <your master node name>
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=10250
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    command:
    - /metrics-server
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP
```
This works for me. Thank you!
Thanks, wonderful. This worked for me to solve the problem.
Amazing solution, thanks for sharing. Working on Hetzner Cloud, Kubernetes 1.30.
What's the reason for this? Is there some documentation about why metrics-server needs to be on a control-plane node?
I struggled to solve the issue and documented the steps after finding the solution - https://computingforgeeks.com/fix-error-metrics-api-not-available-in-kubernetes/
With the Helm chart, this solved it:

```yaml
containerPort: 4443
hostNetwork:
  enabled: true
defaultArgs:
- --cert-dir=/tmp
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
- --secure-port=4443
- --kubelet-insecure-tls
```
Verify that your kube-api pods in kube-system are communicating properly with the API; mine weren't, due to proxy issues. I had to add `no_proxy` to `/etc/kubernetes/manifests/kube-apiserver.yaml` to connect properly to the APIs, and it fixed my issue; I added 10.0.0.0/8 to cover my services subnet.
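A sketch of what that can look like, assuming the proxy settings are injected as container environment variables in the static pod manifest (adjust the CIDRs and names to your cluster):

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt, sketch only)
spec:
  containers:
  - name: kube-apiserver
    env:
    - name: NO_PROXY
      value: "10.0.0.0/8,127.0.0.1,localhost"   # cover the services subnet
```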
I used the Helm chart (replicas == 3 to enable HA), and also used a nodeSelector with `node-role.kubernetes.io/control-plane: ""` since my masters are also workers, so the metrics-server pods are locked to control-plane nodes only. A values sketch follows below.
Hope this will help you.
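A values.yaml sketch of that setup (assuming the chart's standard `replicas` and `nodeSelector` values):

```yaml
# values.yaml sketch: 3 replicas for HA, pinned to control-plane nodes
replicas: 3
nodeSelector:
  node-role.kubernetes.io/control-plane: ""
```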
What happened:
What you expected to happen: Show metrics.
Anything else we need to know?: The latest version of the metrics-server YAML.
Environment:
Kubernetes distribution (GKE, EKS, Kubeadm, the hard way, etc.): Kubeadm on my local servers.
Container Network Setup (flannel, calico, etc.): Calico
Kubernetes version (use `kubectl version`):
Metrics server logs:
Status of Metrics API: