kubernetes-sigs / metrics-server

Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
Apache License 2.0
5.6k stars 1.84k forks

error: Metrics API not available #1282

Open bluemiaomiao opened 1 year ago

bluemiaomiao commented 1 year ago

What happened:

kubectl get pods --all-namespaces 
NAMESPACE     NAME                                       READY   STATUS             RESTARTS       AGE
kube-system   calico-kube-controllers-85578c44bf-526bd   1/1     Running            0              89m
kube-system   calico-node-4x7zk                          1/1     Running            0              80m
kube-system   calico-node-6bfnp                          1/1     Running            5 (84m ago)    119m
kube-system   calico-node-79tnt                          1/1     Running            0              71m
kube-system   calico-node-h99hx                          1/1     Running            0              82m
kube-system   calico-node-r4dk4                          1/1     Running            0              83m
kube-system   calico-typha-866bf4ccff-xb4kl              1/1     Running            0              89m
kube-system   coredns-5d78c9869d-gbhnw                   0/1     CrashLoopBackOff   39 (10s ago)   159m
kube-system   coredns-5d78c9869d-zklwl                   0/1     CrashLoopBackOff   39 (16s ago)   159m
kube-system   etcd-k0.xlab.io                            1/1     Running            2              159m
kube-system   kube-apiserver-k0.xlab.io                  1/1     Running            0              159m
kube-system   kube-controller-manager-k0.xlab.io         1/1     Running            0              159m
kube-system   kube-proxy-8wrl7                           1/1     Running            0              71m
kube-system   kube-proxy-9d5xs                           1/1     Running            0              82m
kube-system   kube-proxy-ksq4n                           1/1     Running            0              83m
kube-system   kube-proxy-r926v                           1/1     Running            0              159m
kube-system   kube-proxy-w954b                           1/1     Running            0              80m
kube-system   kube-scheduler-k0.xlab.io                  1/1     Running            0              159m
kube-system   metrics-server-7866664974-bzt4j            1/1     Running            0              2m29s
kubectl apply -f metrics-server.yaml
kubectl top node
error: Metrics API not available

What you expected to happen: Show metrics.

Anything else we need to know?: I am using the latest metrics-server YAML (included below).

Environment:

kubectl version -o yaml
clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T09:47:40Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls=true
        image: registry.k8s.io/metrics-server/metrics-server:v0.6.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <MyKey>
    server: https://k0.xlab.io:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: system:node:k0.xlab.io
  name: system:node:k0.xlab.io@kubernetes
current-context: system:node:k0.xlab.io@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:k0.xlab.io
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem

Name:         v1beta1.metrics.k8s.io
Namespace:    
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2023-07-01T16:17:04Z
  Resource Version:    20032
  UID:                 bb670fc2-666f-4617-ac5b-4405bbb2328c
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2023-07-01T16:17:04Z
    Message:               failing or missing response from https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1: Get "https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

/kind bug
bluemiaomiao commented 1 year ago

Help me pls!!

yangjunmyfm192085 commented 1 year ago

Hi @bluemiaomiao, has the issue been solved? metrics-server needs two scrape cycles before it can provide metrics. If kubectl top node still has no metrics after two scrape cycles, please continue and provide the logs of metrics-server.
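
For example (a sketch, assuming the default kube-system deployment from the manifest above):

# Check whether the aggregated API is registered and Available
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl describe apiservice v1beta1.metrics.k8s.io

# Collect the metrics-server logs
kubectl -n kube-system logs deploy/metrics-server --tail=100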

bluemiaomiao commented 1 year ago

@yangjunmyfm192085 No metrics are being provided yet, and I haven't investigated what is happening internally.

masazumi9527 commented 11 months ago

I have an almost identical problem! I used the Helm chart to install metrics-server on one master and one worker, and kubectl top node doesn't work, just like for @bluemiaomiao.

Setting hostnetwork: true in values.yaml helps, but why doesn't metrics-server work over the CNI (pod) network?

dashpole commented 11 months ago

/assign @yangjunmyfm192085
/triage accepted

yangjunmyfm192085 commented 11 months ago

Hi @bluemiaomiao, @masazumi9527, could you help provide more metrics-server logs? From the previous log, metrics-server itself is working normally; it just looks like the APIService is not accessible: Message: failing or missing response from https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1: Get "https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

@masazumi9527 Does hostnetwork: true solve your issue?
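
If hostNetwork: true makes it work, that usually points at the pod network path being blocked. One way to test whether the metrics-server Service is reachable over the pod network at all (a sketch; curl-test is just a throwaway pod name, and the ClusterIP from the APIService error above can be used instead of the DNS name) is:

kubectl run curl-test -it --rm --restart=Never --image=curlimages/curl \
  --command -- curl -vk https://metrics-server.kube-system.svc:443/livez
# e.g. curl -vk https://10.104.75.22:443/livez if cluster DNS is unhealthy

# A timeout here, while hostNetwork: true works, points at pod-to-pod /
# apiserver-to-pod connectivity (CNI routing or a firewall on the target port),
# not at metrics-server itself.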

aws-apradana commented 11 months ago

It does not work for me either. The metrics-server pods are running, I have set the --kubelet-insecure-tls flag, and the error error: Metrics API not available still shows when I do kubectl top node or kubectl top pod.

However, when I inquire using kubectl get --raw /api/v1/nodes/ip-172-31-7-243/proxy/metrics/resource, it does show something like this:

# HELP container_cpu_usage_seconds_total [ALPHA] Cumulative cpu time consumed by the container in core-seconds
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-6p4n2"} 11.147192766 1691263859507
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-kd62l"} 10.973078388 1691263849797
...
...
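
A complementary check (a sketch) is to query the aggregated Metrics API itself; if the kubelet endpoint above returns data but these calls fail, the problem sits between the kube-apiserver and the metrics-server Service rather than between metrics-server and the kubelets:

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods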
yangjunmyfm192085 commented 11 months ago

It does not work for me either. The metrics-server pods are running, I have set the --kubelet-insecure-tls flag, and the error error: Metrics API not available still shows when I do kubectl top node or kubectl top pod.

However, when I inquire using kubectl get --raw /api/v1/nodes/ip-172-31-7-243/proxy/metrics/resource, it does show something like this:

# HELP container_cpu_usage_seconds_total [ALPHA] Cumulative cpu time consumed by the container in core-seconds
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-6p4n2"} 11.147192766 1691263859507
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-kd62l"} 10.973078388 1691263849797
...
...

Can you provide the logs of the metrics-server?

Kikyo-chan commented 11 months ago

I also had the same problem (screenshots attached in the original issue).

henzbnzr commented 10 months ago

https://www.youtube.com/watch?v=0UDG52REs68

brosef commented 10 months ago

Default containerPort is wrong in the latest release - https://github.com/kubernetes-sigs/metrics-server/issues/1236. Try overriding that to 10250.
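
If you installed from the static manifest, that override can be applied with a JSON patch (a sketch, assuming the single-container layout of the default deployment):

kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/ports/0/containerPort","value":10250}]'

# Only do this when the container is actually listening on 10250 (see the
# "Serving securely on :10250" log line later in this thread); the Service and
# probes reference the named port "https", so they follow the new number automatically.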

LeoShivas commented 9 months ago

I encounter the same issue. Fresh install of Kubernetes; applying https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml or deploying with Helm with default values still gives these errors:

Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       app.kubernetes.io/instance=metrics-server
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=metrics-server
              app.kubernetes.io/version=0.6.4
              helm.sh/chart=metrics-server-3.11.0
Annotations:  meta.helm.sh/release-name: metrics-server
              meta.helm.sh/release-namespace: default
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2023-09-22T20:01:52Z
  Managed Fields:
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:meta.helm.sh/release-name:
          f:meta.helm.sh/release-namespace:
        f:labels:
          .:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
          f:helm.sh/chart:
      f:spec:
        f:group:
        f:groupPriorityMinimum:
        f:insecureSkipTLSVerify:
        f:service:
          .:
          f:name:
          f:namespace:
          f:port:
        f:version:
        f:versionPriority:
    Manager:      helm
    Operation:    Update
    Time:         2023-09-22T20:01:52Z
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .:
          k:{"type":"Available"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:reason:
            f:status:
            f:type:
    Manager:         kube-apiserver
    Operation:       Update
    Subresource:     status
    Time:            2023-09-23T18:42:54Z
  Resource Version:  283501
  UID:               2739dbe3-a6b0-4e50-a91a-dc7497af7658
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       default
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2023-09-22T20:01:53Z
    Message:               failing or missing response from https://10.110.138.40:443/apis/metrics.k8s.io/v1beta1: Get "https://10.110.138.40:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

There is no error in the container logs:

I0922 20:02:29.383074       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0922 20:02:31.376931       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0922 20:02:31.376974       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0922 20:02:31.377020       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0922 20:02:31.377033       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0922 20:02:31.377063       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0922 20:02:31.377069       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0922 20:02:31.377415       1 secure_serving.go:267] Serving securely on :10250
I0922 20:02:31.377452       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0922 20:02:31.377869       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0922 20:02:31.378482       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0922 20:02:31.477787       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0922 20:02:31.477820       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0922 20:02:31.477907       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
Stream closed EOF for default/metrics-server-76c55fc4fc-5hdpv (metrics-server)
henzbnzr commented 9 months ago

Did you check this step and edit the metrics-server file? Link: https://www.youtube.com/watch?v=0UDG52REs68

LeoShivas commented 9 months ago

Hi @henzbnzr,

Thank you.

But, first, I don't want to use the --kubelet-insecure-tls option.

Second, this option should go in the args section, not in the command one.

Third, even when trying this option, I still get the unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication" error.

The extension-apiserver-authentication config map exists:

NAMESPACE                                     NAME
default                                       kube-root-ca.crt                                     
kube-node-lease                               kube-root-ca.crt                                     
kube-public                                   cluster-info                                         
kube-public                                   kube-root-ca.crt                                     
kube-system                                   cilium-config                                        
kube-system                                   coredns                                              
kube-system                                   extension-apiserver-authentication                   
kube-system                                   kube-apiserver-legacy-service-account-token-tracking 
kube-system                                   kube-proxy                                           
kube-system                                   kube-root-ca.crt                                     
kube-system                                   kubeadm-config                                       
kube-system                                   kubelet-config                                       

However, I don't know if it's a problem, but the error message says request-header-client-ca-file while the config map contains a requestheader-client-ca-file cert (the dash placement differs between the two names).
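
To double-check the key is present (a sketch):

kubectl -n kube-system get configmap extension-apiserver-authentication -o yaml | grep requestheader

# The config map data key is requestheader-client-ca-file (as in the controller logs
# elsewhere in this thread); "request-header-client-ca-file" in the error appears to be
# just the option's display name, so the differing dash is unlikely to be the problem.
# The failing Get https://10.96.0.1:443/... part of the error suggests metrics-server
# could not reach the API server at all, which points at a connectivity issue rather
# than the config map itself.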

ata666 commented 9 months ago

I also have the same problem, which is still unresolved. My configuration snippet is below: I added hostNetwork: true and also added --kubelet-insecure-tls. The pod runs normally and no errors are reported (log below). However, when executing kubectl top pod, the error error: Metrics API not available is still reported. Please help me, thank you.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true
      containers:

[root@master metrics-server]# kubectl get pods -n kube-system
NAME                              READY   STATUS    RESTARTS       AGE
coredns-66f779496c-cmndx          1/1     Running   0              19d
coredns-66f779496c-ztmmb          1/1     Running   0              19d
etcd-master                       1/1     Running   10 (36d ago)   48d
kube-apiserver-master             1/1     Running   10 (36d ago)   48d
kube-controller-manager-master    1/1     Running   13 (19d ago)   48d
kube-proxy-4rdfh                  1/1     Running   10 (36d ago)   48d
kube-proxy-lv2gq                  1/1     Running   3 (19d ago)    48d
kube-proxy-pzskd                  1/1     Running   2 (36d ago)    47d
kube-scheduler-master             1/1     Running   12 (19d ago)   48d
metrics-server-59dc595f65-spbh7   1/1     Running   0              12m

[root@master metrics-server]# kubectl logs metrics-server-59dc595f65-spbh7 -n kube-system
I1011 13:35:27.396842       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1011 13:35:28.011295       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1011 13:35:28.011318       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1011 13:35:28.011389       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1011 13:35:28.011406       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1011 13:35:28.011424       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1011 13:35:28.011432       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1011 13:35:28.011509       1 secure_serving.go:267] Serving securely on [::]:4443
I1011 13:35:28.011544       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I1011 13:35:28.011988       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1011 13:35:28.012148       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I1011 13:35:28.111522       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1011 13:35:28.111650       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1011 13:35:28.111672       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file

[root@master metrics-server]# kubectl get apiservices | grep metrics
v1beta1.metrics.k8s.io   kube-system/metrics-server   False (FailedDiscoveryCheck)   48s

[root@master metrics-server]# kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apiregistration.k8s.io/v1","kind":"APIService","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"v1beta1.metrics.k8s.io"},"spec":{"group":"metrics.k8s.io","groupPriorityMinimum":100,"insecureSkipTLSVerify":true,"service":{"name":"metrics-server","namespace":"kube-system"},"version":"v1beta1","versionPriority":100}}
  creationTimestamp: "2023-10-16T03:48:25Z"
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
  resourceVersion: "9122129"
  uid: df0b8054-7456-4158-8b96-78b1dad148d0
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:

[root@master metrics-server]# kubectl top node
error: Metrics API not available

AymenFJA commented 8 months ago

This worked for me, thanks to @NileshGule:

  1. Deploy metric server ([deploy metrics server](https://gist.github.com/NileshGule/8f772cf04ea6ae9c76d3f3e9186165c2#deploy-metrics-server)):

     $ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

  2. Open the file in editor mode:

     $ k -n kube-system edit deploy metrics-server

  3. Under the containers section, add only the command part:

      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP

  4. Check if the metric-server is running now:

    $ k -n kube-system get pods
    NAME                                      READY   STATUS    RESTARTS   AGE
    calico-kube-controllers-9d57d8f49-d26pd   1/1     Running   3          25h
    canal-5xf7z                               2/2     Running   0          11m
    canal-mgtxd                               2/2     Running   0          11m
    coredns-7cbb7cccb8-gpnp5                  1/1     Running   0          25h
    coredns-7cbb7cccb8-qqcs6                  1/1     Running   0          25h
    etcd-controlplane                         1/1     Running   0          25h
    kube-apiserver-controlplane               1/1     Running   2          25h
    kube-controller-manager-controlplane      1/1     Running   2          25h
    kube-proxy-mk759                          1/1     Running   0          25h
    kube-proxy-wmp2n                          1/1     Running   0          25h
    kube-scheduler-controlplane               1/1     Running   2          25h
    metrics-server-678d4b775-gqb65            1/1     Running   0          48s

  5. Now try the top command:

    $ k top node
    NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    controlplane   85m          8%     1211Mi          64%
    node01         34m          3%     957Mi           50%
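
Note that when both command and args are set on a container, Kubernetes runs command followed by args as one command line, so the flags listed under command above are passed in addition to the existing args. A simpler equivalent (a sketch, assuming the default components.yaml) is to append the flag to the existing args list instead:

      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls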
MahiraTechnology commented 8 months ago

I am seeing the same issue on Kubernetes 1.27 with metrics-server 0.6.3. I have opened https://github.com/kubernetes-sigs/metrics-server/issues/1352 for it. I am able to run the top node and top pod commands.

jangalapallisr commented 8 months ago

I had a similar issue: metrics-server was up and running, but the top command was not working as expected and reported "error: Metrics API not available". This was on Kubernetes 1.28 with Calico as the pod network, CRI-O as the container runtime, and a 5-node cluster installed with kubeadm (v1.28.2) on Ubuntu machines. Client Version: v1.28.2, Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3, Server Version: v1.28.3, Calico: v3.26.1, metrics-server: v0.6.4. Since Calico is my CNI plugin, I just added the two lines below to my metrics-server deployment, with reference to https://datacenterdope.wordpress.com/2020/01/20/installing-kubernetes-metrics-server-with-kubeadm/

- --kubelet-insecure-tls (under spec.template.spec.containers[].args)
- hostNetwork: true (under spec.template.spec, i.e. the pod spec)

After adding these two lines to the metrics-server deployment, the top command started working, because the metrics-server pod could then communicate with the API server; otherwise the metrics-server deployment may keep failing its readiness probe.
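
Roughly, the edited deployment looks like this (a minimal sketch of the relevant fields only):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      hostNetwork: true              # pod-level field: use the node's network namespace
      containers:
      - name: metrics-server
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls     # skip verification of the kubelet serving certificate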

theten52 commented 7 months ago

This worked for me, thanks to @NileshGule:

  1. Deploy metric server:
[deploy metrics server](https://gist.github.com/NileshGule/8f772cf04ea6ae9c76d3f3e9186165c2#deploy-metrics-server)
  2. Open the file in editor mode:
k -n kube-system edit deploy metrics-server
  3. Under the containers section, add only the command part:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
  4. Check if the metric-server is running now:
k -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-9d57d8f49-d26pd   1/1     Running   3          25h
canal-5xf7z                               2/2     Running   0          11m
canal-mgtxd                               2/2     Running   0          11m
coredns-7cbb7cccb8-gpnp5                  1/1     Running   0          25h
coredns-7cbb7cccb8-qqcs6                  1/1     Running   0          25h
etcd-controlplane                         1/1     Running   0          25h
kube-apiserver-controlplane               1/1     Running   2          25h
kube-controller-manager-controlplane      1/1     Running   2          25h
kube-proxy-mk759                          1/1     Running   0          25h
kube-proxy-wmp2n                          1/1     Running   0          25h
kube-scheduler-controlplane               1/1     Running   2          25h
metrics-server-678d4b775-gqb65            1/1     Running   0          48s
  5. Now try the top command:
 k top node
controlplane $ k top node
NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
controlplane   85m          8%     1211Mi          64%       
node01         34m          3%     957Mi           50%     

Thanks so much for this! It works for me!

LeoShivas commented 7 months ago

IMHO, if you ended up adding --kubelet-insecure-tls to work around your problem, that means you haven't resolved the root cause.
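
One way to see what that flag is papering over (a sketch; replace <node-ip> with a real node address) is to look at the certificate the kubelet serves on port 10250 and check whether it is signed by the cluster CA and includes the node's addresses in its SANs:

openssl s_client -connect <node-ip>:10250 </dev/null 2>/dev/null \
  | openssl x509 -noout -text \
  | grep -E -A1 'Issuer:|Subject Alternative Name'

# If the certificate is self-signed or does not cover the address metrics-server uses,
# TLS verification fails unless --kubelet-insecure-tls is set; having the kubelet serve
# a certificate signed by the cluster CA is the usual root-cause fix.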

mehdi-aghayari commented 6 months ago

Hi, I added a new worker to rke2 v1.26.11, but the metrics server is not working only for the new worker-03 in the command below:

kubectl top nodes
NAME         CPU CPU% MEMORY MEMORY%
worker-02  400m 25% 800Mi 37%
worker-03   <UNKNOWN> <UNKNOWN> <UNKNOWN> <UNKNOWN>

Also, the result of the following command does not contain worker-03: kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

I also configured the metrics-server deployment as below:

      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        - --v=9

There is also this log in the metrics-server pod:

I0108 14:09:01.842852 1 decode.go:86] "Failed getting complete node metric" node="worker-03" metric=&{StartTime:0001-01-01 00:00:00 +0000 UTC Timestamp:2024-01-08 14:08:59.764 +0000 UTC CumulativeCpuUsed:0 MemoryUsage:0}

So, please let me know any advice you have.

(Also posted on Stack Overflow.)
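
A useful next check here (a sketch, using the node name from the output above) is to ask the kubelet on the new node for its resource metrics through the API server proxy; if this also returns nothing for worker-03, the problem is on that node's kubelet side rather than in metrics-server:

kubectl get --raw /api/v1/nodes/worker-03/proxy/metrics/resource | head -n 20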

Tusenka commented 5 months ago

I have the same error, and got Readiness probe failed: HTTP probe failed with statuscode: 500 (screenshots attached in the original issue, including k version -o yaml output).

davidwincent commented 3 months ago

This worked for me:

kustomization.yaml

resources:
  - https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/components.yaml
patches:
- target:
    kind: Deployment
    labelSelector: "k8s-app=metrics-server"
  patch: |-
    - op: replace
      path: /spec/template/spec/containers/0/args
      value:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
    - op: replace
      path: /spec/template/spec/containers/0/ports
      value:
        - containerPort: 4443
          name: https
          protocol: TCP
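
If it helps, the kustomization above can be applied in the usual way (a sketch, assuming it is saved as kustomization.yaml in the current directory):

# Preview the patched manifests first, then apply them
kubectl kustomize . | less
kubectl apply -k .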
HFourier commented 2 months ago

I have the same error, and got Readiness probe failed: HTTP probe failed with statuscode: 500 (screenshots attached in the original issue, including k version -o yaml output).

I get the same problem, how to fix it?

HFourier commented 2 months ago

I get the same problem, how to fix it?

I have solved it. I found that the metrics-server pod was not on the master node. When I added the master node name in the YAML and redeployed metrics-server, it worked.

spec:
  nodeName: <your master node name>
  containers:

tienhuynh17 commented 2 months ago

I get the same problem, how to fix it?

I have solved it. I found that the metrics-server pod was not on the master node. When I added the master node name in the YAML and redeployed metrics-server, it worked.

spec:
  nodeName: <your master node name>
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=10250
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    command:
    - /metrics-server
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP

This works for me. Thank you!

chenjilan123 commented 1 week ago

I get the same problem, how to fix it?

I have solved it. I found that the metrics-server pod was not on the master node. When I added the master node name in the YAML and redeployed metrics-server, it worked.

spec:
  nodeName: <your master node name>
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=10250
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    command:
    - /metrics-server
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP

This works for me. Thank you!

Thanks, wonderful. This worked for me and solved the problem.

edernucci commented 4 hours ago

I get the same problem, how to fix it?

I have solved it. I found that the metrics-server pod was not on the master node. When I added the master node name in the YAML and redeployed metrics-server, it worked.

spec:
  nodeName: <your master node name>
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=10250
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    command:
    - /metrics-server
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP

Amazing solution, thanks for sharing. Working on Hetzner Cloud, Kubernetes 1.30.