kubernetes-sigs / metrics-server

Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
Apache License 2.0

Document securing connection between Metrics Server <-> Kubelet #576

Open thanos1983 opened 4 years ago

thanos1983 commented 4 years ago

What would you like to be added: Step-by-step instructions for beginners on how to configure TLS from Master to Workers

Why is this needed: All the tickets that I have found, e.g. "x509: certificate signed by unknown authority metrics-server" or "Metrics server issue with hostname resolution of kubelet and apiserver unable to communicate with metric-server clusterIP #131", use the --kubelet-insecure-tls flag.

I have spent 2 days now trying to figure out how to set it up, with no luck so far.

I think a tutorial with step-by-step instructions would be a good addition.

/kind feature

MatthewPattell commented 3 years ago

Same question! I tried this and this.

I spent 3 days. Result:

 x509: certificate signed by unknown authority
thanos1983 commented 3 years ago

Hello @MatthewPattell ,

I downloaded the latest patch and the problem seems to be fixed. I am also running Calico as the network plugin (I do not know whether this affects the solution, but keep it in mind).

Give it a try and let us know if this works for you as well.

BR / Thanos

MatthewPattell commented 3 years ago

@thanos1983 could you explain more about how to download the latest patch? I tried the container image gcr.io/k8s-staging-metrics-server/metrics-server:master, but it is still not working :( Part of my deployment:

      volumes:
        - name: tmp-dir
          emptyDir: {}
        - configMap:
            defaultMode: 420
            name: ca-certs
          name: ca-dir
      containers:
        - name: metrics-server
          image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
          imagePullPolicy: IfNotPresent
          args:
            - --cert-dir=/tmp
            - --secure-port=4443
#            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
            - --kubelet-certificate-authority=/ca/ca.crt
            - --tls-cert-file=/ca/apiserver.crt
            - --tls-private-key-file=/ca/apiserver.key
          ports:
            - name: main-port
              containerPort: 4443
              protocol: TCP
          securityContext:
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - name: tmp-dir
              mountPath: /tmp
            - mountPath: /ca
              name: ca-dir

My logs:

server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node kubernetes.sample.com: unable to fetch metrics from node kubernetes.sample.com: Get "https://kubernetes.sample.com:10250/stats/summary?only_cpu_and_memory=true": x509: certificate signed by unknown authority, unable to fully scrape metrics from node master-kubernetes.sample.com: unable to fetch metrics from node master-kubernetes.sample.com: Get "https://master-kubernetes.sample.com:10250/stats/summary?only_cpu_and_memory=true": x509: certificate signed by unknown authority]
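
One way to narrow down an error like the one above might be to look at who issued the certificate the kubelet actually serves on port 10250. The following is only a minimal sketch, assuming `openssl` is installed; the function name `cert_is_self_signed` is made up for illustration, and comparing subject and issuer is a heuristic, not a full signature check.

```shell
# Hypothetical helper (not part of metrics-server or kubectl).
# Reads PEM certificate data on stdin, prints subject and issuer, and
# returns 0 when both are identical, i.e. the certificate looks self-signed.
cert_is_self_signed() {
  local pem subject issuer
  pem="$(cat)"
  subject="$(printf '%s\n' "$pem" | openssl x509 -noout -subject 2>/dev/null | sed 's/^subject= *//')"
  issuer="$(printf '%s\n' "$pem" | openssl x509 -noout -issuer 2>/dev/null | sed 's/^issuer= *//')"
  echo "subject: $subject"
  echo "issuer:  $issuer"
  # An empty subject means no certificate could be parsed at all.
  [ -n "$subject" ] && [ "$subject" = "$issuer" ]
}

# Possible usage against a node from the log above (needs network access):
#   openssl s_client -connect kubernetes.sample.com:10250 </dev/null 2>/dev/null \
#     | cert_is_self_signed && echo "kubelet serves a self-signed certificate"
```

If the issuer is not the CA passed via --kubelet-certificate-authority, the "unknown authority" error would be expected regardless of the Metrics Server version.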
thanos1983 commented 3 years ago

Hello @MatthewPattell ,

The new version 0.3.7 seems to be working fine for me out of the box. I do not need to pass all those parameters:

args:
  - --cert-dir=/tmp
  - --secure-port=4443
  # - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
  - --kubelet-certificate-authority=/ca/ca.crt
  - --tls-cert-file=/ca/apiserver.crt
  - --tls-private-key-file=/ca/apiserver.key
volumeMounts:
  - mountPath: /ca # also this
    name: ca-dir

My file is exactly as it comes in the default sample:

args:
  - --cert-dir=/tmp
  - --secure-port=4443

What version of kubectl are you running?

BR / Thanos

MatthewPattell commented 3 years ago

@thanos1983 my kubernetes version:

kubelet --version
Kubernetes v1.19.3

kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:12:48Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
thanos1983 commented 3 years ago

I am not experienced enough with Kubernetes to give advice, but I am not sure whether you should be running different client and server versions. Take a look at this; it might become a problem in the future.

Regarding the image, did it work for you after the proposed configuration?

MatthewPattell commented 3 years ago

@thanos1983 I do not think the different client version is the problem. I tried the proposed configuration, and it does not work for me :(

serathius commented 3 years ago

TLS configuration depends on how the Kubernetes distribution you're using has set its defaults and which options you have overridden (the K8s version and the Metrics Server version have no impact). Some distributions use self-signed certificates in the Kubelet, some use a separate CA from the apiserver, and on some TLS for Metrics Server works out of the box.

I don't think the Metrics Server documentation can do anything better than asking its users to understand how the CA is configured in their cluster and adapt their configuration accordingly. Trying to document how to fix those problems would require separate documentation per K8s distribution, which would not be maintainable.

Currently we try to list the requirements that users should look into at https://github.com/kubernetes-sigs/metrics-server#requirements. Users should consult the documentation of their K8s distribution to find out how to configure their cluster to fulfill those requirements.

Please let me know if you have any ideas on how we can improve it and please let's not do debugging in feature request issues.

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

serathius commented 3 years ago

/remove-lifecycle stale

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

serathius commented 3 years ago

/remove-lifecycle stale

serathius commented 3 years ago

/remove-lifecycle frozen

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

serathius commented 2 years ago

/remove-lifecycle stale
/lifecycle frozen

avoidik commented 2 years ago

I would like to know this as well. I'm wondering whether a clean approach is even possible: the kubelet-generated certificates have 0600 permissions, so only the user running the kubelet daemon is able to read them.

https://github.com/kubernetes/kubernetes/blob/v1.23.1/staging/src/k8s.io/client-go/util/certificate/certificate_store.go#L196

serathius commented 2 years ago

@avoidik that's a good question, I think it would be good to create a list of steps required to secure Metrics Server and verify it on some popular K8s distro.

cc @yangjunmyfm192085 @dgrisonnet

In a proper K8s setup, the certificates served by the Kubelet are signed by the cluster's main CA. Metrics Server doesn't need access to them. When creating a TLS connection to the Kubelet, it should be able to confirm that the served certificates are properly signed.
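
To illustrate what "properly signed" means here, the following is a rough, self-contained sketch that uses only throwaway material generated with `openssl`. Nothing in it touches a real cluster; the helper names `make_demo_pki` and `signed_by`, the file names, and the certificate subjects are all made up for illustration.

```shell
# Hypothetical helper: create a throwaway CA, a CA-signed stand-in for a
# kubelet serving certificate, and a self-signed certificate in directory $1.
make_demo_pki() {
  local dir="$1"
  mkdir -p "$dir" || return 1
  # Stand-in for the cluster CA.
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=demo-cluster-ca" \
    -keyout "$dir/ca.key" -out "$dir/ca.crt" 2>/dev/null || return 1
  # Stand-in for a kubelet serving certificate, signed by that CA.
  openssl req -newkey rsa:2048 -nodes \
    -subj "/CN=system:node:demo-node" \
    -keyout "$dir/kubelet.key" -out "$dir/kubelet.csr" 2>/dev/null || return 1
  openssl x509 -req -in "$dir/kubelet.csr" -days 1 \
    -CA "$dir/ca.crt" -CAkey "$dir/ca.key" -CAcreateserial \
    -out "$dir/kubelet.crt" 2>/dev/null || return 1
  # Stand-in for a certificate the kubelet generated and signed itself.
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=demo-node-self-signed" \
    -keyout "$dir/selfsigned.key" -out "$dir/selfsigned.crt" 2>/dev/null
}

# Returns 0 when certificate $2 chains to the CA bundle $1.
signed_by() {
  openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

demo="$(mktemp -d)"
make_demo_pki "$demo"
signed_by "$demo/ca.crt" "$demo/kubelet.crt"    && echo "kubelet.crt: trusted by ca.crt"
signed_by "$demo/ca.crt" "$demo/selfsigned.crt" || echo "selfsigned.crt: unknown authority"
```

The client only ever needs ca.crt; the private keys stay where they were generated, which is why file permissions on the kubelet's key should not matter to Metrics Server.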

avoidik commented 2 years ago

I just wanted to reuse the same certificates specifically issued for/by kubelet, but it seems I would need to have another pair of certificates for metrics-server

yangjunmyfm192085 commented 2 years ago

Generally, we do not need to add flags manually to configure the certificate that metrics-server requires to access the kubelet. As @serathius said, in a proper K8s setup the certificates served by the Kubelet are signed by the cluster's main CA, and Metrics Server doesn't need access to them. When creating a TLS connection to the Kubelet, it should be able to confirm that the served certificates are properly signed. But if you do want to configure the certificate manually, you run into the issue you described. My question is: do you really need to configure the certificate manually? It is not recommended.

/cc @sanwishe, can you help analyze this? If the metrics-server pod accesses the certificate, what permissions are required?

sanwishe commented 2 years ago

OK, I am working on this.

mprimeaux commented 1 year ago

Has any progress been made, here or elsewhere, since @sanwishe commented on January 19?

shellwhale commented 1 year ago

I cannot make this work myself without using --kubelet-insecure-tls

This is what shows up in the API server logs.

2023-04-16 21:18:10.865Z E0416 21:18:10.865645       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: error trying to reach service: x509: certificate signed by unknown authority

I created a root certificate authority and placed its certificate in /etc/kubernetes/pki/ca.crt and its key in /etc/kubernetes/pki/ca.key

I then run kubeadm init --config kubeadm-init.yml, with the following kubeadm-init.yml:

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
bootstrapTokens:
  - token: "<my token>"
    description: "kubeadm bootstrap token"
    ttl: "1h"
    groups: 
      - system:bootstrappers:kubeadm:default-node-token
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.26.0
certificatesDir: /etc/kubernetes/pki
networking:
  podSubnet: 10.244.0.0/16
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true

I created a metrics-server.crt and metrics-server.key that I mount on the metrics server pod, and I run it with the following arguments. Note that --cert-dir=/tmp and --kubelet-insecure-tls are commented out.

      - args:
        # - --cert-dir=/tmp
        - --client-ca-file=/etc/kubernetes/pki/ca.crt
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        # - --kubelet-insecure-tls
        - --kubelet-certificate-authority=/etc/kubernetes/pki/ca.crt
        - --tls-cert-file=/etc/kubernetes/pki/metrics-server.crt
        - --tls-private-key-file=/etc/kubernetes/pki/metrics-server.key

Here is the metrics-server csr configuration (cfssl) :

{
    "CN": "metrics-server",
    "hosts": [
        "metrics-server.kube-system",
        "metrics-server.kube-system.svc",
        "metrics-server.kube-system.svc.cluster.local",
        "localhost"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    }
}

I don't understand why I'm getting Body: error trying to reach service: x509: certificate signed by unknown authority in the API server. I don't get this error when running certigo (after installing the certificate to /usr/local/share/ca-certificates/):

./certigo connect metrics-server.kube-system.svc.cluster.local:443 # no issues with this

Which process is reaching the metrics server? The API server directly? If so, which certificate is it complaining about? The one sent by the metrics server to the API server when it acts as a client? Or the other way around? We need docs about this, maybe a network diagram that shows which certificate is used on each connection.
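
On the question of which connection fails: the "error trying to reach service" message is logged by the API server's aggregation layer, which connects to the metrics-server Service and checks its serving certificate against the caBundle of the matching APIService object, unless insecureSkipTLSVerify is set there. A minimal sketch for checking which of the two is configured, assuming kubectl access to the cluster; the function name `apiservice_tls_mode` is made up for illustration:

```shell
# Hypothetical helper: print how the aggregation layer is told to verify the
# backend of an APIService (defaults to the metrics API from the log above).
apiservice_tls_mode() {
  local name="${1:-v1beta1.metrics.k8s.io}" skip bundle
  skip="$(kubectl get apiservice "$name" -o jsonpath='{.spec.insecureSkipTLSVerify}')" || return 1
  bundle="$(kubectl get apiservice "$name" -o jsonpath='{.spec.caBundle}')" || return 1
  if [ "$skip" = "true" ]; then
    echo "insecureSkipTLSVerify"   # serving certificate is not verified
  elif [ -n "$bundle" ]; then
    echo "caBundle"                # verified against the embedded CA bundle
  else
    echo "unset"                   # neither field is set on the APIService
  fi
}

# Possible usage:
#   apiservice_tls_mode
#   apiservice_tls_mode v1beta1.metrics.k8s.io
```

If a custom --tls-cert-file is used, the CA that signed it would presumably have to be the one in caBundle; that is an inference from how the aggregation layer works, not something verified on this cluster.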

5n00p4eg commented 9 months ago

Hi there. I want to share my story. TL;DR: I did it xD

I have a bare-metal cluster running k8s v1.28.2. It was set up using kubeadm. My goal was to set up MetalLB, which requires the metrics API on k8s. I found that there are two main solutions, MS and Prometheus-adapter, and I chose the first as the simpler one.

After applying the Helm chart I saw the error mentioned above. After trying some random things, I wanted to check the HTTPS certificates myself and verify them.

I tried to verify the kubelet HTTPS server certificate against the CA taken from the ConfigMap:

kubectl get cm -n kube-system extension-apiserver-authentication -o json | jq -r ".data[\"client-ca-file\"]" | openssl x509 > ../client-ca.pem

openssl verify -verbose -CAfile client-ca.pem  master-4-cluster1682686205-chain.pem
verification failed

Then I found this

A kubelet also can use serving certificates. The kubelet itself exposes an https endpoint for certain features. To secure these, the kubelet can do one of:

  • use provided key and certificate, via the --tls-private-key-file and --tls-cert-file flags
  • create self-signed key and certificate, if a key and certificate are not provided
  • request serving certificates from the cluster server, via the CSR API

The client certificate provided by TLS bootstrapping is signed, by default, for client auth only, and thus cannot be used as serving certificates, or server auth.

However, you can enable its server certificate, at least partially, via certificate rotation.

So I updated the kubelet ConfigMap with kubectl edit cm -n kube-system kubelet-config:

...
serverTLSBootstrap: true
...

and updated all the nodes with:

sudo kubeadm upgrade node phase kubelet-config
sudo systemctl restart kubelet.service

After that I found new certificate requests with k get csr -n kube-system

I approved all of them using the kubectl certificate approve ... command, and then...

...magic happened: MS reached the node metrics endpoints, whose TLS certificates are now signed by the CA that MS took from the extension-apiserver-authentication config.
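
For anyone repeating the approval step, here is a small sketch of how the pending kubelet serving requests could be listed before approving them. It assumes `jq` is installed (as in the command earlier in this comment); the function name `pending_kubelet_serving_csrs` is made up for illustration.

```shell
# Hypothetical helper: reads the output of `kubectl get csr -o json` on stdin
# and prints the names of requests for the kubernetes.io/kubelet-serving
# signer that have no condition (Approved/Denied/Failed) recorded yet.
pending_kubelet_serving_csrs() {
  jq -r '
    .items[]
    | select(.spec.signerName == "kubernetes.io/kubelet-serving")
    | select(((.status.conditions // []) | length) == 0)
    | .metadata.name
  '
}

# Possible usage:
#   kubectl get csr -o json | pending_kubelet_serving_csrs
#   kubectl certificate approve <name>   # after reviewing each request
```

Approving requests without checking which node they come from weakens the point of the exercise, so it is probably worth looking at each one with kubectl describe csr first.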