kubernetes-sigs / metrics-server

Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
Apache License 2.0
5.76k stars 1.86k forks source link

Using a different API endpoint - kubeconfig parameter ignored #661

Open Sapd opened 3 years ago

Sapd commented 3 years ago

Hello

My managed kubernetes provider requires, that pods use a specific k8s API endpoint. E.g. https://providerurl:6443 As such I created a configmap, mounted it in the metrics-server pod and - like in the documentation stated (via --help paramater) - I used the --kubeconfig= parameter. However it seems to be ignored.

(Full config in the manifest spoiler)

      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --kubeconfig=/etc/kubeconfig/kubeconfig.conf

As you can see in the log-message, it trys to connect to 10.96.0.1, which is not the API endpoint which I specified.

Error of metrics server (full log at the bottom spoiler): ``` Error: unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": x509: cannot validate certificate for 10.96.0.1 because it doesn't contain any IP SANs ```

Funnily, even when providing a wrong filename like --kubeconfig=foobar.conf it does not state, that the file is missing. That's why I think, that it is not even considered.


Other applications (like the kubernetes dashboard) accept the custom endpoint with the same method successfully.

Environment:

spoiler for Metrics Server manifest: ``` apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system --- apiVersion: v1 kind: ConfigMap metadata: name: custom-kubeconfig namespace: kube-system data: kubeconfig.conf: | apiVersion: v1 clusters: - cluster: certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt server: https://nameomitted:6443 name: default contexts: - context: cluster: default user: default name: default current-context: default kind: Config preferences: {} users: - name: default user: tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: "true" rbac.authorization.k8s.io/aggregate-to-edit: "true" rbac.authorization.k8s.io/aggregate-to-view: "true" name: system:aggregated-metrics-reader rules: - apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server name: system:metrics-server rules: - apiGroups: - "" resources: - pods - nodes - nodes/stats - namespaces - configmaps verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-reader subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server:system:auth-delegator roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:auth-delegator subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: system:metrics-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:metrics-server subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: v1 kind: Service metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: ports: - name: https port: 443 protocol: TCP targetPort: https selector: k8s-app: metrics-server --- apiVersion: apps/v1 kind: Deployment metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 0 template: metadata: labels: k8s-app: metrics-server spec: containers: - args: - --cert-dir=/tmp - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port - --kubeconfig=/etc/kubeconfig/kubeconfig.conf image: k8s.gcr.io/metrics-server/metrics-server:v0.4.1 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 4443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS periodSeconds: 10 securityContext: readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - name: custom-kubeconfig mountPath: /etc/kubeconfig - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server volumes: - name: custom-kubeconfig configMap: name: custom-kubeconfig - emptyDir: {} name: tmp-dir --- apiVersion: apiregistration.k8s.io/v1 kind: APIService metadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.io spec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 100 ```
spoiler for Metrics Server logs: ``` Error: unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": x509: cannot validate certificate for 10.96.0.1 because it doesn't contain any IP SANs Usage: [flags] Flags: --add_dir_header If true, adds the file directory to the header of the log messages --alsologtostderr log to standard error as well as files --authentication-kubeconfig string kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenreviews.authentication.k8s.io. --authentication-skip-lookup If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the cluster. --authentication-token-webhook-cache-ttl duration The duration to cache responses from the webhook token authenticator. (default 10s) --authentication-tolerate-lookup-failure If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous. --authorization-always-allow-paths strings A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server. --authorization-kubeconfig string kubeconfig file pointing at the 'core' kubernetes server with enough rights to create subjectaccessreviews.authorization.k8s.io. --authorization-webhook-cache-authorized-ttl duration The duration to cache 'authorized' responses from the webhook authorizer. (default 10s) --authorization-webhook-cache-unauthorized-ttl duration The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s) --bind-address ip The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank or an unspecified address (0.0.0.0 or ::), all interfaces will be used. (default 0.0.0.0) --cert-dir string The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "apiserver.local.config/certificates") --client-ca-file string If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate. --contention-profiling Enable lock contention profiling, if profiling is enabled -h, --help help for this command --http2-max-streams-per-connection int The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means to use golang's default. --kubeconfig string The path to the kubeconfig used to connect to the Kubernetes API server and the Kubelets (defaults to in-cluster config) --kubelet-certificate-authority string Path to the CA to use to validate the Kubelet's serving certificates. --kubelet-client-certificate string Path to a client cert file for TLS. --kubelet-client-key string Path to a client key file for TLS. --kubelet-insecure-tls Do not verify CA of serving certificates presented by Kubelets. For testing purposes only. --kubelet-port int The port to use to connect to Kubelets. (default 10250) --kubelet-preferred-address-types strings The priority of node address types to use when determining which address to use to connect to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP]) --kubelet-use-node-status-port Use the port in the node status. Takes precedence over --kubelet-port flag. --log-flush-frequency duration Maximum number of seconds between log flushes (default 5s) --log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0) --log_dir string If non-empty, write log files in this directory --log_file string If non-empty, use this log file --log_file_max_size uint Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800) --logtostderr log to standard error instead of files (default true) --metric-resolution duration The resolution at which metrics-server will retain metrics. (default 1m0s) --permit-port-sharing If true, SO_REUSEPORT will be used when binding the port, which allows more than one instance to bind on the same address and port. [default=false] --profiling Enable profiling via web interface host:port/debug/pprof/ (default true) --requestheader-allowed-names strings List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed. --requestheader-client-ca-file string Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests. --requestheader-extra-headers-prefix strings List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-]) --requestheader-group-headers strings List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group]) --requestheader-username-headers strings List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user]) --secure-port int The port on which to serve HTTPS with authentication and authorization. If 0, don't serve HTTPS at all. (default 443) --skip_headers If true, avoid header prefixes in the log messages --skip_log_headers If true, avoid headers when opening log files --stderrthreshold severity logs at or above this threshold go to stderr (default 2) --tls-cert-file string File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir. --tls-cipher-suites strings Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be used. Preferred values: TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256, TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256, TLS_RSA_WITH_3DES_EDE_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_256_GCM_SHA384. Insecure values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_RC4_128_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_RC4_128_SHA, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_RC4_128_SHA. --tls-min-version string Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13 --tls-private-key-file string File containing the default x509 private key matching --tls-cert-file. --tls-sni-cert-key namedCertKey A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. The domain patterns also allow IP addresses, but IPs should only be used if the apiserver has visibility to the IP address requested by a client. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default []) -v, --v Level number for the log level verbosity --version Show version --vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging panic: unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": x509: cannot validate certificate for 10.96.0.1 because it doesn't contain any IP SANs goroutine 1 [running]: main.main() /go/src/sigs.k8s.io/metrics-server/cmd/metrics-server/metrics-server.go:39 +0xfc ```
Name:         v1beta1.metrics.k8s.io
Namespace:    
Labels:       k8s-app=metrics-server
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"apiregistration.k8s.io/v1","kind":"APIService","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"v...
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2020-12-10T12:30:08Z
  Resource Version:    41804712
  Self Link:           /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  UID:                 60436ed4-4ab9-4d7f-b8d7-f78b5ac58f24
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2020-12-10T12:30:08Z
    Message:               endpoints for service/metrics-server in "kube-system" have no addresses with port name "https"
    Reason:                MissingEndpoints
    Status:                False
    Type:                  Available
Events:                    <none>

/kind bug

serathius commented 3 years ago

/triage accepted

serathius commented 3 years ago

Tried to reproduce the issue with --kubeconfig=foobar.conf, looks like it works as expected. Used image: k8s.gcr.io/metrics-server/metrics-server:v0.4.1

Logs

Error: unable to construct lister client config: stat foobar.conf: no such file or directory
....
serathius commented 3 years ago

Tried to use manifests provided by author of issue, got

I0221 18:00:47.126919       1 serving.go:325] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
E0221 18:00:47.580474       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://nameomitted:6443/api/v1/pods?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:47.580528       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://nameomitted:6443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:48.797669       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://nameomitted:6443/api/v1/pods?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:49.034148       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://nameomitted:6443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:51.315221       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://nameomitted:6443/api/v1/pods?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:51.608172       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://nameomitted:6443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:55.050419       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://nameomitted:6443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving
E0221 18:00:55.876104       1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://nameomitted:6443/api/v1/pods?limit=500&resourceVersion=0": dial tcp: lookup nameomitted on 10.96.0.10:53: server misbehaving

Which as expected failed to use invalid endpoint in kubeconfig

serathius commented 3 years ago

Don't see any traces of the problem described in issue. @Sapd can you provide more information about how exactly this happens? Are you sure you copied the correct manifests? and used exactly the same MS version?

Sapd commented 3 years ago

Tried to reproduce the issue with --kubeconfig=foobar.conf, looks like it works as expected. Used image: k8s.gcr.io/metrics-server/metrics-server:v0.4.1

Logs

Error: unable to construct lister client config: stat foobar.conf: no such file or directory
....

I think you propably get the error, because your API server is available under the default URL.

@Sapd can you provide more information about how exactly this happens? Are you sure you copied the correct manifests? and used exactly the same MS version?

Yes, I used the correct manifests, also double-checked by my collegues. The error happens, when the API server of the Kuberentes Cluster is not available on the default IP. But must be specified via kubeconfig.conf and server: server In the metrics server, there must be some place, which does not respect this config.

However from my side, the issue is resolved, as my provider now provides the API server also on the default IP - to ease configuration of some applications.

serathius commented 3 years ago

I think you propably get the error, because your API server is available under the default URL.

ok, I think I understand. In my case I deployed in cluster where default K8s credentials were pointing to correct Apiserver, that's why in my case checking request-header-client-ca-file succeeded. Thanks for explanation.

Great to hear that this no longer affects you, still I will try again to reproduce to make sure we fix it for other.

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

liangyuanpeng commented 3 years ago

I just found it.

Run command and just got the error.

go run .\cmd\metrics-server\metrics-server.go --kubeconfig D:\k8s\232\config
...

panic: failed to get delegated authentication kubeconfig: failed to get delegated authentication kubeconfig: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined

goroutine 1 [running]:
main.main()
        D:/repo/git/metrics-server/cmd/metrics-server/metrics-server.go:39 +0x105

It just ignore my kubeconfig when run options.go#ApiserverConfig()

o.Authentication.ApplyTo(&serverConfig.Authentication, serverConfig.SecureServing, nil)

And i providing a wrong filename like --kubeconfig=foobar.conf , got the same error.

Same problem when use --authentication-kubeconfig D:\k8s\232\config param.

liangyuanpeng commented 3 years ago

/remove-lifecycle stale

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

serathius commented 2 years ago

/lifecycle frozen

k8s-triage-robot commented 1 year ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

dgrisonnet commented 1 year ago

/assign @serathius /triage accepted

k8s-triage-robot commented 6 months ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted