Closed fosiul closed 3 years ago
This issue was resolved by changing

```
Liveness:  http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:https/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
```

to:

```
Liveness:  http-get https://:https/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:https/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
```
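For anyone hitting the same thing: in Deployment terms the working configuration corresponds to something like the sketch below. This is a sketch only — v0.3.x serves its health check on `/healthz`, while `/livez` and `/readyz` were only introduced in metrics-server v0.4.0, which is why the original probes fail against v0.3.6. The named port `https` is taken from the probe output in this issue.

```yaml
# metrics-server Deployment probe section (sketch for v0.3.6):
# v0.3.x only serves /healthz; /livez and /readyz exist from v0.4.0 on.
livenessProbe:
  httpGet:
    path: /healthz   # was /livez
    port: https
    scheme: HTTPS
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /healthz   # was /readyz
    port: https
    scheme: HTTPS
  periodSeconds: 10
  failureThreshold: 3
```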
But when using metrics-server:v0.3.6, shouldn't it use /healthz by default? Is there any reason I need to change this manually?
The official manifest file is here: https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml. I think it is a bit different from what you are using.
@fosiul Is it solved using the official manifests?
MS 0.3.x is no longer supported.
What happened:
Every time I install Kubernetes with RKE, everything works except metrics-server, which goes into "CrashLoopBackOff". I have created the cluster at least 10 times in 2 different environments; there are no network issues and no iptables issues.
From Google searches, people suggest I need to add:

```
Command: /metrics-server --kubelet-insecure-tls --kubelet-preferred-address-types=InternalIP
```

But my questions are: 1) When I am using rke, why does it not add this by default (if this is the real issue)? 2) What do I need to do to add it?
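One way to add those flags without hand-editing the Deployment (which the next `rke up` re-applies and can revert) might be the `monitoring` section of cluster.yml. This is a hedged sketch, assuming your RKE version passes `monitoring.options` through to the metrics-server container as `--key=value` flags; I have not verified this against RKE v1.2.8:

```yaml
# cluster.yml fragment (sketch, unverified): assumes RKE renders
# monitoring.options into metrics-server command-line flags.
monitoring:
  provider: metrics-server
  options:
    kubelet-insecure-tls: "true"
    kubelet-preferred-address-types: InternalIP
```

If your RKE version does not support this, patching the Deployment directly works as a stopgap, but expect `rke up` to overwrite manual edits.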
What you expected to happen: metrics-server should be in the Running state.
Anything else we need to know?:
Environment:

```
[rke@rke19-master1 ~]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.10", GitCommit:"98d5dc5d36d34a7ee13368a7893dcb400ec4e566", GitTreeState:"clean", BuildDate:"2021-04-15T03:28:42Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.10", GitCommit:"98d5dc5d36d34a7ee13368a7893dcb400ec4e566", GitTreeState:"clean", BuildDate:"2021-04-15T03:20:25Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
[rke@rke19-master1 ~]$ cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
[rke@rke19-master1 ~]$ rke version
INFO[0000] Running RKE version: v1.2.8
[rke@rke19-master1 ~]$
```
```
[rke@rke19-master1 ~]$ cat cluster.yml
nodes:
- address: 192.168.0.56
  port: "22"
  role:
- address: 192.168.0.57
  port: "22"
  role:
- address: 192.168.0.58
  port: "22"
  role:
- address: 192.168.0.59
  port: "22"
  role:
- address: 192.168.0.60
  port: "22"
  role:
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args:
      node-monitor-period: '5s'
      node-monitor-grace-period: '20s'
      node-startup-grace-period: '30s'
      pod-eviction-timeout: '1m'
      concurrent-deployment-syncs: 5
      concurrent-endpoint-syncs: 5
      concurrent-gc-syncs: 20
      concurrent-namespace-syncs: 10
      concurrent-replicaset-syncs: 5
      concurrent-service-syncs: 1
      concurrent-serviceaccount-token-syncs: 5
      deployment-controller-sync-period: 30s
      pvclaimbinder-sync-period: 15s
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args:
      enforce-node-allocatable: 'pods'
      system-reserved: 'cpu=1,memory=1024Mi'
      kube-reserved: 'cpu=1,memory=2024Mi'
      eviction-hard: 'memory.available<500Mi,nodefs.available<10%,imagefs.available<15%,nodefs.inodesFree<5%'
      eviction-max-pod-grace-period: '30'
      eviction-pressure-transition-period: '30s'
      node-status-update-frequency: 10s
      global-housekeeping-interval: 1m0s
      housekeeping-interval: 10s
      runtime-request-timeout: 2m0s
      volume-stats-agg-period: 1m0s
    extra_binds: []
    extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: canal
  options: {}
  mtu: 0
  node_selector: {}
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include: []
system_images:
  etcd: 192.168.0.35:5000/rancher/coreos-etcd:v3.4.13-rancher1
  alpine: 192.168.0.35:5000/rancher/rke-tools:v0.1.68
  nginx_proxy: 192.168.0.35:5000/rancher/rke-tools:v0.1.68
  cert_downloader: 192.168.0.35:5000/rancher/rke-tools:v0.1.68
  kubernetes_services_sidecar: 192.168.0.35:5000/rancher/rke-tools:v0.1.68
  kubedns: 192.168.0.35:5000/rancher/k8s-dns-kube-dns:1.15.10
  dnsmasq: 192.168.0.35:5000/rancher/k8s-dns-dnsmasq-nanny:1.15.10
  kubedns_sidecar: 192.168.0.35:5000/rancher/k8s-dns-sidecar:1.15.10
  kubedns_autoscaler: 192.168.0.35:5000/rancher/cluster-proportional-autoscaler:1.8.1
  coredns: 192.168.0.35:5000/rancher/coredns-coredns:1.7.0
  coredns_autoscaler: 192.168.0.35:5000/rancher/cluster-proportional-autoscaler:1.8.1
  nodelocal: 192.168.0.35:5000/rancher/k8s-dns-node-cache:1.15.13
  kubernetes: 192.168.0.35:5000/rancher/hyperkube:v1.19.10-rancher1
  flannel: 192.168.0.35:5000/rancher/coreos-flannel:v0.13.0-rancher1
  flannel_cni: 192.168.0.35:5000/rancher/flannel-cni:v0.3.0-rancher6
  calico_node: 192.168.0.35:5000/rancher/calico-node:v3.16.5
  calico_cni: 192.168.0.35:5000/rancher/calico-cni:v3.16.5
  calico_controllers: 192.168.0.35:5000/rancher/calico-kube-controllers:v3.16.5
  calico_ctl: 192.168.0.35:5000/rancher/calico-ctl:v3.16.5
  calico_flexvol: 192.168.0.35:5000/rancher/calico-pod2daemon-flexvol:v3.16.5
  canal_node: 192.168.0.35:5000/rancher/calico-node:v3.16.5
  canal_cni: 192.168.0.35:5000/rancher/calico-cni:v3.16.5
  canal_controllers: 192.168.0.35:5000/rancher/calico-kube-controllers:v3.16.5
  canal_flannel: 192.168.0.35:5000/rancher/coreos-flannel:v0.13.0-rancher1
  canal_flexvol: 192.168.0.35:5000/rancher/calico-pod2daemon-flexvol:v3.16.5
  weave_node: 192.168.0.35:5000/weaveworks/weave-kube:2.7.0
  weave_cni: 192.168.0.35:5000/weaveworks/weave-npc:2.7.0
  pod_infra_container: 192.168.0.35:5000/rancher/pause:3.2
  ingress: 192.168.0.35:5000/rancher/nginx-ingress-controller:nginx-0.35.0-rancher2
  ingress_backend: 192.168.0.35:5000/rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: 192.168.0.35:5000/rancher/metrics-server:v0.3.6
  windows_pod_infra_container: 192.168.0.35:5000/rancher/kubelet-pause:v0.1.4
  aci_cni_deploy_container: 192.168.0.35:5000/noiro/cnideploy:5.1.1.0.1ae238a
  aci_host_container: 192.168.0.35:5000/noiro/aci-containers-host:5.1.1.0.1ae238a
  aci_opflex_container: 192.168.0.35:5000/noiro/opflex:5.1.1.0.1ae238a
  aci_mcast_container: 192.168.0.35:5000/noiro/opflex:5.1.1.0.1ae238a
  aci_ovs_container: 192.168.0.35:5000/noiro/openvswitch:5.1.1.0.1ae238a
  aci_controller_container: 192.168.0.35:5000/noiro/aci-containers-controller:5.1.1.0.1ae238a
  aci_gbp_server_container: 192.168.0.35:5000/noiro/gbp-server:5.1.1.0.1ae238a
  aci_opflex_server_container: 192.168.0.35:5000/noiro/opflex-server:5.1.1.0.1ae238a
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries:
ingress:
  dns_policy: ClusterFirst
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
  update_strategy: null
cluster_name: ""
prefix_path: ""
addon_job_timeout: 120
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
  node_selector: {}
restore:
  restore: false
  snapshot_name: ""
dns: null
```
spoiler for Metrics Server manifest:
kubectl describe pods metrics-server-5b6d79d4f4-ggl57 -n kube-system

```
Name:                 metrics-server-5b6d79d4f4-ggl57
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 192.168.0.59/192.168.0.59
Start Time:           Tue, 17 Aug 2021 00:00:43 +0100
Labels:               k8s-app=metrics-server
                      pod-template-hash=5b6d79d4f4
Annotations:          cni.projectcalico.org/podIP: 10.42.4.3/32
                      cni.projectcalico.org/podIPs: 10.42.4.3/32
Status:               Running
IP:                   10.42.4.3
IPs:
  IP:           10.42.4.3
Controlled By:  ReplicaSet/metrics-server-5b6d79d4f4
Containers:
  metrics-server:
    Container ID:  docker://3d3fdc3746637ffff60afd26f711a44a5ead8d4e4156741e4267b207fc04b08a
    Image:         192.168.0.35:5000/rancher/metrics-server:v0.3.6
    Image ID:      docker-pullable://192.168.0.35:5000/rancher/metrics-server@sha256:c9c4e95068b51d6b33a9dccc61875df07dc650abbf4ac1a19d58b4628f89288b
    Port:          4443/TCP
    Host Port:     0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-insecure-tls
      --kubelet-preferred-address-types=InternalIP
      --logtostderr
    State:          Running
      Started:      Tue, 17 Aug 2021 07:31:36 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Tue, 17 Aug 2021 07:25:58 +0100
      Finished:     Tue, 17 Aug 2021 07:26:27 +0100
    Ready:          False
    Restart Count:  152
    Liveness:       http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get https://:https/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
```

spoiler for Kubelet config:
[rke@rke19-master1 ~]$ cat kube_config_cluster.yml
apiVersion: v1
kind: Config
clusters:
- cluster:
    api-version: v1
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN3akNDQWFxZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFTTVJBd0RnWURWUVFERXdkcmRXSmwKTFdOaE1CNFhEVEl4TURneE5qSXlORGN6T1ZvWERUTXhNRGd4TkRJeU5EY3pPVm93RWpFUU1BNEdBMVVFQXhNSAphM1ZpWlMxallUQ0NBU0l3RFFZSktvWklodmNOQVFFQkJRQURnZ0VQQURDQ0FRb0NnZ0VCQU1oMFdYeFNoVmtjCjJEaEl2T1lNdm16UklFSGtxRGh2dnJJeEk2S1RoNGdvRkVtZUE4TnMzMTJoU1g4SkxOK2huSlI1UGhvT1g5eTcKUzczTUhlR2l4MktaL1lodFBzRmdxTFQ2NjE3T1RwcHRuZFhvQXlTRWROODV0MDg2MVhCNnRNdHhpc3QrVWtBdQpZWkNQWmtibGNYcHJRWEZHT044WklteHQ2TWltdyswOTFkd1FNMWh4MmgwdzljcExzaXVPS1VHWEh1NDNITXpqClFLZlZJZGRZZjJudmxCdzV1a3AyYlREOWp0bUdkY0I4c0RvQnE0aU9FQzd3cVhqQ25OZ2ZVRlFUc3oyMnZWaTMKd21iZ3VaWFkvTlRTdzM5aFRuanFhMHpRZG5zOHJ3NkxiVGo0My9EN3EyMVZMNFdxMGZXOUpubTl3cDJYVlRSYgo4OVk3ZndNdmhoOENBd0VBQWFNak1DRXdEZ1lEVlIwUEFRSC9CQVFEQWdLa01BOEdBMVVkRXdFQi93UUZNQU1CCkFmOHdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBSjZKeUd1WkpzdU03SXA5ZDh4cGY4RjJoeDBUYllrcTVhd20KL09PY0FPbFFudUlXM2lwb1YvdFhqUVFNTTNUdzdrNk9PcXlGSFo1bGdwWkFkVjBmN0Rla3NYaVoxOEprUDRobApiZnZsWEtrdWVkaGlnQnhGM2VFbitXcmRqWlBneFJKUDVXNzVRZFhaaXdDMFpsYktGWG9BNW96b1lKNUVqZnYyCktZek95MHgrM25acU5yT3BxU3JaSndLVWNhZUMyNVJQY2hLSkptK2JXUFVFVE1XS25PQVRvUUVac0QzS0cxc1MKbDg0QWJTbitBSUZpTU5NNkpoVThDcDNBQXQvRUhPQ1ZHYXZ4SzYrRkI2M3dVNXc4Q1YrYm9rT1Z4RGw3U3IvZwpiSkF3ZkJFK05NNS9KaTFKTWxPWGZpR0ZxczN6KzM5VWxQWHpvL0FZYTc4UnRhMENUdHc9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: "https://192.168.0.58:6443"
  name: "local"
contexts:
- context:
    cluster: "local"
    user: "kube-admin-local"
  name: "local"
current-context: "local"
users:
- name: "kube-admin-local"
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM2VENDQWRHZ0F3SUJBZ0lJTFFDOWZROVM1Mm93RFFZSktvWklodmNOQVFFTEJRQXdFakVRTUE0R0ExVUUKQXhNSGEzVmlaUzFqWVRBZUZ3MHlNVEE0TVRZeU1qUTNNemxhRncwek1UQTRNVFF5TWpRM05ESmFNQzR4RnpBVgpCZ05WQkFvVERuTjVjM1JsYlRwdFlYTjBaWEp6TVJNd0VRWURWUVFERXdwcmRXSmxMV0ZrYldsdU1JSUJJakFOCkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXF1OGovUHo3akRVMjZXc1doaDRBNjk0OHR2SkUKNTAyT0pVYzQ3YThYeElsWWlNSkpwV0NYcHRSNnM0Uy9DdnMxM3pwYTk2S00vRk1yQ3JmTjJNem95VGh4eVlFZwpaL3NpYW9sMUZ2bk15VmZVM0xnaTc2d3F5VjVOY0VSUUx3Vnl0bGwzNEJSNC96YUowcmIrcHh4em84ZjRwckVqCm56RFBhZlJ3OXI2dUtqaWRrb0JFNTlIRmZzdjVhZyt2UDVleDRvYnBRT3gzYVN0TnpBcFBuZzltU041RG1LbGoKblZtVjNhT2VEZjVXZ05JY2JvWHRWUEt4cDFkWWJPOHI2dnZrNHFDdTc3bUtwc3FKWXJiMCtFdlZkdDFSb24zaQp6UTBTcDlhTEFzdThjN0xrY2hMNnBra2FlbVNMb29jTkpBMEVqK2ZnVXFVK29qRElRdTd3VkRRV0VRSURBUUFCCm95Y3dKVEFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUhBd0l3RFFZSktvWkkKaHZjTkFRRUxCUUFEZ2dFQkFGdkNrN3FVUlhtRVVSQ2wyVUh0dmFacXhWM2I0blJvRXp3b2F5S0E0emFuS1V4SQpkOHl2MHpoaFA4ckM5cUhDZFo0eUNoaml1MHU3OUNLczNVcmhXSDY1MTdmWmxVTFZXNlNyM2xLcVdtSU9JcjAzCnp2OUpjdlJEYlJma3hWR1M1REtkYURUU25FU0JCVHNUNlpaTjYvVFBOdkg0a3pvSUh3cFFsQlVpTkZZWEFUbFQKM3FETjR4dEJ1Mk9oRU1lcEhBT00wbDRvWlhINVJPQXI1Q1MrVFU2eTREcmdIZjRDSElZN1MrS2RSdXFhb3MyZApUTFV6VXo3TkV1dFE2eVR4V2htR1N2NjRjN3U2QVpXQ1J3Y3VwVnRTcG04L0FUOEVTK2hPYnJYWElWRDh0VXl1CmpLdzd3TnVUc29yaGhQeldIQmFibVJSWW1HYlp3dzJsNFRGSFBWMD0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    client-key-data:
LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcGdJQkFBS0NBUUVBcXU4ai9QejdqRFUyNldzV2hoNEE2OTQ4dHZKRTUwMk9KVWM0N2E4WHhJbFlpTUpKCnBXQ1hwdFI2czRTL0N2czEzenBhOTZLTS9GTXJDcmZOMk16b3lUaHh5WUVnWi9zaWFvbDFGdm5NeVZmVTNMZ2kKNzZ3cXlWNU5jRVJRTHdWeXRsbDM0QlI0L3phSjByYitweHh6bzhmNHByRWpuekRQYWZSdzlyNnVLamlka29CRQo1OUhGZnN2NWFnK3ZQNWV4NG9icFFPeDNhU3ROekFwUG5nOW1TTjVEbUtsam5WbVYzYU9lRGY1V2dOSWNib1h0ClZQS3hwMWRZYk84cjZ2dms0cUN1NzdtS3BzcUpZcmIwK0V2VmR0MVJvbjNpelEwU3A5YUxBc3U4YzdMa2NoTDYKcGtrYWVtU0xvb2NOSkEwRWorZmdVcVUrb2pESVF1N3dWRFFXRVFJREFRQUJBb0lCQVFDYmx3M0ZER25VRitRaAoxODRxeWtqQWFnd040cnlCWm9ES3dlZTV3alQ2T3FLUjZYZXJ4eDZEUnNsaGVxV0MwMk1ZREVBZFJLTGNVci9OCkE3MmxaKzlFcWRJNVB3WkdYN3ZXQ2NUQTR5UmE2VTNpa3VHS0U4Ym1nS1l3V0o0OER0TjUxRHBmaDRNVG00c2MKZUdHWHJ6ZzdqcHh3N3JDa0NJUGp5QkxESnBIVjd6RXZYbXJSN0Z0UHFEek94WkZtNEFwd29MRTZUR2YyWWxTZgptbFFaUkhIL1Jha3RGQUNRNjZkNjJDQkNmZm45cXVjS2VPRjdIV29CV0JUazQwQ2tRcHZYV29UTG9GWXYzK0h1CmtpbTR6Ly9BU2V6MnhyRVhSUFpaekZLcHBwOHVxV25qbkl3QmxmQVB0L3F1QS9QNWV1R1RKcUowZk5icHJMcHkKYUw2UTRHeWhBb0dCQU1RUkI4RGIxVHp2Qmxpam8xU3NmejUrUTNwRzN6ZUNHcG5EYWZ6WWE5QnFCMHlKVEZxcQppUE02L0NxQ1VhS1VlQi9CRmN3VGJkUTY5WFhzL1dyWU5mWXZqc1Jjc3BXNzhOVVNJdnNSYU4vK09Ya1FQSmtOCkUxVFRxUitxa3hKbWdZUTdGM1h6V05BWUJSSnMwTHhVbHVia1hENkRiQ0FDNTIwOHcxb2lZNFpsQW9HQkFOOHYKWjBvTkd4TU4xcWVFVVJPcmdla3dKdmNDcFEyUG1LMDZVUmFEZzdxb1hiQVRyTzhjTjdScXRvSnM4blFSaTYvbwpJdVNyZGlxVWFZeFhXTkdkVWdWWmlFby9HWFFZeVJBbUtVcFdBbmJKbURwZzBjV2dRd1BHamJkM1FGZWlPeXFlCmcraml3UG1XUmV1eU9IVUdGbURtTlVwZkV3Z1hEN3JkR0toSjc5QTlBb0dCQUk5NXZ1aThkZUN2TVQrd0Q1ZW8KMnp5Si9Tci9yZHphMGtodkhhSXZaVVlRTU9NckhickRUSkJoTzZLSDF1RllNRWRjYm16MlVzcVprb0lIT0xMMQpJUmZVV1c4TVBvc2dDdTZBNVNSQTZ6UHV2M1ArRTdvVVBXODNyRzFGejNZSm1RR0FsSHgxNVNueVNkUGYyU2ZYCjVzMXprcVVVV3cxWjBxeTNhR1VQQVRHWkFvR0JBTUEwbXNkek1mWGUzUlczSmZ2Q29FYXFhV1FncXZSYXppbWgKSjJRMExxWDVpWFd4L0NTUU1JajN2ZVhrM1lpSDg3eXlOaHFvYjBPTVBMbllIMjJtQnBVRTNoTFM5S0MvRjZrSQp0RmFJYStiUkJvQ0FFU2daTkoxenlXaFBFdUpsbkg2L3RPcERIZDNVUkxNTzhRQVhGZjZ0UXdlaGlVcFdVZjJqCm16Q1RQQ3doQW9HQkFMVEVYYUhad1JhK2JCa2lhN3lrUVN1QzI3
elM3ZEFRYnMrMXVzbnp4a0ExQUNTcFQwdm0KY2lqanRnR0JWQkZaeTBRazlONERkUU1oOFBvTXJ2NnJUWnZ6NHNva3c1VG9HdGdjcjJZdHlBMDVrTDBUYzhPKwpNbmIxOFZBb1pGV0V3R2xJZHJrR3BqWW9SYzNTQ0xwNFJZMlpmeFFFQkFlalZRQng5aUNlWlN5bAotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
[rke@rke19-master1 ~]$

spoiler for Metrics Server logs:
```
[rke@rke19-master1 ~]$ kubectl logs metrics-server-5b6d79d4f4-ggl57 -n kube-system
I0817 06:56:00.095688       1 secure_serving.go:116] Serving securely on [::]:4443
```
spoiler for Status of Metrics API:
```
[rke@rke19-master1 ~]$ kubectl describe apiservice v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       k8s-app=metrics-server
Annotations:
```

/kind bug