kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0

Bug: Unknown cloud provider when hetzner was specified #4037

Closed sergeyshevch closed 3 years ago

sergeyshevch commented 3 years ago

Which component are you using?: cluster-autoscaler

What version of the component are you using?: v1.17.4 / v1.20.0

Component version:

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-16T18:16:59Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:02:01Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?: hetzner cloud provider

What did you expect to happen?: cluster autoscaler starts

What happened instead?: cluster autoscaler did not start; it exited with `Unknown cloud provider: "hetzner"`

How to reproduce it (as minimally and precisely as possible):

Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
      # Node affinity is used to force cluster-autoscaler to stick
      # to the master node. This allows the cluster to reliably downscale
      # to zero worker nodes when needed.
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
      containers:
        - image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.17.4
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --cloud-provider="hetzner"
            - --stderrthreshold=info
            - --nodes=1:10:CPX21:FSN1:worker
          env:
            - name: HCLOUD_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hcloud-controller-hcloud-api-token
                  key: token
            - name: HCLOUD_CLOUD_INIT
              valueFrom:
                secretKeyRef:
                  key: cloudInit
                  name: cluster-autoscaler-cloud-init
            - name: HCLOUD_SSH_KEY
              value: 
            - name: HCLOUD_NETWORK
              value: 
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

Logs from pod
I0426 20:52:31.024900       1 flags.go:52] FLAG: --add-dir-header="false"
I0426 20:52:31.024997       1 flags.go:52] FLAG: --address=":8085"
I0426 20:52:31.025004       1 flags.go:52] FLAG: --alsologtostderr="false"
I0426 20:52:31.025009       1 flags.go:52] FLAG: --aws-use-static-instance-list="false"
I0426 20:52:31.025015       1 flags.go:52] FLAG: --balance-similar-node-groups="false"
I0426 20:52:31.025020       1 flags.go:52] FLAG: --cloud-config=""
I0426 20:52:31.025033       1 flags.go:52] FLAG: --cloud-provider="\"hetzner\""
I0426 20:52:31.025039       1 flags.go:52] FLAG: --cloud-provider-gce-l7lb-src-cidrs="130.211.0.0/22,35.191.0.0/16"
I0426 20:52:31.025047       1 flags.go:52] FLAG: --cloud-provider-gce-lb-src-cidrs="130.211.0.0/22,209.85.152.0/22,209.85.204.0/22,35.191.0.0/16"
I0426 20:52:31.025056       1 flags.go:52] FLAG: --cluster-name=""
I0426 20:52:31.025061       1 flags.go:52] FLAG: --clusterapi-cloud-config-authoritative="false"
I0426 20:52:31.025067       1 flags.go:52] FLAG: --cores-total="0:320000"
I0426 20:52:31.025072       1 flags.go:52] FLAG: --estimator="binpacking"
I0426 20:52:31.025078       1 flags.go:52] FLAG: --expander="random"
I0426 20:52:31.025083       1 flags.go:52] FLAG: --expendable-pods-priority-cutoff="-10"
I0426 20:52:31.025089       1 flags.go:52] FLAG: --filter-out-schedulable-pods-uses-packing="true"
I0426 20:52:31.025093       1 flags.go:52] FLAG: --gpu-total="[]"
I0426 20:52:31.025099       1 flags.go:52] FLAG: --ignore-daemonsets-utilization="false"
I0426 20:52:31.025104       1 flags.go:52] FLAG: --ignore-mirror-pods-utilization="false"
I0426 20:52:31.025109       1 flags.go:52] FLAG: --ignore-taint="[]"
I0426 20:52:31.025114       1 flags.go:52] FLAG: --kubeconfig=""
I0426 20:52:31.025119       1 flags.go:52] FLAG: --kubernetes=""
I0426 20:52:31.025123       1 flags.go:52] FLAG: --leader-elect="true"
I0426 20:52:31.025142       1 flags.go:52] FLAG: --leader-elect-lease-duration="15s"
I0426 20:52:31.025150       1 flags.go:52] FLAG: --leader-elect-renew-deadline="10s"
I0426 20:52:31.025155       1 flags.go:52] FLAG: --leader-elect-resource-lock="leases"
I0426 20:52:31.025173       1 flags.go:52] FLAG: --leader-elect-resource-name=""
I0426 20:52:31.025179       1 flags.go:52] FLAG: --leader-elect-resource-namespace=""
I0426 20:52:31.025184       1 flags.go:52] FLAG: --leader-elect-retry-period="2s"
I0426 20:52:31.025197       1 flags.go:52] FLAG: --log-backtrace-at=":0"
I0426 20:52:31.025207       1 flags.go:52] FLAG: --log-dir=""
I0426 20:52:31.025219       1 flags.go:52] FLAG: --log-file=""
I0426 20:52:31.025225       1 flags.go:52] FLAG: --log-file-max-size="1800"
I0426 20:52:31.025230       1 flags.go:52] FLAG: --logtostderr="true"
I0426 20:52:31.025271       1 flags.go:52] FLAG: --max-autoprovisioned-node-group-count="15"
I0426 20:52:31.025279       1 flags.go:52] FLAG: --max-bulk-soft-taint-count="10"
I0426 20:52:31.025287       1 flags.go:52] FLAG: --max-bulk-soft-taint-time="3s"
I0426 20:52:31.025293       1 flags.go:52] FLAG: --max-empty-bulk-delete="10"
I0426 20:52:31.025300       1 flags.go:52] FLAG: --max-failing-time="15m0s"
I0426 20:52:31.025308       1 flags.go:52] FLAG: --max-graceful-termination-sec="600"
I0426 20:52:31.025314       1 flags.go:52] FLAG: --max-inactivity="10m0s"
I0426 20:52:31.025332       1 flags.go:52] FLAG: --max-node-provision-time="15m0s"
I0426 20:52:31.025338       1 flags.go:52] FLAG: --max-nodes-total="0"
I0426 20:52:31.025343       1 flags.go:52] FLAG: --max-total-unready-percentage="45"
I0426 20:52:31.025357       1 flags.go:52] FLAG: --memory-total="0:6400000"
I0426 20:52:31.025363       1 flags.go:52] FLAG: --min-replica-count="0"
I0426 20:52:31.025368       1 flags.go:52] FLAG: --namespace="kube-system"
I0426 20:52:31.025382       1 flags.go:52] FLAG: --new-pod-scale-up-delay="0s"
I0426 20:52:31.025391       1 flags.go:52] FLAG: --node-autoprovisioning-enabled="false"
I0426 20:52:31.025403       1 flags.go:52] FLAG: --node-deletion-delay-timeout="2m0s"
I0426 20:52:31.025409       1 flags.go:52] FLAG: --node-group-auto-discovery="[]"
I0426 20:52:31.025414       1 flags.go:52] FLAG: --nodes="[1:10:CPX21:FSN1:worker]"
I0426 20:52:31.025420       1 flags.go:52] FLAG: --ok-total-unready-count="3"
I0426 20:52:31.025425       1 flags.go:52] FLAG: --regional="false"
I0426 20:52:31.025430       1 flags.go:52] FLAG: --scale-down-candidates-pool-min-count="50"
I0426 20:52:31.025435       1 flags.go:52] FLAG: --scale-down-candidates-pool-ratio="0.1"
I0426 20:52:31.025450       1 flags.go:52] FLAG: --scale-down-delay-after-add="10m0s"
I0426 20:52:31.025455       1 flags.go:52] FLAG: --scale-down-delay-after-delete="0s"
I0426 20:52:31.025460       1 flags.go:52] FLAG: --scale-down-delay-after-failure="3m0s"
I0426 20:52:31.025474       1 flags.go:52] FLAG: --scale-down-enabled="true"
I0426 20:52:31.025479       1 flags.go:52] FLAG: --scale-down-gpu-utilization-threshold="0.5"
I0426 20:52:31.025485       1 flags.go:52] FLAG: --scale-down-non-empty-candidates-count="30"
I0426 20:52:31.025490       1 flags.go:52] FLAG: --scale-down-unneeded-time="10m0s"
I0426 20:52:31.025504       1 flags.go:52] FLAG: --scale-down-unready-time="20m0s"
I0426 20:52:31.025509       1 flags.go:52] FLAG: --scale-down-utilization-threshold="0.5"
I0426 20:52:31.025524       1 flags.go:52] FLAG: --scale-up-from-zero="true"
I0426 20:52:31.025529       1 flags.go:52] FLAG: --scan-interval="10s"
I0426 20:52:31.025534       1 flags.go:52] FLAG: --skip-headers="false"
I0426 20:52:31.025548       1 flags.go:52] FLAG: --skip-log-headers="false"
I0426 20:52:31.025554       1 flags.go:52] FLAG: --skip-nodes-with-local-storage="true"
I0426 20:52:31.025558       1 flags.go:52] FLAG: --skip-nodes-with-system-pods="true"
I0426 20:52:31.025563       1 flags.go:52] FLAG: --stderrthreshold="0"
I0426 20:52:31.025568       1 flags.go:52] FLAG: --unremovable-node-recheck-timeout="5m0s"
I0426 20:52:31.025582       1 flags.go:52] FLAG: --v="4"
I0426 20:52:31.025588       1 flags.go:52] FLAG: --vmodule=""
I0426 20:52:31.025593       1 flags.go:52] FLAG: --write-status-configmap="true"
I0426 20:52:31.025609       1 main.go:371] Cluster Autoscaler 1.17.4
I0426 20:52:31.224400       1 leaderelection.go:242] attempting to acquire leader lease  kube-system/cluster-autoscaler...
I0426 20:52:31.240529       1 leaderelection.go:252] successfully acquired lease kube-system/cluster-autoscaler
I0426 20:52:31.241386       1 event.go:281] Event(v1.ObjectReference{Kind:"Lease", Namespace:"kube-system", Name:"cluster-autoscaler", UID:"b3af7b2f-51cf-435d-a98e-4a41f9bef763", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"11712370", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' cluster-autoscaler-768d84fdb8-sh84s became leader
I0426 20:52:31.422521       1 reflector.go:150] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:188
I0426 20:52:31.422576       1 reflector.go:185] Listing and watching *v1.Pod from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:188
I0426 20:52:31.423019       1 reflector.go:150] Starting reflector *v1.ReplicationController (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:319
I0426 20:52:31.423182       1 reflector.go:185] Listing and watching *v1.ReplicationController from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:319
I0426 20:52:31.423650       1 reflector.go:150] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:212
I0426 20:52:31.423747       1 reflector.go:185] Listing and watching *v1.Pod from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:212
I0426 20:52:31.423901       1 reflector.go:150] Starting reflector *v1.Job (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:328
I0426 20:52:31.424122       1 reflector.go:185] Listing and watching *v1.Job from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:328
I0426 20:52:31.424161       1 reflector.go:150] Starting reflector *v1.Node (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:246
I0426 20:52:31.424960       1 reflector.go:185] Listing and watching *v1.Node from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:246
I0426 20:52:31.424198       1 reflector.go:150] Starting reflector *v1.Node (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:246
I0426 20:52:31.425210       1 reflector.go:185] Listing and watching *v1.Node from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:246
I0426 20:52:31.424209       1 reflector.go:150] Starting reflector *v1beta1.PodDisruptionBudget (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:299
I0426 20:52:31.425593       1 reflector.go:185] Listing and watching *v1beta1.PodDisruptionBudget from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:299
I0426 20:52:31.424219       1 reflector.go:150] Starting reflector *v1.DaemonSet (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:310
I0426 20:52:31.424241       1 reflector.go:150] Starting reflector *v1.StatefulSet (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:346
I0426 20:52:31.424250       1 reflector.go:150] Starting reflector *v1.ReplicaSet (1h0m0s) from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:337
I0426 20:52:31.425809       1 reflector.go:185] Listing and watching *v1.DaemonSet from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:310
I0426 20:52:31.425820       1 reflector.go:185] Listing and watching *v1.StatefulSet from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:346
I0426 20:52:31.425828       1 reflector.go:185] Listing and watching *v1.ReplicaSet from k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:337
I0426 20:52:33.022916       1 factory.go:127] Creating scheduler from algorithm provider 'DefaultProvider'
I0426 20:52:33.022974       1 factory.go:219] Creating scheduler with fit predicates 'map[CheckNodeUnschedulable:{} CheckVolumeBinding:{} GeneralPredicates:{} MatchInterPodAffinity:{} MaxAzureDiskVolumeCount:{} MaxCSIVolumeCountPred:{} MaxEBSVolumeCount:{} MaxGCEPDVolumeCount:{} NoDiskConflict:{} NoVolumeZoneConflict:{} PodToleratesNodeTaints:{}]' and priority functions 'map[BalancedResourceAllocation:{} ImageLocalityPriority:{} InterPodAffinityPriority:{} LeastRequestedPriority:{} NodeAffinityPriority:{} NodePreferAvoidPodsPriority:{} SelectorSpreadPriority:{} TaintTolerationPriority:{}]'
I0426 20:52:33.322441       1 predicates.go:158] Using predicate PodFitsResources
I0426 20:52:33.322484       1 predicates.go:158] Using predicate PodToleratesNodeTaints
I0426 20:52:33.322491       1 predicates.go:158] Using predicate GeneralPredicates
I0426 20:52:33.322496       1 predicates.go:158] Using predicate ready
I0426 20:52:33.322503       1 predicates.go:158] Using predicate MatchInterPodAffinity
I0426 20:52:33.322508       1 predicates.go:158] Using predicate CheckNodeUnschedulable
I0426 20:52:33.322514       1 predicates.go:158] Using predicate MaxGCEPDVolumeCount
I0426 20:52:33.322519       1 predicates.go:158] Using predicate NoVolumeZoneConflict
I0426 20:52:33.322525       1 predicates.go:158] Using predicate MaxAzureDiskVolumeCount
I0426 20:52:33.322530       1 predicates.go:158] Using predicate MaxCSIVolumeCountPred
I0426 20:52:33.322535       1 predicates.go:158] Using predicate MaxEBSVolumeCount
I0426 20:52:33.322540       1 predicates.go:158] Using predicate NoDiskConflict
I0426 20:52:33.322546       1 predicates.go:158] Using predicate CheckVolumeBinding
I0426 20:52:33.322601       1 cloud_provider_builder.go:29] Building "hetzner" cloud provider.
F0426 20:52:33.322618       1 cloud_provider_builder.go:50] Unknown cloud provider: "hetzner"

Anything else we need to know?:

sergeyshevch commented 3 years ago

Do I need to build a custom image?

sergeyshevch commented 3 years ago

Fixed. It was an incorrect configuration on my side.
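For readers hitting the same error: the flag dump in the logs (`FLAG: --cloud-provider="\"hetzner\""`) suggests the inner quotes in the Deployment's `command:` list reached the binary literally, so the provider name did not match `hetzner`. A likely fix, inferred from those logs rather than confirmed in this thread, is to drop the inner quotes:

```yaml
# Quotes inside a YAML list item do not wrap the whole scalar, so they are
# kept as literal characters and the binary would receive --cloud-provider="hetzner".
command:
  - ./cluster-autoscaler
  - --v=4
  - --cloud-provider=hetzner        # no inner quotes
  - --stderrthreshold=info
  - --nodes=1:10:CPX21:FSN1:worker
```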