wbuchwalter / Kubernetes-acs-engine-autoscaler

[Deprecated] Node-level autoscaler for Kubernetes clusters created with acs-engine.

New VMs don't connect to the cluster #49

Closed oryagel closed 7 years ago

oryagel commented 7 years ago

Hi, the autoscaler works as expected and adds VMs when needed. We can see in the Azure portal that the autoscaler creates a new deployment. The problem is that the new VMs do not appear in the cluster. What could be the issue?

This is our configuration:

kind: Deployment
metadata:
  name: autoscaler
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: autoscaler
        openai/do-not-drain: "true"
    spec:
      containers:
      - name: autoscaler
        image: wbuchwalter/kubernetes-acs-engine-autoscaler:latest
        env:
        - name: AZURE_SP_APP_ID
          valueFrom:
            secretKeyRef:
              name: autoscaler
              key: azure-sp-app-id
        - name: AZURE_SP_SECRET
          valueFrom:
            secretKeyRef:
              name: autoscaler
              key: azure-sp-secret
        - name: AZURE_SP_TENANT_ID
          valueFrom:
            secretKeyRef:
              name: autoscaler
              key: azure-sp-tenant-id    
        - name: KUBECONFIG_PRIVATE_KEY
          valueFrom:
            secretKeyRef:
              name: autoscaler
              key: kubeconfig-private-key
        - name: CLIENT_PRIVATE_KEY
          valueFrom:
            secretKeyRef:
              name: autoscaler
              key: client-private-key
        - name: CA_PRIVATE_KEY
          valueFrom:
            secretKeyRef:
              name: autoscaler
              key: ca-private-key
        - name: SLACK_HOOK
          value: https://hooks.slack.com/services/XXX
        - name: SLACK_BOT_TOKEN
          value: xxx
        command:
        - python
        - main.py
        - --resource-group
        - SOME_NAME
        - --acs-deployment
        - SOME_NAME
        - --idle-threshold
        - "120"
        - -vvv
        - --spare-agents
        - "4"
        imagePullPolicy: Always
      restartPolicy: Always
      dnsPolicy: Default  # Don't use cluster DNS.

wbuchwalter commented 7 years ago

Hello, this is usually caused by an issue with the credentials you provided. I'm guessing the problem is in your secret. Did you correctly encode everything in base64? Even though the different private keys are already base64 encoded, you need to encode them once more.
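As a rough illustration of the double encoding, here is a minimal Python sketch. It assumes the secret is created from a manifest whose `data` values must be base64, and that Kubernetes decodes them once before exposing them as environment variables. The helper name `encode_for_secret` and the dummy key material are hypothetical, not part of the autoscaler.

```python
import base64


def encode_for_secret(already_base64_value: str) -> str:
    """Base64-encode a value that is itself already base64 (hypothetical helper)."""
    # Strip surrounding whitespace so a stray newline is not encoded along with the key.
    raw = already_base64_value.strip().encode("ascii")
    return base64.b64encode(raw).decode("ascii")


# Dummy stand-in for a key copied from the acs-engine output (already base64 once).
key_from_acs_engine = base64.b64encode(b"dummy key material").decode("ascii")

# Second layer of base64: this is the value that would go into the secret.
secret_value = encode_for_secret(key_from_acs_engine)

# Decoding the secret once should give back the original base64 string unchanged.
assert base64.b64decode(secret_value).decode("ascii") == key_from_acs_engine
```

If the round-trip check at the end fails for a real value, the string stored in the secret is probably not what was intended.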

oryagel commented 7 years ago

That was it, double base64 encoding.

Thanks

omerlh commented 6 years ago

I have now run into a similar issue, and I double-checked that the secrets are base64 encoded twice. After connecting to the VM that does not join the cluster, I can see the following in syslog:

docker[7186]: I1220 15:20:06.122771    7694 feature_gate.go:144] feature gates: map[Accelerators:true]
docker[7186]: I1220 15:20:06.124061    7694 azure.go:174] azure: using client_id+client_secret to retrieve access token
docker[7186]: I1220 15:20:06.124209    7694 server.go:439] Successfully initialized cloud provider: "azure" from the config file: "/etc/kubernetes/azure.json"
docker[7186]: I1220 15:20:06.124235    7694 server.go:740] cloud provider determined current node name to be <>
docker[7186]: W1220 15:20:06.126634    7694 server.go:474] New kubeClient from clientConfig error: tls: private key does not match public key
docker[7186]: W1220 15:20:06.127923    7694 server.go:482] New kubeClient from clientConfig error: tls: private key does not match public key
docker[7186]: W1220 15:20:06.129229    7694 server.go:491] Failed to create API Server client: tls: private key does not match public key
docker[7186]: W1220 15:20:06.130440    7694 server.go:500] Failed to create API Server client for heartbeat: tls: private key does not match public key
kernel: [  116.725179] eth0: renamed from vethedfcdec

So it looks like the issue is with the private key. I think I took the correct one from the deployment.json.parameters. What can I check?

omerlh commented 6 years ago

It turned out the issue was how I created the base64 encoded string. I used echo <> | base64 instead of echo -n <> | base64. Without -n, echo appends a trailing newline, and that newline gets encoded along with the value.
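The effect of the extra newline can be reproduced with a small Python sketch. The placeholder value and the helper `has_trailing_newline` are made up for illustration; the two encodings mimic what the shell pipelines with and without -n would feed into base64.

```python
import base64

# Placeholder for an already-base64 key; not real data.
value = "c2VjcmV0"

# Mimics `echo <> | base64`: the newline appended by echo is encoded too.
with_newline = base64.b64encode((value + "\n").encode("ascii")).decode("ascii")

# Mimics `echo -n <> | base64`: only the value itself is encoded.
without_newline = base64.b64encode(value.encode("ascii")).decode("ascii")


def has_trailing_newline(encoded: str) -> bool:
    """Return True if the decoded secret value ends with a newline (hypothetical check)."""
    return base64.b64decode(encoded).endswith(b"\n")


print(with_newline == without_newline)        # the two encodings differ
print(has_trailing_newline(with_newline))     # newline was encoded
print(has_trailing_newline(without_newline))  # clean value
```

Running a check like `has_trailing_newline` against each value in the secret is a quick way to spot this mistake before deploying the autoscaler.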