kube controller manager - configuration being ignored

stephanlindauer commented 7 years ago

following the documentation at https://coreos.com/kubernetes/docs/latest/deploy-master.html, i set up my kube-controller-manager manifest like this:

  - path: "/etc/kubernetes/manifests/kube-controller-manager.yaml"
    content: |
      apiVersion: v1
      kind: Pod
      metadata:
        name: kube-controller-manager
        namespace: kube-system
      spec:
        hostNetwork: true
        containers:
        - name: kube-controller-manager
          image: quay.io/coreos/hyperkube:v1.4.3_coreos.0
          command:
          - /hyperkube
          - controller-manager
          - --master=http://127.0.0.1:8080
          - --leader-elect=true
          - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
          - --root-ca-file=/etc/kubernetes/ssl/ca.pem
          livenessProbe:
            httpGet:
              host: 127.0.0.1
              path: /healthz
              port: 10252
            initialDelaySeconds: 15
            timeoutSeconds: 1
          volumeMounts:
          - mountPath: /etc/kubernetes/ssl
            name: ssl-certs-kubernetes
            readOnly: true
          - mountPath: /etc/ssl/certs
            name: ssl-certs-host
            readOnly: true
        volumes:
        - hostPath:
            path: /etc/kubernetes/ssl
          name: ssl-certs-kubernetes
        - hostPath:
            path: /usr/share/ca-certificates
          name: ssl-certs-host

i get weird errors like

controllermanager.go:232] Unsuccessful parsing of cluster CIDR : invalid CIDR address:

or

controllermanager.go:474] Failed to start certificate controller: open /etc/kubernetes/ca/ca.pem: no such file or directory

or

 dial tcp 127.0.0.1:8080: getsockopt: connection refused

the whole thing eventually fails with

leaderelection.go:252] error retrieving endpoint: client: etcd cluster is unavailable or misconfigured

and

leaderelection.go:232] failed to renew lease kube-system/kube-scheduler
server.go:156] lost master

which is also not true, because i can run etcdctl mkdir foobar and etcdctl ls on all servers and both work.

this makes me doubt that the container picked up my configuration at all.
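
One way to check whether the container picked up the manifest is to list the flags the manifest passes and compare them with the arguments of the running container. A minimal sketch, assuming the manifest layout shown above; `manifest_flags` is a hypothetical helper name, not part of any Kubernetes tooling:

```shell
#!/bin/sh
# manifest_flags FILE: print every "- --flag[=value]" entry of a pod manifest,
# one per line, without the leading list marker.
# (Hypothetical helper; it only pattern-matches lines, it does not parse YAML.)
manifest_flags() {
  sed -n 's/^[[:space:]]*-[[:space:]][[:space:]]*\(--[^[:space:]]*\).*$/\1/p' "$1"
}

# Example, using the path from the manifest above:
#   manifest_flags /etc/kubernetes/manifests/kube-controller-manager.yaml
# Compare the result with the running container's arguments, e.g. from
# `docker inspect <container-id>` (assuming Docker is the container runtime).
```

For the manifest above this would list --master, --leader-elect, --service-account-private-key-file and --root-ca-file. If the running container shows different arguments, the manifest on disk is probably not the one that was loaded.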

you can find my setup @ https://github.com/stephanlindauer/terra-aws-core-kube

stephanlindauer commented 7 years ago

kube-dns seems to struggle as well. when i run

kubectl exec busybox -- traceroute 10.0.0.2

i get

traceroute to 10.0.0.2 (10.0.0.2), 30 hops max, 46 byte packets
 1  10.0.1.10 (10.0.1.10)  0.005 ms  0.004 ms  0.002 ms
 2  *  *  *
 3  *  *  *
 4  *  *  *
 5  *  *  *
 6  *  *  *
 7  *  *  *
 8  *  *  *
 9  *  *  *
10  *  *  *
11  *  *  *
12  *  *  *
13  *  *  *
14  *  *  *
15  *  *  *
16  *  *  *
17  *  *  *
18  *  *  *
19  *  *  *
20  *  *  *
21  *  *  *
22  *  *  *
23  *  *  *
24  *  *  *
25  *  *  *
26  *  *  *
27  *  *  *
28  *  *  *
29  *  *  *
30  *  *  *

so it looks like it can't find the aws instance nameserver.

Bekt commented 7 years ago

I have a setup very similar to yours (looking at the latest code in your repo). Did you get kube-dns working?

stephanlindauer commented 7 years ago

yep. just have a look at https://github.com/stephanlindauer/terra-aws-core-kube. the trick is to wait for the master to be up before starting the worker kubelet.

awebneck commented 7 years ago

Also seeing this issue

stephanlindauer commented 7 years ago

just make sure to run

      #!/bin/bash
      # Block until the master answers over TLS using the worker's client certificate.
      until curl -o /dev/null -sIf --cacert /etc/kubernetes/ssl/ca.pem --cert /etc/kubernetes/ssl/worker.pem --key /etc/kubernetes/ssl/worker-key.pem "https://${MASTER_HOST}/"; do
        sleep 1 && echo -n .
      done

before starting your worker kubelets.
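
The loop above waits forever if the master never comes up. A bounded variant might look like the sketch below; `wait_for` is a hypothetical helper name, and the retry count and delay in the usage comment are arbitrary assumptions rather than values from this thread:

```shell
#!/bin/sh
# wait_for MAX_TRIES DELAY CMD...: run CMD until it succeeds, sleeping DELAY
# seconds after each failure; give up after MAX_TRIES attempts.
# The body runs in a subshell, so its variables do not leak to the caller.
wait_for() (
  max_tries=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$max_tries" ]; do
    if "$@"; then
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  exit 1
)

# Example with the same curl check as above (300 tries, 1 second apart):
#   wait_for 300 1 curl -o /dev/null -sIf \
#     --cacert /etc/kubernetes/ssl/ca.pem \
#     --cert /etc/kubernetes/ssl/worker.pem \
#     --key /etc/kubernetes/ssl/worker-key.pem \
#     "https://${MASTER_HOST}/" || echo "master did not come up" >&2
```

Returning a non-zero status on timeout lets the calling unit or script decide whether to fail or retry, instead of hanging silently.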

coreos / coreos-kubernetes

kube controller manager - configuration being ignored #786