vmware-archive / kubernetes-archived

This repository is archived. Please file in-tree vSphere Cloud Provider issues at https://github.com/kubernetes/kubernetes/issues . CSI Driver for vSphere is available at https://github.com/kubernetes/cloud-provider-vsphere
Apache License 2.0

Kubernetes-Anywhere: vSphere cloudprovider provisions ip address based on DHCP #30

Open abrarshivani opened 7 years ago

abrarshivani commented 7 years ago

Currently, kubernetes-anywhere uses dynamic IP addresses for the master and nodes. If a node is restarted and its IP address changes, it won't be recognized by the master.

Deliverable:

kerneltime commented 7 years ago

Questions:

kerneltime commented 7 years ago

We need to pick a solution and make sure it is documented and works. The solution can be

  1. DHCP, with IP addresses managed via reservations so they do not change
  2. Static IP for all nodes
  3. Static IP for the master and DHCP for nodes, provided the leases do not change. Also, it is more important to test existing functionality and remove the need for credentials on nodes.
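Option 1 can be sketched as DHCP reservations, for example with dnsmasq (the MAC addresses, hostnames, and subnet below are illustrative, not from this repo):

```
# /etc/dnsmasq.d/k8s.conf -- hypothetical DHCP reservations
# Pin each VM's MAC address to a fixed lease so node IPs never change,
# even across reboots or lease renewals.
dhcp-host=00:50:56:aa:bb:01,master,192.168.10.10,infinite
dhcp-host=00:50:56:aa:bb:02,node1,192.168.10.11,infinite
dhcp-host=00:50:56:aa:bb:03,node2,192.168.10.12,infinite
```

With reservations in place, option 1 behaves like static addressing from Kubernetes' point of view while keeping provisioning DHCP-based.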
abrarshivani commented 7 years ago

DHCP:

  1. We need a static IP for the master, since the kubelet (i.e., the node agent) is started with the master's IP.
  2. For nodes: earlier, the Kubernetes master (apiserver) needed to resolve node names when communicating with the kubelet, so we had to add DNS entries to /etc/hosts. Now it communicates via IP address, so we don't need any DNS entries in /etc/hosts. Referred to this.
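For reference, the /etc/hosts entries in question look roughly like this (addresses and node names are illustrative; the file is written to /tmp here only for demonstration):

```shell
# Illustrative /etc/hosts entries that kubernetes-anywhere wrote so the
# apiserver could resolve node names by hostname.
cat <<'EOF' > /tmp/hosts.example
192.168.10.10  master
192.168.10.11  node1
192.168.10.12  node2
EOF
# Count the node entries that become unnecessary once the apiserver
# talks to kubelets by IP address.
grep -c 'node' /tmp/hosts.example   # prints 2
```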

Testing: I modified the kubernetes-anywhere scripts so that DNS entries in /etc/hosts are not written, and deployed a Kubernetes (v1.4.7) cluster. Nodes configured by kubernetes-anywhere get their IP addresses via DHCP. I created a VMDK using a storage class and successfully created a pod using this VMDK. Later, I deleted the pod and removed the contents of /etc/machine-id so that the node would get a different IP address after reboot. After the reboot, I was able to create and delete a pod with a vSphere volume successfully.

I encountered the following issues while testing:

  1. After reboot, docker didn't pick up the new flannel subnet. This could be due to the docker service starting before the docker env file is updated.
  2. kubectl logs and kubectl exec failed with the following error:
    root@photon-zv1KbtvMG [ ~ ]# kubectl get pods
    NAME      READY     STATUS    RESTARTS   AGE
    pvscpod   1/1       Running   0          1m
    root@photon-zv1KbtvMG [ ~ ]# kubectl exec -it pvscpod /bin/sh
    Error from server: dial tcp: lookup node1 on 10.162.204.1:53: no such host

    This is resolved in v1.5.0-beta3; see here.
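For issue 1, a hedged sketch of a fix, assuming a systemd-based image (unit names and paths may differ on Photon OS): a drop-in that delays docker until flanneld has written its subnet env file.

```
# /etc/systemd/system/docker.service.d/40-flannel.conf (hypothetical path)
[Unit]
# Do not start docker until flanneld has generated /run/flannel/subnet.env.
After=flanneld.service
Requires=flanneld.service

[Service]
# Re-read the flannel subnet on every (re)start so docker's bridge
# (--bip=${FLANNEL_SUBNET}) matches the current lease after a reboot.
EnvironmentFile=/run/flannel/subnet.env
```

After adding a drop-in like this, `systemctl daemon-reload` and a docker restart would be needed for it to take effect.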

abrarshivani commented 7 years ago

In k8s v1.4.7, there is an issue where kubectl cannot connect to a pod for logs; this is resolved in 1.5.0-beta3. I would therefore suggest that we remove the code that adds DNS entries to /etc/hosts and add support for a static IP for the master in k8s-anywhere once we support 1.5.0.

abrarshivani commented 7 years ago

I removed the entries from /etc/hosts and launched a Kubernetes v1.5.3 cluster. Attach/detach of the volume was successful, yet kubectl was not able to connect to the pod and failed with the following error:

an error on the server ("unknown") has prevented the request from succeeding (get pods kubernetes-dashboard-1872455951-t3k96)

Later, I added the DNS entries back and it worked fine. Hence, Kubernetes still needs DNS to look up the node in order to connect to the pod. Also, adding a static IP to the master using terraform fails since the gateway is not added. I have created an issue here.
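This dependency can be checked directly on the master host: kubectl logs/exec proxies through https://&lt;nodeName&gt;:10250, so the node name must resolve there (node1 is the name from the error above):

```shell
# If this lookup fails, the apiserver host cannot resolve the node name,
# and "kubectl logs" / "kubectl exec" will fail as described above.
out=$(getent hosts node1 || echo "node1 not resolvable; add a DNS or /etc/hosts entry")
echo "$out"
```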

abrarshivani commented 7 years ago

I removed the DNS entries and launched a Kubernetes v1.5-release cluster, and ran into the following error:

Error from server: Get https://node1:10250/containerLogs/kube-system/kubernetes-dashboard-1872455951-wfhdd/kubernetes-dashboard: dial tcp: lookup node1 on 10.162.204.1:53: no such host

The solution is either to use a DNS server or to change the node name in the vSphere Cloud Provider to the IP address. This was expected to be resolved in the 1.5 release, but it is not working.
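Pending the cloud-provider fix, one possible workaround (a general kubelet option, not something this repo wires up) is to register each node under its IP via --hostname-override, so the apiserver dials an address instead of a name:

```shell
# Sketch: derive the node's primary IP and use it as the node name.
# hostname -I is assumed available; fall back to loopback if it is not,
# purely so this sketch degrades gracefully.
NODE_IP=$(hostname -I 2>/dev/null | awk '{print $1}')
NODE_IP=${NODE_IP:-127.0.0.1}
echo "kubelet would register as: ${NODE_IP}"
# kubelet --hostname-override="${NODE_IP}" ...rest of the existing flags...
```

With the node registered by IP, `kubectl logs`/`kubectl exec` no longer depend on resolving names like node1.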