vmware-archive / kubernetes-archived

This repository is archived. Please file in-tree vSphere Cloud Provider issues at https://github.com/kubernetes/kubernetes/issues . CSI Driver for vSphere is available at https://github.com/kubernetes/cloud-provider-vsphere
Apache License 2.0

Kubernetes-Anywhere: vSphere cloudprovider provisions ip address based on DHCP #30

Open abrarshivani opened 7 years ago

abrarshivani commented 7 years ago

Currently, kubernetes-anywhere uses dynamic IP addresses for the master and nodes. If a node is restarted and its IP address changes, it won't be recognized by the master.

Deliverable:

kerneltime commented 7 years ago

Questions:

kerneltime commented 7 years ago

We need to pick a solution and make sure it is documented and works. The solution can be

  1. DHCP, with IP addresses managed via reservations so they do not change
  2. Static IP for all nodes
  3. Static IP for the master and DHCP for nodes, provided the leases do not change. Also, it is more important to test existing functionality and remove the need for credentials on nodes.
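Option 1 can be sketched as DHCP reservations, for example with dnsmasq (the MAC addresses, hostnames, and subnet below are illustrative, not from this repo):

```
# /etc/dnsmasq.d/k8s.conf -- hypothetical DHCP reservations
# Pin each VM's MAC address to a fixed lease so node IPs never change,
# even across reboots or lease renewals.
dhcp-host=00:50:56:aa:bb:01,master,192.168.10.10,infinite
dhcp-host=00:50:56:aa:bb:02,node1,192.168.10.11,infinite
dhcp-host=00:50:56:aa:bb:03,node2,192.168.10.12,infinite
```

With reservations in place, option 1 behaves like static addressing from Kubernetes' point of view while keeping provisioning DHCP-based.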
abrarshivani commented 7 years ago

DHCP:

  1. We need a static IP for the master, since the kubelet (i.e., the node agent) is started with the master's IP.
  2. For nodes: earlier, the Kubernetes master (apiserver) needed to resolve node names when communicating with the kubelet, so we had to add DNS entries to /etc/hosts. Now it communicates via IP address, so we don't need any DNS entries in /etc/hosts. Referred to this.
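For reference, the /etc/hosts entries in question look roughly like this (addresses and node names are illustrative; the file is written to /tmp here only for demonstration):

```shell
# Illustrative /etc/hosts entries that kubernetes-anywhere wrote so the
# apiserver could resolve node names by hostname.
cat <<'EOF' > /tmp/hosts.example
192.168.10.10  master
192.168.10.11  node1
192.168.10.12  node2
EOF
# Count the node entries that become unnecessary once the apiserver
# talks to kubelets by IP address.
grep -c 'node' /tmp/hosts.example   # prints 2
```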

Testing: I modified the kubernetes-anywhere scripts so that DNS entries in /etc/hosts are not written, and deployed a Kubernetes (v1.4.7) cluster. Nodes configured by kubernetes-anywhere get their IP addresses via DHCP. I created a VMDK using a storage class and successfully created a pod using this VMDK. Later, I deleted the pod and removed the contents of /etc/machine-id so that the node would get a different IP address after reboot. After the reboot, I was able to create and delete a pod with a vSphere volume successfully.

I encountered the following issues while testing:

  1. After reboot, docker didn't pick up the new flannel subnet. This could be due to the docker service starting before the docker env file is updated.
  2. kubectl logs and kubectl exec failed with the following error:
    root@photon-zv1KbtvMG [ ~ ]# kubectl get pods
    NAME      READY     STATUS    RESTARTS   AGE
    pvscpod   1/1       Running   0          1m
    root@photon-zv1KbtvMG [ ~ ]# kubectl exec -it pvscpod /bin/sh
    Error from server: dial tcp: lookup node1 on 10.162.204.1:53: no such host

    This is resolved in v1.5.0-beta3; see here.
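For issue 1, a hedged sketch of a fix, assuming a systemd-based image (unit names and paths may differ on Photon OS): a drop-in that delays docker until flanneld has written its subnet env file.

```
# /etc/systemd/system/docker.service.d/40-flannel.conf (hypothetical path)
[Unit]
# Do not start docker until flanneld has generated /run/flannel/subnet.env.
After=flanneld.service
Requires=flanneld.service

[Service]
# Re-read the flannel subnet on every (re)start so docker's bridge
# (--bip=${FLANNEL_SUBNET}) matches the current lease after a reboot.
EnvironmentFile=/run/flannel/subnet.env
```

After adding a drop-in like this, `systemctl daemon-reload` and a docker restart would be needed for it to take effect.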

abrarshivani commented 7 years ago

In k8s v1.4.7, there is an issue where kubectl cannot connect to a pod for logs; this is resolved in 1.5.0-beta3. I would therefore suggest that we remove the code that adds DNS entries to /etc/hosts and add support for a static IP for the master in k8s-anywhere once we support 1.5.0.

abrarshivani commented 7 years ago

I removed the entries from /etc/hosts and launched a Kubernetes v1.5.3 cluster. Attach/detach of the volume was successful, yet kubectl was not able to connect to the pod and failed with the following error:

an error on the server ("unknown") has prevented the request from succeeding (get pods kubernetes-dashboard-1872455951-t3k96)

Later, I added the DNS entries back and it worked fine. Hence, Kubernetes still needs DNS to look up the node in order to connect to the pod. Also, adding a static IP to the master using terraform fails since the gateway is not added. I have created an issue here.
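This dependency can be checked directly on the master host: kubectl logs/exec proxies through https://&lt;nodeName&gt;:10250, so the node name must resolve there (node1 is the name from the error above):

```shell
# If this lookup fails, the apiserver host cannot resolve the node name,
# and "kubectl logs" / "kubectl exec" will fail as described above.
out=$(getent hosts node1 || echo "node1 not resolvable; add a DNS or /etc/hosts entry")
echo "$out"
```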

abrarshivani commented 7 years ago

I removed the DNS entries and launched a Kubernetes v1.5-release cluster, and ran into the following error:

Error from server: Get https://node1:10250/containerLogs/kube-system/kubernetes-dashboard-1872455951-wfhdd/kubernetes-dashboard: dial tcp: lookup node1 on 10.162.204.1:53: no such host

The solution is either to use a DNS server or to change the node name in the vSphere Cloud Provider to the IP address. This was expected to be resolved in the 1.5 release, but it is not working.
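Pending the cloud-provider fix, one possible workaround (a general kubelet option, not something this repo wires up) is to register each node under its IP via --hostname-override, so the apiserver dials an address instead of a name:

```shell
# Sketch: derive the node's primary IP and use it as the node name.
# hostname -I is assumed available; fall back to loopback if it is not,
# purely so this sketch degrades gracefully.
NODE_IP=$(hostname -I 2>/dev/null | awk '{print $1}')
NODE_IP=${NODE_IP:-127.0.0.1}
echo "kubelet would register as: ${NODE_IP}"
# kubelet --hostname-override="${NODE_IP}" ...rest of the existing flags...
```

With the node registered by IP, `kubectl logs`/`kubectl exec` no longer depend on resolving names like node1.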