chris-short / rak8s

Stand up a Raspberry Pi based Kubernetes cluster with Ansible
MIT License

Add troubleshooting steps and timing adjustments #37

Closed tomtom215 closed 6 years ago

tomtom215 commented 6 years ago

Description

Added instructions for restarting the process if you accidentally corrupt your cluster, for example by running apt-update. Also increased the timeout parameters to give nodes more time to reboot and join.

Testing

Tested on a 5-node cluster with 1 master node. I was consistently hitting timeouts: one of the nodes would fail to reboot in time and would then be skipped by the rest of the playbook.
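
For illustration only, here is roughly what a longer reboot timeout can look like in an Ansible task. This is a sketch, not the actual change in this PR: the task name and the values are made up, and it assumes Ansible 2.7 or later, where the `reboot` module is available.

```yaml
# Hypothetical task; name and values are illustrative, not taken from the rak8s playbook.
- name: Reboot the node and wait for it to come back
  become: true
  reboot:
    reboot_timeout: 900      # seconds to wait for the node to respond again (module default: 600)
    post_reboot_delay: 30    # extra seconds to wait after the reboot before checking the node
```

A slow SD card or a busy network can push a Pi past a short timeout, so erring on the long side costs little: the task returns as soon as the node responds.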

After this failure, I re-ran the playbook, which made my cluster unavailable: the initial steps of the playbook updated the existing Docker and kubeadm installations to incompatible versions.

I also tested the "Troubleshooting" steps I added once I ran into these issues, then re-ran the playbook successfully with the adjusted timings.