Closed tonyppe closed 4 years ago
I was able to get the service started on the master but then later on the install fails again with an error:
{ "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/projects/_17__infrastructure_spinnaker/roles/geerlingguy.kubernetes/tasks/node-setup.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Join node to Kubernetes master\n ^ here\n" }
I had this issue today too, but only when I was running the installation separately on the master and the nodes. When I run everything together from my local machine, the node installation waits for the master to finish, and the output of the join command is then available to the nodes. It basically comes down to this task:
# Set up nodes.
- name: Get the kubeadm join command from the Kubernetes master.
  shell: kubeadm token create --print-join-command
  changed_when: False
  when: kubernetes_role == 'master'
  run_once: True
  register: kubernetes_join_command
Check if that's your case too.
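That registered result is only available to the nodes when the master is part of the same play; the node task then references its stdout, which is why the "'dict object' has no attribute 'stdout'" error shows up when the master task never ran. A rough sketch of what the node-side task looks like (the creates: guard here is an assumption; check node-setup.yml in the role for the exact task):

# Set up nodes (sketch only).
- name: Join node to Kubernetes master
  shell: "{{ kubernetes_join_command.stdout }}"
  args:
    creates: /etc/kubernetes/kubelet.conf   # assumed guard so the join only runs once
  when: kubernetes_role == 'node'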
Yeah, that is one assumption I've made that can trip people up; I'm assuming this role is always being run from a host outside the infra, e.g. my laptop runs it, sets up the master node, then sets up the other nodes, all in one go. If you just run it against one node or from the nodes themselves, it will definitely fail.
For a canonical example (there's also a Vagrantfile that can be used for local testing), see: https://github.com/geerlingguy/raspberry-pi-dramble/tree/kubernetes (work in progress)
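To make the "all in one go" pattern concrete, here is a minimal sketch of an inventory and playbook that target the master and the workers in a single play, so the registered join command is available to every host (host names and group names are hypothetical):

# inventory (hypothetical hosts)
[kube_master]
master1 kubernetes_role=master

[kube_workers]
worker1 kubernetes_role=node
worker2 kubernetes_role=node

# playbook.yml
- hosts: kube_master:kube_workers
  become: true
  roles:
    - geerlingguy.kubernetes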
I'm going to try this again and use a single instance for running the master and worker. When I raised this I was trying to deploy the master and two workers in one deployment from Ansible (actually Ansible Tower). The way I have configured this to run is that Ansible Tower runs the ansible-playbook command, which then SSHes into the master node, runs the install, and then does the workers.
I'm about ready to run this again onto a single instance just now. I was delayed because I got caught up reading https://www.jeffgeerling.com/blog/2018/kubernetes-complexity , specifically the part about the compromised Docker Hub images. I tried to find some known infected Docker Hub images so that I could paste their direct URLs into this tool: https://virustotal.com/ . I want to see whether this tool can flag malicious code inside an image before someone downloads and runs it.
Install goes without issue using an all-in-one instance deployment. Cheers
Possibly related: #10
In my case, it was because kubelet failed to start due to:
F1030 11:01:13.231850 8210 server.go:273] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "system" is different from docker cgroup driver: "systemd"
Fixed it by adding var:
kubernetes_kubelet_extra_args: '--cgroup-driver=systemd'
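If you hit the same mismatch, a quick way to confirm it is to check which cgroup driver Docker is using on the affected machine and then pass the matching driver to kubelet via the role variable (where you put the variable, e.g. group_vars, is just an example):

# Check Docker's cgroup driver on the affected machine:
docker info | grep -i 'cgroup driver'

# Then match it in the playbook or group_vars:
kubernetes_kubelet_extra_args: '--cgroup-driver=systemd'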
This issue has been marked 'stale' due to lack of recent activity. If there is no further activity, the issue will be closed in another 30 days. Thank you for your contribution!
Please read this blog post to see the reasons why I mark issues as stale.
Closing, since the typical fix is to check the kubelet logs on the machine where there are problems and fix whatever they report. I think there are enough hints in this thread to help anyone else with similar problems.
Hi there, I tried to deploy this onto 3 x Ubuntu 16.04 instances, but the result is that the Kubernetes API and other services do not start. The error actually shows up at the point where the role tries to download and import the YAML, because it cannot connect to port 6443 (because the API service is not starting).
So I destroyed those three instances and deployed CentOS 7 instead. Again, the install fails, and I am trying to debug it. Are you aware of these issues? Any suggestions? Both the Ubuntu and CentOS instances are vanilla and running on OpenStack.
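For anyone else landing here with the API server unreachable on port 6443, the first step (as noted above when closing) is to look at kubelet on the master itself; a generic troubleshooting sketch, not part of the role:

# On the master node:
systemctl status kubelet                      # is kubelet running at all?
journalctl -u kubelet --no-pager | tail -50   # why it failed, e.g. the cgroup driver mismatch above
ss -tlnp | grep 6443                          # is anything listening on the API server port?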