Closed by anton-johansson 5 years ago
Testing right now, noticed an issue. With `serial: 1`, facts aren't gathered for all hosts, which is necessary for etcd (and maybe more). Looking into it.

Edit: Solved by forcing fact gathering before running etcd tasks, as per 75785dedc6f7386744e80455c5256b9fc160c577.
I've tested running the playbook on an existing cluster without upgrading, and it looks like it does what it's supposed to: one task on one host at a time.
I'll probably test this again when `1.15.0` is released (or if I'm bold enough to try a release candidate).
Would be nice to get `containerd` up and running. Any ideas for how we can do that, @amimof? Is #47 an alternative?
@anton-johansson I'm not sure that running all tasks in serial is the right approach. The `serial` property is not supported in Ansible on a per-task basis, which means that all tasks under a play would run in serial. In large clusters that would increase the execution time significantly.
I don't believe it's necessary to install masters with a rolling strategy. Maybe that even applies to the kubelet and kube-proxy as well, and only containerd should be restarted one host at a time. It all depends on the type of downtime: is it workload or APIs?
What is your experience of upgrading a production cluster with KTRW?
Hmm, okay.
I did lose connectivity to my services during a period when I upgraded from 1.13 to 1.14. I'm not sure what caused it, but my guess was `kube-proxy`, though it could of course be `containerd`. I don't see how, though, because containers don't die just because `containerd` dies, right? `kube-proxy`, however, handles the network proxying to the correct container, which feels super dangerous if they are all restarted at the same time.
I understand that masters don't need to be in serial, but I would still feel safer if I could reach my API server during the upgrade, which I can't if I restart them all at the same time.
> which means that all tasks under a play would run in serial
This is pretty much what I wanted to accomplish. :) Do you know if this setting can be controlled with a variable somehow? I guess it would be nice if it defaulted to 100%, but you could set it using some parameter. I'd set it to `1` (or maybe `50%`, which would effectively be 1/3 or 2/5, etc.) in my case.

Or maybe a constant `50%` would suffice? It would of course increase the execution time for large clusters, but it wouldn't be that bad with `50%`?
@anton-johansson Yes, `kube-proxy` downtime will cause network connectivity to containers to be interrupted. And I'm pretty sure that containers are restarted when `containerd` is restarted. Here is an example of how we could restart a service per host without the use of `serial`. But I still haven't figured out how to check that the service is up before moving on to the next host.
```yaml
- name: Start and enable kubelet
  systemd:
    name: kubelet
    state: restarted
    enabled: true
  with_items: "{{ groups['nodes'] }}"
  run_once: true
  delegate_to: "{{ item }}"
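One possible way to gate the loop on service health is to move the per-host work into an included task file, so the restart and the health check complete for one host before the loop advances. This is only a sketch; the `restart_one.yml` file name and the kubelet `/healthz` check on port 10248 are assumptions, not something from this thread:

```yaml
# main task file: loop the whole restart-and-wait sequence over each node
- include_tasks: restart_one.yml
  run_once: true
  with_items: "{{ groups['nodes'] }}"
```

```yaml
# restart_one.yml: runs fully for one host ({{ item }}) per loop iteration
- name: Restart kubelet on {{ item }}
  systemd:
    name: kubelet
    state: restarted
  delegate_to: "{{ item }}"

- name: Wait for kubelet healthz on {{ item }}
  uri:
    url: "http://localhost:10248/healthz"  # assumed default kubelet healthz port
  delegate_to: "{{ item }}"
  register: healthz
  until: healthz.status == 200
  retries: 12
  delay: 5
```

Because `include_tasks` is what the loop iterates over, each node is restarted and verified before the next one is touched.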
Another solution would be to restart everything at the very end, putting a section in `install.yml`. Something like below. However, this would make the individual roles a bit less independent in my opinion. So my idea is to pass in a `state` variable to the roles that controls whether the component should be restarted after it is installed. The default would be `true` since we want the roles to be self-contained. So the resulting `install.yml` could look something like this:
```yaml
- hosts: localhost
  connection: local
  roles:
    - certificates

- hosts: etcd
  roles:
    - { role: etcd, vars: { restart: false }}

- hosts: masters
  roles:
    - { role: kube-apiserver, vars: { restart: false }}
    - { role: kube-controller-manager, vars: { restart: false }}
    - { role: kube-scheduler, vars: { restart: false }}

- hosts: nodes
  roles:
    - cni
    - { role: containerd, vars: { restart: false }}
    - runc
    - { role: kube-proxy, vars: { restart: false }}
    - { role: kubelet, vars: { restart: false }}

# Restart components in serial
- hosts: nodes
  serial: 1
  roles:
    - { role: containerd, vars: { restart: true }}
    - { role: kube-proxy, vars: { restart: true }}
    - { role: kubelet, vars: { restart: true }}

- hosts: etcd
  serial: 1
  roles:
    - { role: etcd, vars: { restart: true }}

- hosts: masters
  serial: 1
  roles:
    - { role: containerd, vars: { restart: true }}
    - { role: kube-proxy, vars: { restart: true }}
    - { role: kubelet, vars: { restart: true }}
```
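For this to work, each role would have to honor the `restart` variable. A minimal sketch of what that could look like inside a role; the `restart` variable name matches the plays above, but the file layout and task are illustrative:

```yaml
# roles/kubelet/defaults/main.yml
# default to true so the role stays self-contained when run standalone
restart: true
```

```yaml
# roles/kubelet/tasks/main.yml (excerpt)
- name: Restart kubelet after install
  systemd:
    name: kubelet
    state: restarted
    enabled: true
  when: restart | bool
```

A play can then override the default per invocation, e.g. `vars: { restart: false }` during the prepare phase.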
I like the last suggestion, where we only restart the services in serial.
But wouldn't we need two variables? One that indicates that we should prepare everything (download binaries, update configuration files, etc.) and one that indicates that the service should be restarted. Both could default to `true` so the role is independent and can be executed standalone. But in the `install` playbook, the first part (without serial) would only prepare, while the last part (in serial) would only restart (and wait for the service to come back up)?
EDIT: Good to know that stopping `containerd` stops containers, I wasn't aware of that.
Yeah, you are right, we need two variables: `state: present|absent` and `restart: true|false`. I'll see if I can get something like that up and running.
@anton-johansson So I've been doing some experimenting with my initial suggestion, where I proposed putting all restarts at the end of the playbook. I have come to the conclusion that this is an anti-pattern in terms of Ansible best practices, due to how everything revolves around plays and not playbooks.

The "best" possible implementation (Ansible-wise) is to use handlers. But handlers alone don't solve the sequential restart problem. However, by combining handlers with rolling updates (`serial`) we can achieve what we want in a good way.
So I propose that we let users set the `serial` value on all plays (with a default). That way it would be up to the user to determine how updates are applied. For example, in large clusters (>50 nodes) it may be perfectly fine to have 2 or more nodes down for maintenance, which would translate to `serial: 2`. What is cool is that Ansible supports passing a percentage value to `serial`, so one could set `serial: 5%`.
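A user-controllable `serial` with a default could be sketched like this (the `serial_nodes` variable name is illustrative; `default('100%')` keeps the current all-at-once behavior when nothing is passed):

```yaml
- hosts: nodes
  # override per run, e.g. -e "serial_nodes=1" or -e "serial_nodes=50%"
  serial: "{{ serial_nodes | default('100%') }}"
  roles:
    - kubelet
```

Note that, as discussed later in this thread, the variable has to be supplied via `--extra-vars` rather than as a host variable.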
I've opened a PR that adds restart-handlers to all appropriate roles. Have a look and let me know what you think
Yeah, that sounds good. What would be a good default for this value, though? I mentioned earlier in this PR that if we use `50%` as the default, it would effectively be 1/3 or 2/5, etc., which feels like a good default for high availability. Or would you want it to be 100% by default, indicating pure parallelism?
We can move the discussion over to #52. Closing this!
My mistake, this PR is not really related to #52. Re-opening and see if I can make the appropriate changes.
I've made some changes @amimof, but I have yet to test this thoroughly. I'll remove `[WIP]` from the title when tested.
It appears that host variables cannot be used in the scope of the playbook, only in tasks within the playbook.
See ansible/ansible#18131
How about using `--extra-vars`? For example:

```shell
$ ansible-playbook --inventory ansible-inventory --extra-vars "serial_masters=50%" ~/projects/kubernetes-the-right-way/install.yml
```
I've made some changes to utilize the command line argument `--extra-vars`. This works well. As you can see, I had to gather facts for all `etcd` hosts, because when using `serial` it won't gather facts for all of them immediately.

I think this needs to be done for `kube-apiserver` as well, if we want plays to be independent of other plays. Could you confirm this?

Also, this means that we'll run the fact gathering even if we don't need to (i.e. if we don't use `serial`). Do you know if there's a way to prevent this? Maybe use `when`? Something like:

```yaml
when: hostvars[item]['ansible_hostname'] is not defined
```
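That condition could be attached to a delegated `setup` loop so facts are only gathered for hosts that haven't been seen yet. A sketch (the use of `delegate_facts` so the results land on the delegated host is an assumption about how this would be wired up):

```yaml
- name: Gather facts for etcd hosts not yet seen
  setup:
  delegate_to: "{{ item }}"
  delegate_facts: true
  run_once: true
  with_items: "{{ groups['etcd'] }}"
  when: hostvars[item]['ansible_hostname'] is not defined
```

On a plain run without `serial`, facts for every host already exist, so each iteration is skipped and the task costs almost nothing.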
Nice job! I also encountered that `serial_*` needs to be passed as an extra variable, but I think it's fine for now. I can confirm that gathering facts is also required for kube-apiserver. See: https://github.com/amimof/kubernetes-the-right-way/blob/1499f869146507585fcf9331aba35dc59a75e698/roles/kube-apiserver/templates/kube-apiserver.service.j2#L21

Also, I don't mind gathering facts even if it isn't needed. It should be possible with a simple `when`, but this needs to be tested.
Fixed `kube-apiserver`.

Also squashed together some commits to avoid redundant messages. Kept some commits separate because they're actually informational.
@anton-johansson It might be a good idea to also make use of `serial` in the CI pipeline by passing `-e "serial_all=50%"` to https://github.com/amimof/kubernetes-the-right-way/blob/1499f869146507585fcf9331aba35dc59a75e698/.travis.yml#L22

Other than that, LGTM and should be good to merge.
Good idea! Done, let's see what Travis says. :)
This way, we won't stop for example the API server or `kube-proxy` on all masters or nodes at the same time, causing downtime.

To do:

- Wait for `containerd` to start (see #47)
- Test with a newer release (`1.14.3` or `1.15.0-rc.1`)

Discussions: