Closed: camilb closed this issue 7 years ago
Easy mistake to slip in. I think I've spotted another one:
```
ExecStop=/bin/sh -c '/usr/bin/docker run --rm -v /etc/kubernetes:/etc/kubernetes {{.HyperkubeImageRepo}}:{{.K8sVer}} \
  /hyperkube kubectl \
  --server=https://{{.ExternalDNSName}}:443 \
  --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
  drain $$(hostname) \
  --ignore-daemonsets \
  --force'
```
Unless I'm missing something here, the double `$$` seems to be an error. Is this some escaping trick? I think we should just be able to run `$(hostname)`, right?
@pieterlange I see that it's working with double `$$` as well as with a single `$`.
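For reference, systemd performs its own expansion on `Exec*=` lines before the command runs: `$$` is replaced with a literal `$`, while `$(` is not a valid variable reference and passes through untouched, which would explain why both spellings hand `$(hostname)` to the shell. A minimal illustration (hypothetical unit, not from this thread):

```ini
[Service]
# systemd turns `$$` into a literal `$` before invoking /bin/sh,
# so the shell receives: echo $(hostname)
ExecStart=/bin/sh -c 'echo $$(hostname)'
```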
Related to this, but it looks like it's affecting other services too: using `ExecStartPre=/usr/bin/systemctl is-active kubelet.service` will check whether the service is active and then exit (code=exited, status=3). This being a `oneshot` service, it will not be restarted. I also observed this on `install-calico-system.service` and `install-kube-system.service`. Using `ExecStartPre=/usr/bin/systemctl is-active service.name` will generate a lot of errors before `kubelet.service` becomes active.
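One way to avoid that hard failure would be to poll instead of checking once. A sketch (hypothetical, not a proposal from this thread) that retries until the dependency is active:

```ini
[Service]
Type=oneshot
# Poll instead of failing on the first check: `systemctl is-active`
# exits non-zero (e.g. status=3) while the unit is still starting.
ExecStartPre=/bin/sh -c 'until /usr/bin/systemctl is-active kubelet.service; do sleep 5; done'
```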
In this case I think it's better to use something like:

```ini
After=multi-user.target

[Install]
WantedBy=node-drain.target
```

This way we make sure that all the services are running before we start this one.
Here is a proposal:
```ini
[Unit]
Description=drain this k8s node to give running pods time to gracefully shut down before stopping kubelet
After=multi-user.target
Wants=decrypt-tls-assets.service kubelet.service docker.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c '/usr/bin/docker run --rm -v /etc/kubernetes:/etc/kubernetes {{.HyperkubeImageRepo}}:{{.K8sVer}} \
  /hyperkube kubectl \
  --server=https://{{.ExternalDNSName}}:443 \
  --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
  uncordon $(hostname)'
ExecStop=/bin/sh -c '/usr/bin/docker run --rm -v /etc/kubernetes:/etc/kubernetes {{.HyperkubeImageRepo}}:{{.K8sVer}} \
  /hyperkube kubectl \
  --server=https://{{.ExternalDNSName}}:443 \
  --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
  drain $(hostname) \
  --ignore-daemonsets \
  --force'

[Install]
WantedBy=node-drain.target
```
@pieterlange replied to you in https://github.com/coreos/kube-aws/pull/41#issuecomment-259003781 about $$
@camilb Thanks for your feedback and proposal!
I don't intend to just stick with the `is-active` method, but anyway it seems that we may have 3 issues here, around `kubelet`, `flanneld`, etc.

For 1, I believe we can use `RestartSec` as used in https://github.com/coreos/coreos-baremetal/blob/master/examples/ignition/bootkube-controller.yaml#L66 to alleviate the issue. Should we start from `RestartSec=10` anyway?

For 2, part of the issue is resolved thanks to your PR #41. To tackle the other part, I've come to believe your proposal, which uses `After=multi-user.target` to control the order of service startup, is the only way to go!

For 3, though I'm rather looking forward to it, I'm not familiar with its use-case!
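As a reference for the `RestartSec` approach, a unit can back off between restart attempts like this (a sketch; the service command and interval are illustrative, not from the linked file):

```ini
[Service]
Restart=on-failure
# Wait 10s between restart attempts instead of retrying the
# not-yet-ready dependency in a tight loop.
RestartSec=10
ExecStart=/usr/lib/coreos/kubelet-wrapper
```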
Added to the known-issues list https://github.com/coreos/kube-aws/releases/tag/v0.9.1-rc.1
Uncordoning will be necessary when the node was restarted due to manual operator intervention instead of a rolling upgrade. Not sure how often we'd see that in practice (who reboots nodes in an ASG?) but it might cause some 'funny' side-effects.
@mumoshu I will try `RestartSec` and also try to set an order for the services to see which is more effective. `After=multi-user.target` works fine with `ExecStart=/bin/true` in this case.

@pieterlange Actually I had to reboot the nodes several times on a staging cluster due to the Docker 10.3 problems under load. I know that you can lose the node in an ASG, but this is faster.

@camilb Thanks for your cooperation here.
Just doing a quick reply to 1: FYI, we've already been hit by the 51200-byte limit in the master branch. If you are going to start testing on top of it, I'd like to encourage you to try https://github.com/coreos/kube-aws/pull/45 as the base branch for testing!
@mumoshu Thanks, I will try it. I hit the limit several times and was using `kube-aws up --export` with S3.
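For context, CloudFormation rejects inline template bodies over 51200 bytes; larger templates have to be uploaded to S3 and referenced by URL. A minimal sketch of the size check (the file name, bucket, and stack name are placeholders, not from this thread):

```shell
#!/bin/sh
# CloudFormation's limit for an inline template body, in bytes.
CFN_BODY_LIMIT=51200

template_too_large() {
  # True (exit 0) when the file exceeds the inline-body limit.
  size=$(wc -c < "$1")
  [ "$size" -gt "$CFN_BODY_LIMIT" ]
}

# Usage sketch (commented out; aws CLI invocations and names are illustrative):
# if template_too_large stack-template.json; then
#   aws s3 cp stack-template.json s3://my-bucket/stack-template.json
#   aws cloudformation update-stack --stack-name my-cluster \
#     --template-url https://s3.amazonaws.com/my-bucket/stack-template.json
# else
#   aws cloudformation update-stack --stack-name my-cluster \
#     --template-body file://stack-template.json
# fi
```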
@mumoshu Finished testing.
`kube-aws update --s3-uri s3://my_bucket/kube-aws` always shows this error:

```
Error: Error updating cluster: error updating cloudformation stack: ValidationError: Template format error: unsupported structure.
status code: 400, request id: 0d04f84f-a607-11e6-b3c5-2548ad1d19cd
```

But it works fine with `aws cloudformation update-stack`.
The problem with `kube-node-drainer.service` is that `docker.service` is stopped at the same time as `kube-node-drainer.service`, so draining worked less than 10% of the time. I tried several configurations like `WantedBy=` or `RequiredBy=poweroff.target reboot.target halt.target`, etc. Then I switched to rkt, and after a few attempts I found a solution that works fine with rolling updates, manual shutdown, reboot, etc. I tested it several times and it didn't fail:
```yaml
- name: kube-node-drainer.service
  enable: true
  command: start
  runtime: true
  content: |
    [Unit]
    Description=drain this k8s node to give running pods time to gracefully shut down before stopping kubelet
    After=multi-user.target

    [Service]
    Type=oneshot
    RemainAfterExit=true
    ExecStart=/bin/true
    TimeoutStopSec=30s
    ExecStop=/bin/sh -c '/usr/bin/rkt run \
      --volume=kube,kind=host,source=/etc/kubernetes,readOnly=true \
      --mount=volume=kube,target=/etc/kubernetes \
      --net=host \
      quay.io/coreos/hyperkube:v1.4.5_coreos.0 \
      --exec=/kubectl -- \
      --server=https://{{.ExternalDNSName}}:443 \
      --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
      drain $(hostname) \
      --ignore-daemonsets \
      --force'

    [Install]
    WantedBy=multi-user.target
```
@camilb Glad to hear `kube-aws up --s3-uri` worked for you! FYI, I have just merged #48 which, in combination with #41, fixes what is stated in the title of this issue.
Revisiting & thinking about @pieterlange's comment at https://github.com/coreos/kube-aws/issues/40#issuecomment-259049021 and @camilb's comment at https://github.com/coreos/kube-aws/issues/40#issuecomment-259056123.

In addition to what you've mentioned, would the uncordon feature + the node drainer allow us to automatically upgrade the CoreOS version, which implies automatic rebooting, hopefully without downtime? If that is the case, since there are already several possible use-cases, I thought it would be nice to keep discussing it in another issue.
@mumoshu I will close this for now. Still not perfect, but it works better now. The request to drain the node is properly sent and the pods are started on other nodes. The problem is that the containers are quickly killed on the drained node, and for pods that use bigger images or need a longer time to start/stop, there is not enough time for them to be started on other nodes. I'm looking for a good method to delay stopping some services on shutdown or reboot. I saw some examples from Red Hat and want to test them. I will open another issue with a proposal for improvements soon.
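One knob for delaying a stop on shutdown is the per-unit stop timeout; a sketch (the value is illustrative, not a proposal from this thread):

```ini
[Service]
# Give the ExecStop= drain command up to 120s before systemd
# sends SIGKILL during shutdown or reboot (default is much shorter).
TimeoutStopSec=120s
```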
I think the line `Restart=on-failure` should be removed from `kube-node-drainer.service`. Getting these errors:

```
Failed to restart kubelet.service: Unit kube-node-drainer.service is not loaded properly: Invalid argument.
kube-node-drainer.service: Service has Restart= setting other than no, which isn't allowed for Type=oneshot services. Refusing.
```
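The error is systemd refusing to load the unit: `Type=oneshot` services only permit `Restart=no`, so the fix is simply to drop the directive. A sketch of the corrected `[Service]` section:

```ini
[Service]
Type=oneshot
RemainAfterExit=true
# No Restart= line here: Type=oneshot only allows Restart=no,
# and any other value makes the whole unit fail to load.
```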