coreos / coreos-kubernetes

CoreOS Container Linux+Kubernetes documentation & Vagrant installers
https://coreos.com/kubernetes/docs/latest/
Apache License 2.0

why is docker.service not automatically restarting #682

Open itajaja opened 8 years ago

itajaja commented 8 years ago

https://github.com/coreos/coreos-kubernetes/blob/master/multi-node/aws/pkg/config/templates/cloud-config-worker#L9

Here it seems like the docker service doesn't have a restart policy. I am sure I am missing something, and I apologize if this isn't the best place to discuss this, but I did experience some problems in my cluster when, for whatever reason, docker died and I had to manually restart it.

spacepluk commented 8 years ago

It's probably because of this: https://github.com/systemd/systemd/issues/1312

I'm using this workaround and it's been working fine so far: https://gist.github.com/spacepluk/a14f10cfed3756c0f1f079e73cdc6c9a

itajaja commented 8 years ago

@spacepluk why do you consider it a workaround? Are there reasons why that patch couldn't be pushed upstream?

(also, update your link to remove the /edit otherwise it 404s)

spacepluk commented 8 years ago

Oops, fixed the link.

I don't know the details, to be honest. It looks like some change of behavior in systemd caused the issue. I believe @colhom is working on a proper solution for coreos-kubernetes.

cgag commented 8 years ago

@itajaja we're discussing a similar issue here: https://github.com/coreos/coreos-kubernetes/issues/675. I think there's a good chance we'll end up doing what @spacepluk did for all the normal services (the discussion in #675 is about a oneshot service, and those are a little weird).

I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it. Maybe systemd stops activating it if it failed due to a dependency issue? I'll try to test that out later, but I agree we should probably just add restart logic regardless.
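
For illustration, restart logic of this kind could be added as a systemd drop-in in the cloud-config. This is only a minimal sketch: the drop-in name `50-restart.conf` and the `RestartSec` value are hypothetical, and it is not necessarily what the gist linked above does.

```yaml
#cloud-config
coreos:
  units:
    - name: docker.service
      drop-ins:
        # Hypothetical drop-in name; any NN-name.conf would work.
        - name: 50-restart.conf
          content: |
            [Service]
            # Ask systemd to restart docker whenever it exits,
            # instead of relying on socket activation alone.
            Restart=always
            # Illustrative delay between restart attempts.
            RestartSec=10
```

After booting with a config like this, `systemctl cat docker.service` should list the drop-in alongside the main unit, which is a quick way to confirm it was picked up.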

itajaja commented 8 years ago

> I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it.

You are right, good point.