itajaja opened 8 years ago
It's probably because of this: https://github.com/systemd/systemd/issues/1312
I'm using this workaround and it's been working fine so far: https://gist.github.com/spacepluk/a14f10cfed3756c0f1f079e73cdc6c9a
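Roughly, the shape of that workaround is a systemd drop-in that adds a restart policy to docker.service, e.g. via cloud-config (a sketch; the exact gist contents may differ, and the drop-in file name here is made up):

```yaml
#cloud-config
coreos:
  units:
    - name: docker.service
      drop-ins:
        - name: 10-restart-policy.conf
          content: |
            [Service]
            Restart=always
            RestartSec=5s
```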
@spacepluk why do you consider it a workaround? Are there reasons why that patch couldn't be pushed upstream?
(also, update your link to remove the /edit, otherwise it 404s)
Oops, fixed the link.
I don't know the details, to be honest; it looks like a change of behavior in systemd caused the issue. I believe @colhom is working on a proper solution for coreos-kubernetes.
@itajaja we're discussing a similar issue here: https://github.com/coreos/coreos-kubernetes/issues/675. I think there's a good chance we'll end up doing what @spacepluk did for all the normal services (the discussion in 675 is around a oneshot service, and those are a little weird).
I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it. Maybe systemd stops activating it if it failed due to a dependency issue? I'll try to test that out later, but I agree we should probably just add restart logic regardless.
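A quick way to check both assumptions on a node; these are standard systemctl invocations, and the annotated output is what you'd expect when the unit sets no restart policy:

```sh
# Is the docker socket unit still active, i.e. will it re-activate docker on demand?
systemctl status docker.socket

# What restart policy does docker.service currently carry?
systemctl show docker.service -p Restart
# -> Restart=no   (systemd's default when the unit file sets none)
```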
> I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it.
You are right, good point.
https://github.com/coreos/coreos-kubernetes/blob/master/multi-node/aws/pkg/config/templates/cloud-config-worker#L9
Here it seems like the docker service doesn't have a restart policy. I'm sure I'm missing something, and I apologize if this isn't the best place to discuss this, but I did experience problems in my cluster when docker died for whatever reason and wasn't restarted automatically.
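For what it's worth, a stopgap that should work on an already-running node, without re-provisioning, is a plain systemd drop-in (a sketch; the file name is arbitrary):

```sh
# Add a restart policy to docker.service on a live node
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/10-restart.conf <<'EOF'
[Service]
Restart=always
RestartSec=5s
EOF
sudo systemctl daemon-reload
```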